Data ingestion overview
Nominal provides flexible data ingestion capabilities to support a variety of data types in the platform.
You can create datasets from files, upload videos, and create connections to external databases and storage systems.
Data can be ingested into Nominal via the web UI, the Python client, or the API.
Ingestion methods
Batch
Nominal supports batch file ingestion for several data types, including timeseries, location, and video data.
Timeseries data
Nominal ingests timeseries data into a dataset. A dataset can contain one or more files, and you can append data to an existing dataset.
Supported file formats include:
- CSV
- Chapter 10
- Dataflash
- MCAP
- Parquet
- TDMS
When configuring a file ingestion, you’ll need to specify the timestamp column and timestamp format for your data.
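As a rough illustration of what the timestamp column and timestamp format describe, the sketch below parses one column of a CSV with the Python standard library. It does not use the Nominal client; the function name `parse_timestamps`, the column name `time`, and the format string are assumptions chosen for the example, and naive timestamps are assumed to be UTC.

```python
import csv
import io
from datetime import datetime, timezone


def parse_timestamps(csv_text, timestamp_column, timestamp_format):
    """Parse one CSV column into UTC datetimes using a strptime format.

    Hypothetical helper for illustration; not part of any Nominal API.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if timestamp_column not in (reader.fieldnames or []):
        raise KeyError(f"column {timestamp_column!r} not found in CSV header")
    return [
        # Assumption: the file stores naive timestamps that are in UTC.
        datetime.strptime(row[timestamp_column], timestamp_format).replace(
            tzinfo=timezone.utc
        )
        for row in reader
    ]


sample = "time,voltage\n2024-02-13 09:00:00,3.3\n2024-02-13 09:00:01,3.4\n"
stamps = parse_timestamps(sample, "time", "%Y-%m-%d %H:%M:%S")
```

Running a check like this on a few rows before uploading can catch a wrong column name or format string early.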
Video
Nominal currently supports H.264-encoded videos using the YUV420p color format.
Video timestamps can be provided via a user-defined start time, encoded in the video itself (e.g. an MCAP video), or on a per-frame basis.
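To make the user-defined start time option concrete, here is a minimal sketch of how per-frame timestamps could be derived from a start time. It assumes a constant frame rate, which is an assumption of the example and not something the platform requires; `frame_timestamps` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def frame_timestamps(start, frame_rate, frame_count):
    """Return one timestamp per frame, assuming a constant frame rate (frames/second)."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return [start + timedelta(seconds=i / frame_rate) for i in range(frame_count)]


start = datetime(2024, 2, 13, 9, 0, 0, tzinfo=timezone.utc)
stamps = frame_timestamps(start, frame_rate=30.0, frame_count=3)
```

For variable-frame-rate footage this calculation does not hold, which is where timestamps embedded in the video or supplied per frame become useful.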
Streaming
Data can be streamed into Nominal via connections. Streaming connections accept JSON- and protobuf-encoded metrics following the Nominal metrics schema, as well as Telegraf and Prometheus metrics. See the channel writer API for more information.
To create a new streaming connection, first create a Nominal data source via the create Nominal data source endpoint, then create a connection that references the newly created nominalDataSourceRid via the create connection endpoint.
Integrations
Nominal integrates with third-party storage systems and databases through connections, which can be created via the create connection endpoint. Connections are scraped hourly, and new data is automatically ingested into Nominal. When configuring a connection, you can specify which tags are used to disambiguate data between runs via the requiredTagNames parameter (see runs for more information).
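The idea of using tags to disambiguate data can be sketched as grouping points by the values of the required tags. This is only a conceptual illustration, not how Nominal implements requiredTagNames; the point layout (a dict with a `tags` mapping) and the helper name are assumptions.

```python
from collections import defaultdict


def group_by_required_tags(points, required_tag_names):
    """Group points by the tuple of values of the required tags.

    Hypothetical data layout: each point is a dict with a "tags" mapping.
    """
    groups = defaultdict(list)
    for point in points:
        # A point missing a required tag raises KeyError, surfacing ambiguous data.
        key = tuple(point["tags"][name] for name in required_tag_names)
        groups[key].append(point)
    return dict(groups)


points = [
    {"tags": {"vehicle": "alpha", "flight": "1"}, "value": 1.0},
    {"tags": {"vehicle": "alpha", "flight": "2"}, "value": 2.0},
    {"tags": {"vehicle": "alpha", "flight": "1"}, "value": 3.0},
]
groups = group_by_required_tags(points, ["vehicle", "flight"])
```

Here the same channel from flights "1" and "2" ends up in separate groups, so the two flights are never mixed into one series.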
InfluxDB
Connecting to InfluxDB requires a host, port, and credentials.
Timestream
Connecting to Amazon Timestream requires a role ARN, role region, and database name.
Timescale
Connecting to Timescale requires you to provide a host, port, username, password, and database name. You’ll also need to configure the timestamp column, channel name column, and value column to scrape.
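The three column settings imply a long-format table, with one row per timestamp, channel, and value. The sketch below shows how such rows map onto per-channel series; the column names `ts`, `name`, and `reading` and the helper `rows_to_channels` are made up for the example and do not reflect Nominal's internals.

```python
from collections import defaultdict


def rows_to_channels(rows, timestamp_column, channel_column, value_column):
    """Turn long-format rows into {channel: [(timestamp, value), ...]} sorted by time."""
    channels = defaultdict(list)
    for row in rows:
        channels[row[channel_column]].append(
            (row[timestamp_column], row[value_column])
        )
    for series in channels.values():
        series.sort(key=lambda pair: pair[0])
    return dict(channels)


rows = [
    {"ts": 2, "name": "rpm", "reading": 1200.0},
    {"ts": 1, "name": "rpm", "reading": 1100.0},
    {"ts": 1, "name": "temp", "reading": 21.5},
]
channels = rows_to_channels(rows, "ts", "name", "reading")
```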
BigQuery
BigQuery connections correspond to individual tables. When connecting to BigQuery, you’ll configure the region, project, database, and table name, along with the credentials to authenticate with Google Cloud. You’ll also need to configure the timestamp column and tag columns to scrape.
Visual Crossing
Nominal integrates with Visual Crossing to provide access to meteorological data. See the weather data documentation for more details. Please reach out to a member of the Nominal team to configure your Visual Crossing connection.
Cloud storage
Nominal can connect to Amazon S3 and Google Cloud Storage. Please reach out to a member of the Nominal team to configure your cloud storage connection. Unlike other connections, cloud storage isn’t scraped automatically: once the connection is configured, ingests must be triggered via the API.
Timestamp formats
During ingestion, you’ll provide metadata which includes the timestamp format. Nominal supports a variety of timestamp formats, including absolute and relative timestamps (measured as an offset from a user-defined start time).
Absolute timestamps
An absolute timestamp specifies an exact point in time. It includes a date, time, and timezone. Nominal can parse Unix timestamps, ISO 8601 timestamps, and IRIG time codes, as well as user-defined timestamp formats.
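To show that these formats are different encodings of the same instant, the sketch below parses a Unix timestamp, an ISO 8601 string, and a custom-formatted string with Python's standard library. The custom format string is an arbitrary example, not a format defined by Nominal, and IRIG time codes are left out because the standard library has no parser for them.

```python
from datetime import datetime, timezone

# Unix timestamp: seconds since 1970-01-01 00:00:00 UTC.
from_unix = datetime.fromtimestamp(1707814800, tz=timezone.utc)

# ISO 8601 string with an explicit UTC offset.
from_iso = datetime.fromisoformat("2024-02-13T09:00:00+00:00")

# User-defined format, expressed here as a strptime pattern (example only).
from_custom = datetime.strptime(
    "13/02/2024 09:00:00 +0000", "%d/%m/%Y %H:%M:%S %z"
)

print(from_unix == from_iso == from_custom)  # prints True
```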
Relative timestamps
A relative timestamp measures time as an offset from a user-defined start time.
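Converting relative timestamps to absolute ones amounts to adding each offset to the start time. A minimal sketch, assuming the offset unit is one that `datetime.timedelta` accepts as a keyword (such as `seconds` or `milliseconds`); `to_absolute` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def to_absolute(start, offsets, unit="seconds"):
    """Add each offset, expressed in `unit`, to the start time."""
    return [start + timedelta(**{unit: offset}) for offset in offsets]


start = datetime(2024, 2, 13, 9, 0, 0, tzinfo=timezone.utc)
absolute = to_absolute(start, [0, 1.5, 60], unit="seconds")
# absolute[2] is 2024-02-13 09:01:00 UTC
```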
For example, if you set your start time as “2024-02-13 09:00:00 UTC” and your offset is measured in seconds, then: