Different types of data sets

The platform supports four types of data sets:

  • Movement data: Data from tracking moving assets where both the location and properties can vary over time.

  • Points data: Data from independent, unrelated measurements, each with its own location.

  • Time series data: Data with time-varying properties measured or computed for fixed locations.

  • Static data: Data defined for fixed locations with properties that do not change over time.

Type 1: Movement data

Data of this type is obtained by tracking a moving asset over time. The asset could be anything: a car, a boat, an airplane, a person, and so on.

Figure 1. Tracking and recording the locations and speeds of moving cars

For each asset, at certain time intervals, the location and additional properties are recorded and stored.

Examples of movement data

Examples of this are:

  • AIS data, which contains the position and properties of vessels over time

  • Similar data sets for airplanes and cars

  • Tracking the location of people by tracking their mobile phones

Files the platform expects for movement data

Data sets of this type use the following files:

  • CSV or Parquet data files where each line contains:

    • The unique id of the asset

    • The location of the asset

    • The timestamp at which the location was recorded

    • The values of any additional properties at the specified time (for example the speed or the heading)

  • Optionally, CSV or Parquet metadata files for information that doesn’t change over time. Each line contains:

    • The unique id of the asset

    • The properties that don’t vary over time (for example color, brand)

See the data versus metadata article for more information on the differences between the two.
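As an illustration of the layout described above, the following minimal sketch builds a movement data file and a matching metadata file in memory. The column names (asset_id, longitude, latitude, timestamp, speed, heading, color, brand) and the choice to store the location as separate longitude and latitude columns are assumptions made for this example, not names the platform requires.

```python
import csv
import io

# Hypothetical column names, chosen for this example only.
DATA_COLUMNS = ["asset_id", "longitude", "latitude", "timestamp", "speed", "heading"]
METADATA_COLUMNS = ["asset_id", "color", "brand"]

# Data file: one row per asset per recorded timestamp.
data_rows = [
    {"asset_id": "car-1", "longitude": 4.70, "latitude": 50.88,
     "timestamp": "2024-01-01T08:00:00Z", "speed": 42.0, "heading": 90},
    {"asset_id": "car-1", "longitude": 4.71, "latitude": 50.88,
     "timestamp": "2024-01-01T08:01:00Z", "speed": 45.5, "heading": 92},
    {"asset_id": "car-2", "longitude": 3.72, "latitude": 51.05,
     "timestamp": "2024-01-01T08:00:00Z", "speed": 0.0, "heading": 180},
]

# Metadata file: one row per asset, with properties that never change.
metadata_rows = [
    {"asset_id": "car-1", "color": "red", "brand": "BrandA"},
    {"asset_id": "car-2", "color": "blue", "brand": "BrandB"},
]


def to_csv(columns, rows):
    """Serialize a list of dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


data_csv = to_csv(DATA_COLUMNS, data_rows)
metadata_csv = to_csv(METADATA_COLUMNS, metadata_rows)
```

The asset id is the only column the two files share, which is what links each recorded position to the constant properties of its asset.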

Type 2: Points data

Data of this type is obtained by collecting individual, independent measurements, each at its own location.

Each point has at least a location, and optionally other properties.

Examples of points data

Examples of this are:

  • Events data, with each record representing an independent event occurring at a given location and time. For example: vehicle accident events data or harsh braking data.

  • Static point location data. For example: the location of all houses in a municipality, each with its number of residents.

Files the platform expects for points data

Data sets of this type use the following files:

  • CSV or Parquet data files where each line contains:

    • The location of the point to which the measurement applies or where the event takes place

    • (Optional) The timestamp at which the measurement/event was recorded

    • (Optional) The values of any additional properties.

Note: if an identifier and timestamp are present, and records share the same identifier, consider using movement data (if the locations for the same identifier differ) or time series data (if the location remains constant for each identifier) instead.

Metadata is not supported for points data: as the records in points data are unrelated, there is no use case for metadata for this type of data.
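The rule of thumb from the note above can be sketched as a small helper. This is only an illustration of the reasoning, not something the platform runs: the function name and the record layout (dicts with "id", "timestamp" and "location" keys) are hypothetical.

```python
from collections import defaultdict


def suggest_data_set_type(records):
    """Suggest a data set type for records shaped like
    {"id": ..., "timestamp": ..., "location": (lon, lat)}.

    "id" and "timestamp" may be missing or None; "location" is required.
    """
    has_ids = all(r.get("id") is not None for r in records)
    has_timestamps = all(r.get("timestamp") is not None for r in records)
    if not (has_ids and has_timestamps):
        # Without both an identifier and a timestamp, records stay independent.
        return "points"

    # Collect the distinct locations seen for each identifier.
    locations_per_id = defaultdict(set)
    for r in records:
        locations_per_id[r["id"]].add(r["location"])

    # If no identifier occurs more than once, the records are still unrelated.
    if len(locations_per_id) == len(records):
        return "points"

    # Shared identifiers with differing locations indicate a moving asset.
    if any(len(locations) > 1 for locations in locations_per_id.values()):
        return "movement"

    # Shared identifiers that always stay at one location.
    return "time series"
```

For example, two records for the same identifier at different coordinates yield "movement", while two records for the same identifier at the same coordinates yield "time series".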

Type 3: Time series data

Data of this type is obtained by measuring (or computing) values at a fixed location or over a fixed area.

Figure 2. Measuring the temperature at regular time intervals in a fixed location

Those measurements can be taken at a certain point location, or represent a measurement over a certain area.

Examples of time series data

Examples of this are:

  • Average temperature measurement (for a computer, a room, a country, and so on).

  • A traffic counter keeping track of how many vehicles are on a road segment.

  • A person counter measuring how many people are in a room at all times.

Files the platform expects for time series data

Data sets of this type use the following files:

  • GeoJSON files which define the shapes and constant properties of the areas for which the time-varying data is defined. Each feature in the GeoJSON file represents a single measurement area, and defines:

    • A geometry, representing the coverage area of the measurement.

      This geometry is used during visualization of the data on the spatial map of the visual analytics page.

    • A unique identifier for the sensor or device taking the measurements, or for the covered area.

    • Additional properties about the measurement device or area (for example the brand or type of the device, or the name of the area).

  • CSV or Parquet data files where each line contains:

    • The unique identifier of the measurement device or area, indicating which device or area the recordings in that row belong to.

    • The timestamp of when the recording took place.

    • Values for each measurement that was taken (for example the temperature).
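The two kinds of files are tied together by the identifier. The sketch below shows a small GeoJSON file and a matching CSV file, plus a check that every CSV row refers to a feature that exists in the GeoJSON. The property name sensor_id, the other column names, and the choice to store the identifier inside the feature properties are assumptions made for this example.

```python
import csv
import io
import json

# GeoJSON: one feature per measurement location or area (hypothetical content).
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.70, 50.88]},
            "properties": {"sensor_id": "sensor-1", "brand": "BrandA"},
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                # A closed ring: the first and last positions are identical.
                "coordinates": [[[4.0, 50.0], [4.1, 50.0], [4.1, 50.1],
                                 [4.0, 50.1], [4.0, 50.0]]],
            },
            "properties": {"sensor_id": "room-a", "name": "Meeting room A"},
        },
    ],
})

# CSV: one row per identifier per timestamp, with the measured values.
csv_text = """sensor_id,timestamp,temperature
sensor-1,2024-01-01T08:00:00Z,3.5
sensor-1,2024-01-01T09:00:00Z,4.1
room-a,2024-01-01T08:00:00Z,19.8
"""


def unmatched_identifiers(geojson_text, csv_text, id_property="sensor_id"):
    """Return the sorted identifiers used in the CSV that no GeoJSON feature defines."""
    features = json.loads(geojson_text)["features"]
    known_ids = {feature["properties"][id_property] for feature in features}
    rows = csv.DictReader(io.StringIO(csv_text))
    used_ids = {row[id_property] for row in rows}
    return sorted(used_ids - known_ids)
```

Running a check like this before uploading can catch rows whose identifier has no shape to be drawn on.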

Type 4: Static GeoJSON data

Data of this type consists of regular GeoJSON files, containing:

  • Locations (areas, lines or points)

  • Properties for each of those locations

The benefit of creating data sets for these GeoJSON files, instead of creating a background or area of interest layer for them, is that you can use the properties in the data to style and filter the GeoJSON.

Files the platform expects for static GeoJSON data

Data sets of this type use the following files:

  • GeoJSON files which define the shapes and properties for those shapes. Each feature in the GeoJSON file represents a single shape, and defines:

    • A geometry

      This geometry is used during visualization of the data on the spatial map of the visual analytics page.

    • A unique identifier for the location

    • Additional (static) properties about the location.
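A minimal sketch of such a file, built as a Python dictionary, is shown below together with two small helpers: one checks that the identifiers are unique, the other selects features by their properties to illustrate what a property-based filter operates on. The property names (location_id, residents, max_speed) and the helper functions are hypothetical, and the identifier is assumed to live inside the feature properties.

```python
import json

# Hypothetical static GeoJSON content: shapes with constant properties.
static_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.70, 50.88]},
            "properties": {"location_id": "house-1", "residents": 4},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.72, 50.87]},
            "properties": {"location_id": "house-2", "residents": 1},
        },
        {
            "type": "Feature",
            "geometry": {"type": "LineString",
                         "coordinates": [[4.70, 50.88], [4.71, 50.89]]},
            "properties": {"location_id": "road-7", "max_speed": 50},
        },
    ],
}


def has_unique_ids(collection, id_property="location_id"):
    """Check that no two features share the same identifier."""
    ids = [f["properties"][id_property] for f in collection["features"]]
    return len(ids) == len(set(ids))


def features_where(collection, predicate):
    """Return the features whose properties satisfy the given predicate."""
    return [f for f in collection["features"] if predicate(f["properties"])]


# Serialize to the text that would be stored in a .geojson file.
static_geojson_text = json.dumps(static_geojson, indent=2)
```

For instance, features_where(static_geojson, lambda p: p.get("residents", 0) >= 3) selects only the feature for house-1.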