The next step after configuring the properties of your data is to configure the processing settings.
Click the Configure processing settings button in the top navigation bar.
Processing settings can be adjusted afterwards
You can always adjust the processing settings, even when you have already uploaded data or data has already been processed.
However, if data has already been processed, it will have to be reprocessed, which can take some time depending on the size of the data.
Similar to the aggregation interval for numeric properties, the locations and timestamps will also be grouped into bins.
For this data set, the default values are a good choice so nothing has to be changed here.
When working with big data sets, it is important to be able to see overviews, understand the distribution, and identify so-called hotspots in your data. For this, the area is divided into a grid of cells, and statistics are computed and visualized for these cells (e.g., the number of vessels passing through the cell). Each cell is a square whose sides are one spatial resolution long (for example, 25 by 25 meters at a spatial resolution of 25 meters).
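The grid idea described above can be sketched in a few lines of Python. This is only an illustration, not the platform's implementation: it assumes positions are already projected to planar coordinates in meters, and the names `cell_index` and `count_assets_per_cell` are hypothetical.

```python
import math
from collections import defaultdict


def cell_index(x_m, y_m, spatial_resolution_m):
    """Map a planar position (in meters) to the (column, row) of its grid cell."""
    return (
        math.floor(x_m / spatial_resolution_m),
        math.floor(y_m / spatial_resolution_m),
    )


def count_assets_per_cell(records, spatial_resolution_m=25.0):
    """Count the distinct assets seen in each cell.

    records: iterable of (asset_id, x_m, y_m) tuples (assumed layout).
    """
    assets_by_cell = defaultdict(set)
    for asset_id, x_m, y_m in records:
        assets_by_cell[cell_index(x_m, y_m, spatial_resolution_m)].add(asset_id)
    return {cell: len(ids) for cell, ids in assets_by_cell.items()}


# Hypothetical sample data: two vessels, three recorded positions.
records = [
    ("vessel-A", 3.0, 4.0),
    ("vessel-B", 20.0, 24.0),
    ("vessel-A", 30.0, 4.0),
]
coarse = count_assets_per_cell(records, spatial_resolution_m=25.0)
fine = count_assets_per_cell(records, spatial_resolution_m=1.0)
```

With 25 meter cells, the first two positions share a cell; with 1 meter cells, every position ends up in its own cell, which is why a smaller spatial resolution gives a finer-grained picture.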
Similar to the aggregation interval, a good choice of the spatial resolution depends on:
The accuracy you need during analysis.
The accuracy of the recorded positions in your data.
Outdoor GPS data typically comes with an accuracy between 1 and 100 meters.
Indoor Bluetooth location data is more accurate, typically in the 0.5 to 10 meter range.
For this AIS dataset, you can go with the default value of 25 meters. You can, however, lower the spatial resolution to, for example, 1 meter to allow very fine-grained analysis, down to the individual vessel level.
The temporal resolution determines how many times an asset (e.g., a vessel) is counted in the same spatial cell.
For example, if the AIS data contains 2 recordings for the same vessel and:
both recordings place the vessel in the same spatial cell,
and the time difference between those 2 recordings is smaller than the temporal resolution,
then the second recording will be excluded from counting.
If that second recorded location falls into a different spatial cell, it will always be included, regardless of the time difference between the two recordings.
Here too, a good choice is a temporal resolution that matches the temporal sampling period of the data.
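The counting rule described above can be sketched as follows. This is a minimal illustration that follows the description literally, not the platform's actual algorithm: it assumes recordings are sorted by time, that timestamps are in seconds, and that each new recording is compared with the last counted recording of the same asset. The name `count_recordings` is hypothetical.

```python
def count_recordings(recordings, temporal_resolution_s):
    """Count recordings per spatial cell, applying the temporal resolution.

    recordings: list of (asset_id, timestamp_s, cell) tuples sorted by
    timestamp_s (assumed layout). A recording is skipped when it falls in
    the same cell as the asset's last counted recording and arrives less
    than temporal_resolution_s after it.
    """
    last_counted = {}  # asset_id -> (timestamp_s, cell)
    counts = {}  # cell -> number of counted recordings
    for asset_id, timestamp_s, cell in recordings:
        previous = last_counted.get(asset_id)
        if previous is not None:
            previous_ts, previous_cell = previous
            same_cell = cell == previous_cell
            too_soon = timestamp_s - previous_ts < temporal_resolution_s
            if same_cell and too_soon:
                continue  # excluded from counting
        counts[cell] = counts.get(cell, 0) + 1
        last_counted[asset_id] = (timestamp_s, cell)
    return counts


# Hypothetical recordings for one vessel, temporal resolution of 60 seconds.
recordings = [
    ("vessel-A", 0, (0, 0)),    # counted
    ("vessel-A", 30, (0, 0)),   # same cell, 30 s later: excluded
    ("vessel-A", 40, (1, 0)),   # different cell: always counted
    ("vessel-A", 100, (1, 0)),  # same cell, 60 s later: counted
]
counts = count_recordings(recordings, temporal_resolution_s=60)
```

In this example the vessel is counted once in the first cell and twice in the second.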
You can ask the processing engine to generate 2 additional data representations, which become available during visual analysis.
Trajectory data representation: connects the subsequent points of each asset, which allows you to visualize those assets as lines.
Realtime data representation: shows the latest reported location of each asset.
More information on these representations is available in this article.
For this tutorial, there is no need to generate either of those.
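To make the two representations concrete, here is a rough Python sketch of what each one contains conceptually. It says nothing about how the processing engine builds them; the record layout and the names `build_trajectories` and `latest_positions` are assumptions made for illustration.

```python
from collections import defaultdict


def build_trajectories(records):
    """Trajectory view: for each asset, its positions ordered by time.

    records: iterable of (asset_id, timestamp, x, y) tuples (assumed layout).
    """
    trajectories = defaultdict(list)
    for asset_id, _timestamp, x, y in sorted(records, key=lambda r: r[1]):
        trajectories[asset_id].append((x, y))
    return dict(trajectories)


def latest_positions(records):
    """Realtime view: for each asset, its most recently reported position."""
    latest = {}  # asset_id -> (timestamp, x, y)
    for asset_id, timestamp, x, y in records:
        if asset_id not in latest or timestamp > latest[asset_id][0]:
            latest[asset_id] = (timestamp, x, y)
    return {asset_id: (x, y) for asset_id, (_ts, x, y) in latest.items()}


# Hypothetical, unordered sample records.
records = [
    ("vessel-A", 20, 2.0, 2.0),
    ("vessel-B", 5, 9.0, 9.0),
    ("vessel-A", 10, 1.0, 1.0),
]
trajectories = build_trajectories(records)
latest = latest_positions(records)
```

Each trajectory is the list of points that would be joined into a line, while the realtime view keeps only one point per asset.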
Some datasets contain records where one or more values are missing.
The platform has 2 strategies to deal with those missing values:
Leave the value missing: the record will be processed with the missing value (unless the missing value is a required property, such as the timestamp). This is the default behavior.
Copy the missing value from a previous record: the platform will try to find an earlier record with the same id where that value is available, and copy that value.
For this tutorial, you can stick to the default behavior.
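The second strategy amounts to a per-id forward fill. The sketch below shows the idea in plain Python under a few assumptions: records are dictionaries sorted by time, a missing value is represented as `None`, and the function name `fill_from_previous` is hypothetical. How the platform itself searches for an earlier record is not specified here.

```python
def fill_from_previous(records, fields):
    """Fill missing values from the most recent earlier record with the same id.

    records: list of dicts sorted by time, each with an "id" key (assumed layout).
    fields: names of the properties to fill when their value is None.
    Returns new dicts; the input records are left unchanged.
    """
    last_seen = {}  # (id, field) -> last known value
    filled = []
    for record in records:
        new_record = dict(record)
        for field in fields:
            key = (record["id"], field)
            value = new_record.get(field)
            if value is None:
                if key in last_seen:
                    new_record[field] = last_seen[key]
                # otherwise no earlier value exists: the value stays missing
            else:
                last_seen[key] = value
        filled.append(new_record)
    return filled


# Hypothetical records with a missing "speed" value.
records = [
    {"id": "vessel-A", "speed": 10.0},
    {"id": "vessel-A", "speed": None},
    {"id": "vessel-B", "speed": None},
    {"id": "vessel-A", "speed": None},
]
filled = fill_from_previous(records, fields=["speed"])
```

Here both later records of vessel-A receive the speed of its first record, while the vessel-B record stays missing because no earlier value exists for that id.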
Go to the next part: Step 4: Upload the csv files