The data that we will be using in this tutorial is taxi trip data from New York City. Each record contains the pick-up and drop-off locations and times, as well as additional trip information such as the trip distance and the passenger count.
The data comes from the website of the New York City Taxi and Limousine Commission. For convenience, we prepared a single CSV file that we will use throughout this tutorial. You can download it here.
The CSV file only contains the IDs of the pick-up and drop-off locations. The actual geographical shapes of those locations are available as a gzipped GeoJSON file, which you can download here. (Note: this GeoJSON file was converted from the original SHP file, which you can find here.)
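If you want to inspect the shapes file outside of the tutorial, the following is a minimal Python sketch of how a gzipped GeoJSON file could be read and indexed by location ID so that the IDs in the CSV can be resolved to shapes. It uses only the standard library. The function name `load_zone_shapes` and the property name `LocationID` are assumptions made for illustration; check the properties of the downloaded file and adjust `id_property` accordingly.

```python
import gzip
import json


def load_zone_shapes(path, id_property="LocationID"):
    """Return a dict mapping each location ID to its GeoJSON geometry.

    `id_property` is an assumed property name; adjust it to match
    the properties actually present in the downloaded file.
    """
    # gzip.open in text mode decompresses on the fly, so the file
    # does not need to be unpacked first.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        collection = json.load(handle)

    shapes = {}
    for feature in collection["features"]:
        zone_id = feature["properties"][id_property]
        shapes[zone_id] = feature["geometry"]
    return shapes
```

Calling `load_zone_shapes` with the path of the downloaded file returns a dictionary in which a pick-up or drop-off ID from the CSV can be looked up to obtain the matching geometry.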
While the data downloads, you can continue with the next part, where you will create the new data set.
Go to the next part: Step 1: Create data set