GEOG 485:
GIS Programming and Software Development

Project 4: Reconstructing a car's path


In this project, you're working as a geospatial consultant to a company that offers auto racing experiences to the public at Wakefield Park Raceway near Goulburn, New South Wales, Australia.  Each of the company's cars is equipped with a GPS device that records detailed data on the car's movements, and the company would like to make its customers' ride data, including a map, available through a web app.  The GPS units export the track data in CSV format.

Your task is to write a script that will turn the readings in the CSV file into a vector dataset that you can place on a map. This will be a polyline dataset showing the path the car followed over the time the data was collected. You are required to use the Python csv module to parse the text and arcpy geometries to write the polylines.

The data for this project were made possible by faculty member and Aussie native James O'Brien, who likes to visit Wakefield Park to indulge his love of racing.

Please read all of the following instructions carefully before beginning the project. You are not required to use functions in this project, but you can gain over and above points by breaking out repetitive code into functions.


This project has the following deliverables:

  1. Your plan of attack for this programming problem, written as pseudocode in any text editor and submitted as a separate document. It should consist only of short, focused steps describing how you will solve the problem. This deliverable is separate from your customary project writeup and your well-documented script.
  2. A Python script that reads the data from the file and creates, from scratch, a polyline shapefile with n polylines, n being the number of laps recorded in the file. Each polyline should represent a single lap around the circuit.  Each polyline should also have a Short field that stores the corresponding lap number from the CSV file. The shapefile should use the WGS 1984 geographic coordinate system.
  3. A short writeup (~300 words) explaining what you learned during this project and which requirements you met, or failed to meet. Also describe any "over and above" efforts here so that the graders can look for them.

Successful delivery of the above requirements is sufficient to earn 90% on the project. The remaining 10% is reserved for efforts that go "over and above" the minimum requirements. This could include (but is not limited to) a batch file that could be used to automate the script, creation of the feature class in a file geodatabase instead of a shapefile, or the breaking out of repetitive code into functions and/or modules.  Other over and above opportunities are described below.


You may already see some immediate challenges in this task:

  • You have not previously created a feature class programmatically. You must find and run ArcGIS geoprocessing tools that will create an empty polyline shapefile with a Short Integer field for storing the lap number. You must also assign the WGS 1984 geographic coordinate system as the spatial reference for this shapefile.
  • Almost all of the lines in the file contain a set of 13 values, corresponding to the column names found in the header.  However, at the end of each lap you will find a line that records the time required to complete the lap. For example:
    # Lap 1: 00:01:24.259
    Your script will need to skip over these lap time lines without choking.  Similarly, the file ends with a line -- # Session End -- that should not break the script.
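
One way to handle the second challenge is sketched below. It is only a minimal illustration under stated assumptions: the column names (Time, Lap, Latitude, Longitude), the coordinate values, and the function name read_data_rows are hypothetical, and the real export has 13 columns with its own header. The idea is to let the csv module split each line and then skip any row whose first value begins with "#".

```python
import csv
import io

# Hypothetical excerpt: the real export has 13 columns and its own header names.
SAMPLE = """Time,Lap,Latitude,Longitude
0.00,1,-34.8405,149.6853
0.05,1,-34.8406,149.6854
# Lap 1: 00:01:24.259
0.10,2,-34.8407,149.6855
# Session End
"""


def read_data_rows(csv_file):
    """Yield each data row as a dict keyed by header name, skipping '#' lines."""
    reader = csv.reader(csv_file)
    header = next(reader)              # the first line holds the column names
    for row in reader:
        if not row or row[0].lstrip().startswith("#"):
            continue                   # blank line, lap time line, or session end line
        yield dict(zip(header, row))


rows = list(read_data_rows(io.StringIO(SAMPLE)))
for row in rows:
    print(row["Lap"], row["Latitude"], row["Longitude"])
```

With a real file, the io.StringIO object would be replaced by a file opened with open(path, newline=""), which is the form the csv module documentation recommends.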


  • Before you start writing code, write a plan of attack describing the logic your script will use to accomplish this task. Break up the original task into small, focused chunks. You can write this in Word or even Notepad. Your objective is not to write fancy prose, but rather short, terse statements of what your code will do: in other words, pseudocode. 
  • You will have a much easier time with this assignment if you approach it incrementally.  Start by opening the CSV file, reading through the lines, properly handling the various types of lines, and printing the values of interest to the Console.  If you can get that to work, see if you can create a single polyline from the full set of points, ignoring the laps.  If you can get that to work, try to tackle the requirement of creating a separate polyline for each lap, etc.
  • A Python dictionary is an excellent structure for storing a lap number coupled with the lap's list of points. A dictionary is similar to a list, but it stores items in key-value pairs.  In this scenario, we recommend using the lap numbers as the dictionary keys and lists of the points associated with the laps as the values associated with the keys.  You can retrieve any value based on its key, and you can also check whether a key exists using a simple if key in dictionary: check.
  • To create your shapefile programmatically, use the Create Feature Class tool. The ArcGIS Pro Help has several examples of how to use this tool.  Note that the feature class's schema (i.e., its fields) isn't specified as part of the Create Feature Class tool.  You'll need to use a different tool, which can be found through a documentation search or a web search, to add the Lap field.  If you can't figure out the setup of the new feature class programmatically, I suggest you create it manually and work on writing the rest of the script. You can then return to this part at the end, if you have time.
  • In order to get the shapefile in WGS 1984, you'll need to create a spatial reference object that you can assign to the shapefile at the time you create it. I recommend creating this object with the arcpy.SpatialReference() constructor. Be warned that if you do not correctly apply the spatial reference, your polyline precision could be diluted.
  • Remember that polylines can be produced from either an arcpy Array of Point objects or a list of x, y coordinate pairs.
  • If you do things right, your polylines should look like this (the line feature class has been added and symbolized manually showing each lap in a different color):
Figure: A vector dataset on a map, with the polyline for each lap drawn in a different color.
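
The dictionary and polyline tips above might be sketched as follows. This is an outline under assumptions, not a finished solution: the function names group_points_by_lap and build_polylines and the sample readings are hypothetical. The arcpy import is deferred into build_polylines so that the grouping logic can be tried anywhere, while the geometry step needs an ArcGIS Pro Python environment. Creating the shapefile and writing the geometries to it are deliberately left out.

```python
def group_points_by_lap(rows):
    """Return {lap_number: [(x, y), ...]} from (lap, x, y) tuples, in reading order."""
    laps = {}
    for lap, x, y in rows:
        if lap not in laps:            # first point seen for this lap
            laps[lap] = []
        laps[lap].append((x, y))
    return laps


def build_polylines(laps):
    """Return {lap_number: arcpy.Polyline}; requires an ArcGIS Pro Python environment."""
    import arcpy                       # deferred so the grouping code runs without arcpy
    wgs84 = arcpy.SpatialReference(4326)   # 4326 is the well-known ID of GCS WGS 1984
    return {
        lap: arcpy.Polyline(arcpy.Array([arcpy.Point(x, y) for x, y in pts]), wgs84)
        for lap, pts in laps.items()
    }


# Hypothetical readings as (lap, longitude, latitude): x is longitude, y is latitude.
readings = [
    (1, 149.6853, -34.8405),
    (1, 149.6854, -34.8406),
    (2, 149.6855, -34.8407),
    (2, 149.6856, -34.8408),
]
laps = group_points_by_lap(readings)
print(laps)
```

Note that x corresponds to longitude and y to latitude; swapping them is an easy mistake to make when the CSV lists latitude first.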

Over and above opportunities

There are numerous opportunities for meeting the project's over and above requirement.  Here is a "package" of ideas that you might consider implementing together to do a better job of meeting the original scenario requirements:

  • Omit the first and last laps from the data you insert into the new shapefile.  The driver is pulling out of and back into the pits during these laps and is not interested in the times associated with them.
  • Record the time needed to drive each lap.  There are a couple of ways to approach this problem: a) working with the lap time strings found after the last point in each lap, parse the string into its constituent parts using string manipulation functions and convert to just seconds, or b) compute the difference between each lap's last point and first point Time values. (The value to record for the lap noted above would be 84.259, in a Float field.)
  • The car position is recorded many times per second, resulting in thousands of "breadcrumbs" in the export file.  The company would like to avoid using all of the position data so as to minimize their disk usage with the cloud provider they're using for the web app.  In developing your script, include every nth point from the input CSV file, where n is defined at the top of the script or through a script tool parameter.  The only exception to this rule is that they'd also like you to include the first and last point of every lap. 
  • Run your script as an ArcGIS Pro script tool and programmatically add the new feature class to the current Pro project, with symbology like that shown in the figure above.
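
Two of the ideas above, converting a lap time string to seconds (approach a) and keeping every nth point, could be sketched roughly like this. The function names lap_time_to_seconds and thin_points are hypothetical, and the parsing assumes every lap time line follows the "# Lap 1: 00:01:24.259" pattern shown earlier.

```python
def lap_time_to_seconds(line):
    """Convert a line such as '# Lap 1: 00:01:24.259' to seconds as a float."""
    time_text = line.split(": ", 1)[1]          # the part after 'Lap 1: '
    hours, minutes, seconds = time_text.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def thin_points(points, n):
    """Keep every nth point of one lap, always keeping the first and last point."""
    if n < 1:
        raise ValueError("n must be at least 1")
    kept = list(points[::n])                    # indices 0, n, 2n, ...
    if points and (len(points) - 1) % n != 0:
        kept.append(points[-1])                 # the last point was not on the stride
    return kept


lap_seconds = lap_time_to_seconds("# Lap 1: 00:01:24.259")
thinned = thin_points(list(range(10)), 4)
print(round(lap_seconds, 3), thinned)
```

Thinning is applied to one lap's list at a time, so the first and last points of every lap survive regardless of the value of n.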

Moving beyond these ideas, there is a lot of potentially interesting information hidden in the time values associated with the points (which gets lost when constructing lines from the points).  One fairly easy step toward analyzing the original data is to create a script tool that includes not only the option to create the polyline feature class described above, but also a point feature class (including the time, lap, speed, and heading values for each point).  Or if you want a really big challenge, you could divide the track into segments and analyze the path data to find the lap in which the fastest time was recorded within each segment.