GEOG 485:
GIS Programming and Automation

Project 3: Extracting amenities from OpenStreetMap data


In this project you'll use your new skills working with selections and cursors to process some data from a "raw" format into a more specialized group of datasets for a specific mapping purpose. The data for this exercise was derived from OpenStreetMap, a free map of the world to which anyone can contribute, similar in nature to Wikipedia. Although it's easy to put data into OpenStreetMap (see Lesson 9 of Geog 585 if you're interested in trying it), it can be more difficult to get the data out and filter it down to meet specific needs.

Download the data for this project


In this exercise, suppose you are making a humanitarian-themed map for a non-profit agency in El Salvador. You want to show hospitals, schools, and places of worship, with the intent that these facilities could be utilized for shelter or medical care in times of natural disaster.

You currently have a messy shapefile with all manner of points covering a broad area that have been exported out of OpenStreetMap (OSMpoints.shp). You also have a shapefile of country boundaries in Central America (CentralAmerica.shp). You want to make shapefiles of just the things you're interested in, and just within El Salvador.


Write a script that makes a separate shapefile for each of these types of amenities (schools, hospitals, places of worship) within the boundary of El Salvador. Write this script so that the user can change the country or list of amenity types simply by editing a couple of lines of code at the top of the script, like this:

import arcpy
amenities = ['school','hospital','place_of_worship']
country = 'El Salvador'
. . .

After accomplishing the above, the same script should open each new shapefile and add a text field named "source". For every record, populate this field with the value "OpenStreetMap" so that future users know where the data came from (this could be valuable if they later decide to add or merge in other features).

Your result should look something like this if viewed in ArcMap:

Figure 3.5  Example output from Project 3, viewed in ArcMap.

The above requirements are sufficient for receiving 90% of the credit on this assignment. The remaining 10% is reserved for "over and above" efforts, such as making an ArcToolbox script tool, or extending the script to handle other amenity types, multiple countries, etc. For these over and above efforts, we prefer that you submit two copies of the script: one with the basic functionality and one with the extended functionality. That way you are more likely to receive the base credit if something fails in your over and above code. Note that we now expect to see commenting and error handling in your scripts from Project 3 onward, so those do not qualify as over and above efforts.


Deliverables for this project are as follows:

  • The source .py file containing your script
  • A short writeup (about 300 words) describing how you approached the project, how you successfully dealt with any roadblocks, and what you learned along the way. You should include which requirements you met, or failed to meet. If you added some of the "over and above" efforts, please point these out so the grader can look for them.

You do not have to create an ArcToolbox script tool for this assignment; you can hard-code the initial parameters. Nevertheless, put all the parameters at the top so they can be easily manipulated by whoever tests the script.

Once you get everything working, creating an ArcToolbox script tool is a good way to achieve the "over and above" credit for this assignment. If you do this, then please zip all supporting files before placing them in the drop box.
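If you do go the script tool route, the hard-coded lines at the top of the script could be replaced by something like the sketch below. It is only an illustration: the function names parse_multivalue and read_tool_parameters are made up for this example, and the parameter order (amenities first, country second) is an assumption that depends on how you set up the tool. A multivalue tool parameter typically arrives as one semicolon-delimited string, which is why the helper splits on semicolons.

```python
def parse_multivalue(text):
    """Split a semicolon-delimited script tool value into a clean list."""
    return [item.strip() for item in text.split(";") if item.strip()]


def read_tool_parameters():
    """Read the amenity list and country name from the script tool dialog."""
    # arcpy is imported here so parse_multivalue can be tried without ArcGIS.
    import arcpy

    # Assumed order: parameter 0 = amenities (multivalue), 1 = country name.
    amenities = parse_multivalue(arcpy.GetParameterAsText(0))
    country = arcpy.GetParameterAsText(1)
    return amenities, country
```

Keeping the parsing in its own small function lets you check it from the Python window before you wire up the tool.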

Notes about the data

Take a look at the OSMpoints.shp dataset in ArcMap, particularly the attribute table. It contains an "amenity" field that you can use to select the things you're interested in. This field corresponds to the amenity tag in OpenStreetMap. If you take a look at the amenity tag documentation, you will see that the official strings for your amenities of interest are 'hospital', 'school', and 'place_of_worship'. Keep in mind, though, that your script should be capable of handling other amenities, too.

The Central America shapefile has a field called "NAME". You can make an attribute selection on this field to select El Salvador, then follow that up with a spatial selection to grab all the amenity points that fall within this country. Finally, narrow down those amenity points to just the ones that satisfy your desired amenity.
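The first two selections described above might be sketched roughly as follows. Treat this as one possible arrangement rather than the expected solution: the function names and the layer names "CountryLayer" and "PointsLayer" are invented for this example, and the "WITHIN" overlap type is one reasonable choice for points inside a polygon.

```python
def build_where_clause(field, value):
    """Return a shapefile-style SQL clause such as "NAME" = 'El Salvador'."""
    return '"{0}" = \'{1}\''.format(field, value)


def select_points_in_country(points_fc, countries_fc, country):
    """Return the name of a layer whose selection is the points in one country."""
    # arcpy is imported here so build_where_clause can be tried without ArcGIS.
    import arcpy

    # Attribute selection: a layer holding only the requested country polygon.
    arcpy.MakeFeatureLayer_management(
        countries_fc, "CountryLayer", build_where_clause("NAME", country))

    # Spatial selection: select the points that fall within that polygon.
    arcpy.MakeFeatureLayer_management(points_fc, "PointsLayer")
    arcpy.SelectLayerByLocation_management("PointsLayer", "WITHIN", "CountryLayer")
    return "PointsLayer"
```

Because the country name is passed in rather than written into the query, changing the country stays a one-line edit at the top of the script.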


As mentioned above, you'll be starting out by making a list of all the things you want to capture:

amenities = ['school','hospital','place_of_worship']

Then later in your code you can loop through this list:

for amenity in amenities:
    amenitySelectionClause = '"amenity" = ' + "'" + amenity + "'"
    # select each amenity and make a new layer

Once you've got the selection made, use the Copy Features tool to save the selected features into a new file, in the same fashion as in the practice exercises.
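Putting the loop and the Copy Features step together could look something like the sketch below. It assumes you already have a feature layer of the OSM points and a layer containing only your country; the function names, the output folder argument, and the choice to name each shapefile after its amenity are all illustrative assumptions, not requirements.

```python
import os


def output_path(folder, amenity):
    """Return the shapefile path for one amenity, e.g. <folder>/school.shp."""
    return os.path.join(folder, amenity + ".shp")


def export_amenities(points_layer, country_layer, amenities, folder):
    """Copy each amenity's points inside the country to its own shapefile."""
    # arcpy is imported here so output_path can be tried without ArcGIS.
    import arcpy

    for amenity in amenities:
        clause = '"amenity" = \'{0}\''.format(amenity)
        # Start each pass from a fresh spatial selection (NEW_SELECTION is
        # the default), so the previous amenity's subset does not carry over.
        arcpy.SelectLayerByLocation_management(points_layer, "WITHIN", country_layer)
        # Narrow the selected points down to the current amenity.
        arcpy.SelectLayerByAttribute_management(
            points_layer, "SUBSET_SELECTION", clause)
        # Copy Features writes only the selected features of a layer.
        arcpy.CopyFeatures_management(points_layer, output_path(folder, amenity))
```

Repeating the spatial selection on every pass costs a little time, but it keeps the loop body simple. Making a separate layer per amenity would work as well.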

Take this project one step at a time. It's probably easiest to tackle the extraction-into-shapefile portion first. Once you have all the new amenity shapefiles created, go through them one by one: use the "Add Field" tool to add the "source" field, then use an UpdateCursor to loop through all rows and populate this new field with 'OpenStreetMap'.
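For the second portion, a minimal sketch of the Add Field and UpdateCursor steps is shown below. It assumes the arcpy.da cursors available in ArcGIS 10.1 and later; the function name add_source_field and the field length of 20 are arbitrary choices made for this example.

```python
SOURCE_FIELD = "source"
SOURCE_VALUE = "OpenStreetMap"


def add_source_field(shapefile):
    """Add the text field to one shapefile and fill it in for every record."""
    # arcpy is imported here so the constants can be inspected without ArcGIS.
    import arcpy

    # A length of 20 leaves room for the 13 characters of "OpenStreetMap".
    arcpy.AddField_management(shapefile, SOURCE_FIELD, "TEXT", field_length=20)

    # The "with" block releases the cursor's lock on the data when it ends.
    with arcpy.da.UpdateCursor(shapefile, [SOURCE_FIELD]) as cursor:
        for row in cursor:
            row[0] = SOURCE_VALUE
            cursor.updateRow(row)
```

You would call a function like this once per output shapefile, for example from inside the same loop that creates them.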

It might be easiest to get the whole process working with a single amenity, then add the loop for all the amenities later after you have finalized all the other script logic.

For the purposes of this exercise, don't worry about capturing points that fall barely outside the edge of the boundary (e.g., points in coastal cities that appear in the ocean). To capture all these in real life, you would just need to obtain a higher resolution boundary file. The code would be the same.

You should be able to complete this exercise in about 50 lines of code (including whitespace and comments). If your code gets much longer than this, you are probably missing an easier way.


In some cases, students have reported that the values in the amenity field contain a leading space (e.g., " school" rather than "school"). If you think you've coded your script correctly but aren't ending up with the correct output, check whether this strange bug is the culprit. The easiest fix is to include the same leading space in the value in your query.
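If you'd rather not depend on which form your copy of the data uses, one option is a query that accepts both. This goes slightly beyond the simple fix described above, and the helper name amenity_clause is made up for this sketch.

```python
def amenity_clause(amenity):
    """Return a clause matching the amenity with or without a leading space."""
    return '"amenity" IN (\'{0}\', \' {0}\')'.format(amenity)


print(amenity_clause("school"))
```

For 'school', this prints "amenity" IN ('school', ' school'), which selects the records whichever way the value was stored.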