GEOG 485:
GIS Programming and Automation

Lesson 3 Practice Exercise B Solution


Below is one possible solution to Practice Exercise B with comments to explain what is going on. If you find a more efficient way to code a solution, please share it through the discussion forums.

# This script determines the percentage of cities with at least two park
#  and ride facilities and flags those cities in the attribute table

import arcpy
arcpy.env.overwriteOutput = True

cityBoundaries = "D:\\Data\\Geog485\\Lesson3PracticeExerciseB\\Washington.gdb\\CityBoundaries"
parkAndRide = "D:\\Data\\Geog485\\Lesson3PracticeExerciseB\\Washington.gdb\\ParkAndRide"
parkAndRideField = "HasTwoParkAndRides"   # Name of column for storing the Park & Ride information
cityIDStringField = "CI_FIPS"             # Name of column with city IDs
citiesWithTwoParkAndRides = 0             # Used for counting cities with at least two P & R facilities
numCities = 0                             # Used for counting cities in total

# Make a feature layer of all the park and ride facilities
arcpy.MakeFeatureLayer_management(parkAndRide, "ParkAndRideLayer")

# Make an update cursor and loop through each city
with arcpy.da.UpdateCursor(cityBoundaries, (cityIDStringField, parkAndRideField)) as cityRows:
    for city in cityRows:
        # Create a query string for the current city    
        cityIDString = city[0]
        queryString = '"' + cityIDStringField + '" = ' + "'" + cityIDString + "'"

        # Make a feature layer of just the current city polygon    
        arcpy.MakeFeatureLayer_management(cityBoundaries, "CurrentCityLayer", queryString)

        try:
            # Narrow down the park and ride layer by selecting only the park and rides
            #  in the current city
            arcpy.SelectLayerByLocation_management("ParkAndRideLayer", "CONTAINED_BY", "CurrentCityLayer")

            # Count the number of park and ride facilities selected
            selectedParkAndRideCount = arcpy.GetCount_management("ParkAndRideLayer")
            numSelectedParkAndRide = int(selectedParkAndRideCount.getOutput(0))

            # If at least two park and ride facilities found, update the row to TRUE
            if numSelectedParkAndRide >= 2:
                city[1] = "TRUE"

                # Don't forget to call updateRow
                cityRows.updateRow(city)

                # Add 1 to your tally of cities with two park and rides                
                citiesWithTwoParkAndRides += 1

        finally:
            # Delete current cities layer to prepare for next run of loop
            arcpy.Delete_management("CurrentCityLayer")
            numCities +=1

# Clean up park and ride feature layer
arcpy.Delete_management("ParkAndRideLayer")

# Calculate and report the number of cities with two park and rides
if numCities != 0:
    percentCitiesWithParkAndRide = ((1.0 * citiesWithTwoParkAndRides) / numCities) * 100
else:   # Stop here so the report below never runs without a percentage
    raise SystemExit("Error with input dataset. No cities found.")

print(str(percentCitiesWithParkAndRide) + " percent of cities have two park and rides.")

The video below offers some line-by-line commentary on the structure of the above solution:

Transcript of "3B" video:

PRESENTER: This video shows one possible solution to Lesson 3, Practice Exercise B, which calculates the number of cities in the data set with at least two park and ride facilities using selections in ArcGIS. It also uses an update cursor to update a field indicating whether the city has at least two park and rides. The beginning of this script is very similar to Practice Exercise A, so I won't spend a lot of time on that. Lines 4 through 8 are pretty familiar from that exercise.

In lines 9 and 10, I'm setting up variables to reference some fields that I'm going to use. Part of the exercise is to update a field called HasTwoParkAndRides to be either true or false. And so I'm setting a variable for that.

And in line 10, I set up a variable for the city FIPS code. We're going to be querying each city one by one, and we need some field with unique values for each city. We might use the city name, but that could be dangerous in the case that you have a data set where two cities happen to have the same name. So we'll use the FIPS code instead.

In lines 11 and 12, I'm setting up some counters that will be used to count the number of cities we run into that have at least two park and rides, and then the number of cities total respectively. In line 15, I'm starting out by making a feature layer of just the park and ride facilities. So this is familiar from the previous exercise, where the first parameter I pass in is the path to the park and ride data set and then the second parameter is a name that I give this feature layer. Just for the purposes of this script, I'll call it ParkAndRideLayer.

And in line 18, I'm opening an update cursor on all of the city boundaries. So I'm going to loop through each city here and select the number of facilities that occur in each city and count those up. And so I'm going to be working with two fields using this cursor. And I pass those in as a tuple. That's cityIDStringField and parkAndRideField. And those are the variables that I set up earlier in the script in lines 9 and 10.

It might be helpful at this point to take a look at what we're doing conceptually in ArcMap. So the first thing we're going to do is for each city make an attribute query for that city FIPS. And so for example if I were to do this for the city of Seattle, I've set up this. The city FIPS code is this long number here.

And if I were to query that, I have one city. So I'm going to narrow down my cities feature layer to just have one thing in it. And then I'm going to do the selection by location where I select park and ride facilities that are completely within the layer of city boundary. So doing that, I would get these points back. And then I could count those or I could start looping through them and see if I hit at least two in order to determine if the city has at least two park and ride facilities.

Then I would go on to the next city and the next city and so on until I had worked my way through the entire data set. So there is some processing that will occur as you run the script. It might take a few seconds to run.
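As a rough, arcpy-free illustration of the per-city workflow described above, the sketch below stands in for the spatial selection with a simple point-in-rectangle test. The city extents, point coordinates, and helper names are all made up for illustration. Real city boundaries are polygons, and the actual script relies on a select-by-location operation rather than this test.

```python
# Hypothetical data: each "city" is a rectangle (xmin, ymin, xmax, ymax)
# and each park and ride is an (x, y) point. Real boundaries are polygons.
cities = {
    "A": (0, 0, 10, 10),
    "B": (20, 0, 30, 10),
    "C": (40, 0, 50, 10),
}
park_and_rides = [(1, 1), (5, 5), (9, 2), (25, 5), (100, 100)]


def contained_by(point, extent):
    """Return True if the point falls inside the rectangular extent."""
    x, y = point
    xmin, ymin, xmax, ymax = extent
    return xmin <= x <= xmax and ymin <= y <= ymax


def cities_with_at_least_two(cities, points):
    """Return the sorted IDs of cities containing two or more points."""
    qualifying = []
    for city_id, extent in cities.items():
        # Stand-in for the select-by-location step followed by a count
        count = sum(1 for point in points if contained_by(point, extent))
        if count >= 2:
            qualifying.append(city_id)
    return sorted(qualifying)


result = cities_with_at_least_two(cities, park_and_rides)
print(result)
```

Only city "A" contains two or more of the sample points, so it is the only ID returned.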

All right. So we are back on line 18 where we set up the update cursor. We call it cityRows. And then we can loop through each city in this. So that's what is done in line 19 with a for loop.

In line 21 we get the FIPS code for the city that's currently being looked at by the cursor. We use 0 there because that's the index position of the CityIDStringField that we put in the tuple in line 18. It was the first thing in the tuple, so it has index position 0.
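The link between the tuple of field names and the index positions in each row can be sketched without arcpy. The row values below are invented, and a plain list stands in for the row object that the cursor returns.

```python
# Field names in the same order they are passed to the cursor
fields = ("CI_FIPS", "HasTwoParkAndRides")

# A plain list standing in for one cursor row (values are made up)
city = ["1234567", None]

# Index 0 matches the first field name, index 1 the second
city_id = city[0]
city[1] = "TRUE"

# Looking the position up by name avoids hard-coding the numbers
flag_index = fields.index("HasTwoParkAndRides")
print(city_id + " " + city[flag_index])
```

If the order of the field names in the tuple changes, the index positions change with it, which is why the lookup by name can be a safer habit in longer scripts.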

And in line 22 we're setting up a string to query for that city FIPS. And so we need the field name equals the city ID. And the field name has to be in double quotes. The city ID has to be in single quotes. So that involves a lot of string concatenation to get that just right as we set this up.
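The quoting is easier to check in isolation. This sketch wraps the same concatenation in a small helper. The function name and the sample FIPS value are hypothetical, and str.format is used here only as a more readable alternative to chaining + operators.

```python
def build_city_query(field_name, city_id):
    """Build a where clause: field name in double quotes, value in single quotes."""
    return '"{0}" = \'{1}\''.format(field_name, city_id)


query = build_city_query("CI_FIPS", "1234567")
print(query)  # "CI_FIPS" = '1234567'
```

Printing the string before passing it to a tool is a quick way to spot a missing or mismatched quote.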

But once we have that string, we can make a feature layer that just has that one city within it. The way we do that is in line 25. And we call it MakeFeatureLayer. But instead of two parameters, we put in three.

The first two parameters are the same as when we made the feature layer for the park and rides. So we put in-- it's the same pattern. We put in the path to the data set. We put in a name for that feature layer. We're going to call this CurrentCityLayer.

But then the third parameter is that query string that we just created in line 22. And that causes our feature layer to just have the one city in it that's returned by that query.

And so now we're going to do the locational selection in line 30 to select just the park and rides that are contained by that city. The parameters here should be somewhat intuitive if you have that conceptual understanding of how the selection is working, where just one city boundary is selecting a subset of park and rides.

At that point you can call a GetCount tool and figure out how many park and rides were indeed selected. So that's what's going on in line 33. And remember with line 34 that you've got to call getOutput in order to get the result of the GetCount tool. You got exposed to that in the solution to Practice Exercise A.

So in the end what you have in line 34 is this variable called numSelectedParkAndRide. That's the number of park and ride facilities that fell within that city.

And in line 37 you can do a check to see if that number is greater than or equal to two. If it is, then we're going to set the value to true. So that's where we, using our cursor, go to the second item in the tuple up here, which has index position 1, and we set that equal to true. And we don't want to forget to call updateRow so that that gets applied to the data set.

And then in the end we can update our counter of cities that have at least two park and rides.

Finally, at the end of each pass through the loop, we delete the current city feature layer. We clean it up. And we increment the number of cities we've counted by one here. Because this sits in the finally block, we do that for every single city in the data set.

And then we clean up the park and ride layer in line 52. So notice there are multiple levels of indentation here showing the if statement, the try and finally blocks, and the loop that runs the cursor. And so we need to manage that carefully. If we go back and stop indenting, that means we're going out of the block of code, either out of the if statement or out of the try-finally block and so on.
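The behavior of a finally block inside a loop can be seen in a small sketch that does not need arcpy. The function name, the list of IDs, and the simulated failure are all assumptions made for illustration. The point is that the cleanup code runs on every pass, and that finally does not suppress an error raised in the try block.

```python
cleaned_up = []


def process_cities(city_ids):
    """Process each ID, recording cleanup for every one that is attempted."""
    for city_id in city_ids:
        try:
            if city_id is None:
                # Simulated failure inside the try block
                raise ValueError("City ID is missing")
        finally:
            # Runs on every pass, whether or not the try block failed
            cleaned_up.append(city_id)


try:
    process_cities(["A", None, "B"])
except ValueError as error:
    message = str(error)

print(cleaned_up)
```

Cleanup is recorded for "A" and for the failing entry, but "B" is never attempted because the error still propagates out of the loop once the finally block has run.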

So after doing the cleanup, now we're going to calculate the percentage of cities that have at least two park and rides. Now, you can't divide by zero, so we're doing a check in line 55 for that. But if everything checks out OK, which it would unless we had a major problem with our data set, we go ahead in line 56 and we run some division to divide the number of cities with two park and rides by the total number of cities. In the video for Exercise A I explain why we multiply by 1.0, so we can do some decimal division. And then we multiply by 100 to get a nice percentage figure.
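The division and the zero check can be tried on their own. The helper name and the counts below are made up for illustration. Returning None for an empty data set is one possible way to signal the problem, not necessarily what the script in the video does.

```python
def percent_with_two(cities_with_two, num_cities):
    """Return the percentage of qualifying cities, or None if there are no cities."""
    if num_cities == 0:
        return None
    # Multiplying by 1.0 forces decimal division even where int / int truncates
    return ((1.0 * cities_with_two) / num_cities) * 100


print(percent_with_two(3, 12))  # 25.0
print(percent_with_two(0, 0))   # None
```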

In lines 57 and 58 we have a little bit of error handling which would occur if the input data set was invalid and there were no cities in it. Otherwise we're going to go down to line 60 and print the percentage of cities with at least two park and rides. So give it a try.