GEOG 485:
GIS Programming and Software Development

3.2.2 Reading through records

PrintPrint

Now that you know how to traverse the table horizontally, reading the fields that are available, let's examine how to read up and down through the table records.

The arcpy module contains some objects called cursors that allow you to move through records in a table. There have been quite a few changes made to how cursors can be used over the different versions of ArcGIS, though the current best practice has been in place since version 10.1 of ArcGIS Desktop (aka ArcMap) and has carried over to ArcGIS Pro.  If you find yourself working with code that was written for a pre-10.1 version of ArcMap, we suggest you check out this same page in our old ArcMap version of this course (see the content navigation links on the right side of the Drupal site).  We will focus our discussion below on the current best practice in cursor usage, which you should be utilizing for scripts written in this course and in your workplace.  

The arcpy data access module

At version 10.1 of ArcMap, Esri released a new data access module, which offered faster performance along with more robust behavior when crashes or errors were encountered with the cursor.  The module contains different cursor classes for the different operations a developer might want to perform on table records -- one for selecting, or reading, existing records; one for making changes to existing records; and one for adding new records.  We'll discuss the editing cursors later, focusing our discussion now on the cursor used for reading tabular data, the search cursor

As with all geoprocessing classes, the SearchCursor class is documented in the Help system.  (Be sure when searching the Help system that you choose the SearchCursor class found in the Data Access module.  An older, pre-10.1 class is available through the arcpy module and appears in the Help system as an "ArcPy Function.")  

The common workflow for reading data with a search cursor is as follows:

  1. Create the search cursor. This is done through the method arcpy.da.SearchCursor(). This method takes several parameters in which you specify which dataset and, optionally, which specific rows you want to read.
  2. Set up a loop that will iterate through the rows until there are no more available to read.
  3. Within the loop, do something with the values in the current row.

Here's a very simple example of a search cursor that reads through a point dataset of cities and prints the name of each.

# Prints the name of each city in a feature class

import arcpy

featureClass = "C:\\Data\\USA\\USA.gdb\\Cities"

with arcpy.da.SearchCursor(featureClass,("NAME")) as cursor:
    for row in cursor:
        print (row[0])

Important points to note in this example:

  • The cursor is created using a "with" statement. Although the explanation of "with" is somewhat technical, the key thing to understand is that it allows your cursor to exit the dataset gracefully, whether it crashes or completes its work successfully. This is a big issue with cursors, which can sometimes maintain locks on data if they are not exited properly.
  • The "with" statement requires that you indent all the code beneath it. After you create the cursor in your "with" statement, you'll initiate a for loop to run through all the rows in the table. This requires additional indentation.
  • Creating a SearchCursor object requires specifying not just the desired table/feature class, but also the desired fields within it, as a tuple.  Remember that a tuple is a Python data structure much like a list, except it is enclosed in parentheses and its contents cannot be modified.  Supplying this tuple speeds up the work of the cursor because it does not have to deal with the potentially dozens of fields included in your dataset. In the example above, the tuple contains just one field, "NAME".
  • The "with" statement creates a SearchCursor object, and declares that it will be named "cursor" in any subsequent code. 
  • A for loop is used to iterate through the SearchCursor (using the "cursor" name assigned to it).  Just as in a loop through a list, the iterator variable can be assigned a name of your choice (here, it's called "row").
  • Retrieving values out of the rows is done using the index position of the field name in the tuple you submitted when you created the object. Since the above example submits only one item in the tuple, then the index position of "NAME" within that tuple is 0 (remember that we start counting from 0 in Python). 

Here's another example where something more complex is done with the row values. This script finds the average population for counties in a dataset. To find the average, you need to divide the total population by the number of counties. The code below loops through each record and keeps a running total of the population and the number of records counted. Once all the records have been read, only one line of division is necessary to find the average. You can get the sample data for this script here.

# Finds the average population in a counties dataset
import arcpy

featureClass = "C:\\Data\\Pennsylvania\\Counties.shp"
populationField = "POP1990"
nameField = "NAME"

average = 0
totalPopulation = 0
recordsCounted = 0

print("County populations:")

with arcpy.da.SearchCursor(featureClass, (nameField, populationField)) as countiesCursor:
    for row in countiesCursor:
        print (row[0] + ": " + str(row[1]))
        totalPopulation += row[1]

        recordsCounted += 1

average = totalPopulation / recordsCounted

print ("Average population for a county is " + str(average))

Differences between this example and the previous one:

  • The tuple includes multiple fields, with their names having been stored in variables near the top of the script.
  • The SearchCursor object goes by the variable name countiesCursor rather than cursor.

Before moving on, you should note that cursor objects have a couple of methods that you may find helpful in traversing their associated records.  To understand what these methods do, and to better understand cursors in general, it may help to visualize the attribute table with an arrow pointing at the "current row." When a cursor is first created, that arrow is pointing just above the first row in the table. When a cursor is included in a for loop, as in the above examples, each execution of the for statement moves the arrow down one row and assigns that row's values (a tuple) to the row variable.  If the for statement is executed when the arrow is pointing at the last row, there is not another row to advance to and the loop will terminate.  (The row variable will be left holding the last row's values.)

Imagine that you wanted to iterate through the rows of the cursor a second time.  If you were to modify the Cities example above, adding a second loop immediately after the first, you'd see that the second loop would never "get off the ground" because the cursor's internal pointer is still left pointing at the last row.  To deal with this problem, you could just re-create the cursor object.  However, a simpler solution would be to call on the cursor's reset() method.  For example:

cursor.reset()

This will cause the internal pointer (the arrow) to move just above the first row again, enabling you to loop through its rows again.

The other method supported by cursor objects is the next() method, which allows you to retrieve rows without using a for loop.  Returning to the internal pointer concept, a call to the next() method moves the pointer down one row and returns the row's values (again, as a tuple).  For example:

row = cursor.next()

An alternative means of iterating through all rows in a cursor is to use the next() method together with a while loop.  Here is the original Cities example modified to iterate using next() and while:

# Prints the name of each city in a feature class (using next() and while)

import arcpy

featureClass = "C:\\Data\\USA\\USA.gdb\\Cities"

with arcpy.da.SearchCursor(featureClass,("NAME")) as cursor:
    try:
        row = cursor.next()
        while row:
            print (row[0])
            row = cursor.next()
    except StopIteration:
        pass

Points to note in this script:

  • This approach requires invoking next() both prior to the loop and within the loop.
  • Calling the next() method when the internal pointer is pointing at the last row will raise a StopIteration exception.  The example prevents that exception from displaying an ugly error.

You should find that using a for loop is usually the better approach, and in fact, you won't see the next() method even listed in the documentation of arcpy.da.SearchCursor.  We're pointing out the existence of this method because a) older ArcGIS versions have cursors that can/must be traversed in this way, so you may encounter this coding pattern if looking at older scripts, and b) you may want to use the next() method if you're in a situation where you know the cursor will contain exactly one row (or you're interested in only the first row).

In each of the cursor examples discussed thus far, the cursors retrieved all rows from the specified feature class.  Continue reading to see how to retrieve just a subset of rows meeting certain criteria.