GEOG 485:
GIS Programming and Automation

2.1.4 String manipulation

You've previously learned how the string variable can contain numbers and letters and represent almost anything. When using Python with ArcGIS, strings can be useful for storing paths to data and printing messages to the user. There are also some geoprocessing tool parameters that you'll need to supply with strings.

Python has some very useful string manipulation abilities. We won't get into all of them in this course, but the following are a few techniques that you need to know.

Concatenating strings

To concatenate two strings means to append or add one string on to the end of another. For example, you could concatenate the strings "Python is " and "a scripting language" to make the complete sentence "Python is a scripting language." Since you are adding one string to another, it's intuitive that in Python you can use the + sign to concatenate strings.
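As a minimal sketch of the sentence example above (the variable names are just for illustration, and print is written with parentheses so the snippet runs the same way in Python 2 and Python 3):

```python
# Two separate strings; note the trailing space at the end of the first one
firstPart = "Python is "
secondPart = "a scripting language"

# The + sign joins each string onto the end of the one before it
sentence = firstPart + secondPart + "."

print(sentence)
```

Without the trailing space in "Python is ", the words "is" and "a" would run together, because concatenation adds nothing between the strings.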

You may need to concatenate strings when working with path names. Sometimes it's helpful or required to store one string representing the folder or geodatabase from which you're pulling datasets and a second string representing the dataset itself. You put both together to make a full path.

The following example, modified from one in the ArcGIS Help, demonstrates this concept. Suppose you already have a list of strings representing feature classes that you want to clip. The list is represented by "featureClassList" in this script:

# This script clips all datasets in a folder
import arcpy

inFolder = "c:\\data\\inputShapefiles\\"
resultsFolder = "c:\\data\\results\\"
clipFeature = "c:\\data\\states\\Nebraska.shp"

# List feature classes
arcpy.env.workspace = inFolder
featureClassList = arcpy.ListFeatureClasses()

# Loop through each feature class and clip
for featureClass in featureClassList:
    
    # Make the output path by concatenating strings
    outputPath = resultsFolder + featureClass
    # Clip the feature class
    arcpy.Clip_analysis(featureClass, clipFeature, outputPath)

String concatenation occurs in this line: outputPath = resultsFolder + featureClass. In other words, the feature class name is added to the end of the output folder string "c:\\data\\results\\". If the feature class name were "Roads.shp", the resulting output string would be "c:\\data\\results\\Roads.shp".

The above example shows that string concatenation can be useful in looping. Constructing the output path by using a set workspace or folder name followed by a feature class name from a list gives much more flexibility than trying to create output path strings for each dataset individually. You may not know how many feature classes are in the list or what their names are. You can get around that if you construct the output paths on the fly through string concatenation.
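If you want to watch the paths being built without running a geoprocessing tool, a rough sketch like the following may help. The list of names here is hypothetical and simply stands in for whatever arcpy.ListFeatureClasses() would return from your own folder:

```python
# Hypothetical stand-in for the list returned by arcpy.ListFeatureClasses()
featureClassList = ["Roads.shp", "Rivers.shp", "Cities.shp"]

resultsFolder = "c:\\data\\results\\"

# Build one output path per feature class, whatever the names happen to be
outputPaths = []
for featureClass in featureClassList:
    outputPath = resultsFolder + featureClass
    outputPaths.append(outputPath)
    print(outputPath)
```

The loop works the same way whether the list holds three names or three hundred. Each \\ in the code is an escaped backslash, so the printed paths show a single backslash between folder names.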

Casting to a string

Sometimes in programming you have a variable of one type that needs to be treated as another type. For example, 5 can be represented as a number or as a string. Python can only perform math on 5 if it is treated as a number, and it can only concatenate 5 onto an existing string if it is treated as a string.
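A small sketch of that difference, using illustrative variable names, shows how the same + sign behaves depending on the type:

```python
numberFive = 5      # an integer
stringFive = "5"    # a string that happens to contain the character 5

# With two numbers, + performs addition
total = numberFive + numberFive

# With two strings, + performs concatenation
joined = stringFive + stringFive

print(total)
print(joined)
```

Here total holds the number 10, while joined holds the string "55".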

Casting is a way of forcing your program to think of a variable as a different type. Create a new script in PythonWin and type or paste the following code:

x = 0
while x < 10:
    print x
    x += 1

print "You ran the loop " + x + " times."

Now try to run it. The script attempts to concatenate strings with the variable x to print how many times you ran the loop, but it results in an error: "TypeError: cannot concatenate 'str' and 'int' objects." Python doesn't have a problem when you want to print the variable x on its own, but it cannot use the + sign to join a string and an integer. To get the code to work, you have to cast the variable x to a string before concatenating it.

x = 0
while x < 10:
    print x
    x += 1

print "You ran the loop " + str(x) + " times."

You can force Python to treat x as a string by using str(x). This returns a string version of the value; x itself remains an integer. Python has other casting functions such as int() and float() that you can use if you need to go from a string to a number. Use int() for integers and float() for decimals.
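Going the other direction might look something like this sketch. The values are hypothetical; imagine they arrived as text, for example from a tool parameter or a text file:

```python
# Hypothetical values that arrived as strings
bufferText = "250"
scaleText = "1.5"

# Cast the strings to numbers so that math can be performed on them
bufferDistance = int(bufferText)      # whole number, so int()
scaleFactor = float(scaleText)        # decimal number, so float()

scaledDistance = bufferDistance * scaleFactor

# Cast the result back to a string to concatenate it into a message
message = "Scaled distance: " + str(scaledDistance)
print(message)
```

Note that int() raises a ValueError if the string contains a decimal point, as in int("1.5"), so use float() when the text might not be a whole number.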

Readings

It's time to take a break and do some readings from another source. If you are new to Python scripting this will help you see the concepts from a second angle.

Read Zandbergen chapters 4 - 6 (skip the parts of Chapter 5 that you have already read in Lesson 1). This can take a few hours but it will save you hours of time if you make sure you understand this material now.

  • Chapter 4 covers the basics of Python syntax, loops, strings and other things we just learned.
  • Chapter 5 talks about geoprocessing with arcpy. Read the sections we skipped in Lesson 1: 5.7 and 5.8.
  • Chapter 6 gives some specific instructions about working with ArcGIS datasets, which will be valuable during this week's assigned project.

If you still don't feel like you understand the material after reading the above chapters, don't re-read it just yet. Try some coding from the Lesson 2 practice exercises and assignments, then come back and re-read if necessary. If you are really struggling with a particular concept, type the examples in the interactive window. Programming is like a sport in the sense that you cannot learn all about it by reading; at some point you have to get up and do it.