Welcome to Geography 485. Over the next ten weeks you'll work through four lessons and a final project dealing with ArcGIS automation in Python. Each lesson will contain readings, examples, and projects. Since the lessons are two weeks long, you should plan between 20 - 24 hours of work to complete them, although this number may vary depending on your prior programming experience. See the Course Schedule section of this syllabus, below, for a schedule of the lessons and course projects.
As with GEOG 483 and GEOG 484, the lessons in this course are project-based with key concepts embedded within. However, because of the nature of computer programming, there is no way this course can follow the step-by-step instruction design of the previous courses. You will probably find the course to be more challenging than the others. For that reason, it is more important than ever that you stay on schedule and take advantage of the course message boards and private e-mail. It's quite likely that you will get stuck somewhere during the course, so before getting hopelessly frustrated, please seek help from me or your classmates!
I hope that by now you have reviewed our Orientation and Syllabus for an important course site overview. Before we begin our first project, let me share some important information about the textbook and a related Esri course.
The textbook for this course is Python Scripting for ArcGIS by Paul A. Zandbergen. This book came out in 2012 and has been a hot item among Esri software users; I suggest you order your copy immediately in case of shortages or delays.
Back when Geog 485 was rewritten as a Python course, there was no textbook available that tied together ArcGIS and Python scripting. As you read through Zandbergen's book, you'll see material that closely parallels what is in the Geog 485 lessons. This isn't necessarily a bad thing; when you are learning a subject like programming, it can be helpful to have the same concept explained from two angles.
My advice about the readings is this: Read the material on the Geog 485 lesson pages first. If you feel like you have a good understanding from the lesson pages, you can skim through some of the more lengthy Zandbergen readings. If you struggled with understanding the lesson pages, you should pay close attention to the Zandbergen readings and try some of the related code snippets and exercises. I suggest you plan about 1 - 2 hours per week of reading if you are going to study the chapters in detail.
In all cases, you should get a copy of the textbook because it is a relevant and helpful reference.
There is a free Esri Virtual Campus course, Python for Everyone, that introduces a lot of the same things you'll learn this term in Geog 485. The course consists of a series of short videos and exercises including some which might help towards the projects. If you want to get a head start, or you feel you want some reinforcement of what we're learning from a different point of view, it would be worth your time to complete this Virtual Campus course.
All you need in order to access this course is an Esri Global Account, which you can create for free. You do not need to obtain an access code from Penn State.
The course moves through ideas very quickly and covers a range of concepts that we'll spend 10 weeks studying in depth, so don't worry if you don't understand it all immediately or if it seems overwhelming. You might find it helpful to go through the course again near the end of Geog 485 to review what you've learned.
If you have any questions now or at any point during this week, please feel free to post them to the Lesson 1 Discussion Forum. (To access the forums, return to Canvas via the Canvas link. Once in Canvas, you can navigate to the Modules tab and then scroll to the Lesson 1 Discussion Forum.) While you are there, feel free to post your own responses if you, too, are able to help a classmate.
Now, let's begin Lesson 1.
This lesson is two weeks in length. (See the Calendar in Canvas for specific due dates.) To finish this lesson, you must complete the activities listed below. You may find it useful to print this page so that you can follow along with the directions.
Do items 1 - 3 (including any of the practice exercises you want to attempt) during the first week of the lesson. You will need the second week to concentrate on the project and quiz.
By the end of this lesson you should:
A geographic information system (GIS) can manipulate and analyze spatial datasets with the purpose of solving geographic problems. GIS analysts perform all kinds of operations on data to make it useful for solving a focused problem. This includes clipping, reprojecting, buffering, merging, mosaicking, extracting subsets of the data, and hundreds of other operations. In the ArcGIS software used in this course, these operations are known as geoprocessing and they are performed using tools.
Successful GIS analysis requires selecting the most appropriate tools to operate on your data. ArcGIS uses a toolbox metaphor to organize its suite of tools. You pick the tools you need and run them in the proper order to make your finished product.
Suppose you’re responsible for selecting sites for a chain restaurant. You might use one tool to select land parcels along a major thoroughfare, another tool to select parcels no smaller than 0.25 acres, and other tools for other selection criteria. If this selection process were limited to a small area, it would probably make sense to perform the work manually.
However, let’s suppose you’re responsible for carrying out the same analysis for several areas around the country. Because this scenario involves running the same sequence of tools for several areas, it is one that lends itself well to automation. There are several major benefits to automating tasks like this:
ArcGIS provides three ways for users to automate their geoprocessing tasks. These three options differ in the amount of skill required to produce the automated solution and in the range of scenarios that each can address.
The first option is to construct a model using ModelBuilder. ModelBuilder is an interactive program that allows the user to “chain” tools together, using the output of one tool as input in another. Perhaps the most attractive feature of ModelBuilder is that users can automate rather complex GIS workflows without the need for programming. You will learn how to use ModelBuilder early in this course.
Some automation tasks require greater flexibility than is offered by Model Builder, and for these scenarios it's recommended that you write short computer programs, or scripts. The bulk of this course is concerned with script writing.
A script typically executes some sequential procedure of steps. Within a script, you can run GIS tools individually or chain them together. You can insert conditional logic in your script to handle cases where different tools should be run depending on the output of the previous operation. You can also include iteration, or loops, in a script to repeat a single action as many times as needed to accomplish a task.
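The three ideas in this paragraph (sequential steps, conditional logic, and loops) can be seen in a few lines of plain Python. The following is only a minimal sketch: the area names and parcel sizes are made-up values, and the 0.25-acre threshold is borrowed from the restaurant site example earlier in the lesson. No ArcGIS tools are called.

```python
# Hypothetical study areas and parcel sizes in acres (made-up values).
parcels_by_area = {
    "Area A": [0.10, 0.30, 0.50],
    "Area B": [0.20, 0.25],
}

MIN_ACRES = 0.25  # parcels smaller than this are rejected


def count_suitable(parcel_sizes, min_acres):
    """Return how many parcels are at least min_acres in size."""
    suitable = 0
    for size in parcel_sizes:      # iteration: repeat for every parcel
        if size >= min_acres:      # conditional logic: keep or skip
            suitable += 1
    return suitable


# Sequential procedure: process each area in turn and record the result.
results = {}
for area in sorted(parcels_by_area):
    results[area] = count_suitable(parcels_by_area[area], MIN_ACRES)
    print("{0}: {1} suitable parcel(s)".format(area, results[area]))
```

In a real script, the body of the loop would run geoprocessing tools on each area's data instead of counting numbers in a list.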
There are special scripting languages for writing scripts, including Python, JScript, and Perl. Often these languages have more basic syntax and are easier to learn than other languages such as C, Java, or Visual Basic.
Although ArcGIS supports various scripting languages for working with its tools, Esri emphasizes Python in its documentation and includes Python with the ArcGIS installation. In this course we’ll be working strictly with Python for this reason, as well as the fact that Python can be used for many other file and data manipulation tasks outside of ArcGIS. You’ll learn the basics of the Python language, how to write a script, and how to manipulate and analyze GIS data using scripts. Finally, you’ll apply your new Python knowledge to a final project, where you write a script of your choosing that you may be able to apply directly to your work.
A third option available to ArcGIS users looking to automate geoprocessing is to build a solution using ArcObjects, the programming building blocks used by Esri’s own developers to produce the ArcGIS desktop products. With ArcObjects, it is possible to customize the user interface to include specific commands and tools that either go outside the abilities of the out-of-the-box ArcGIS tools or modify them to work in a more focused way. ArcObjects programming and interface customization are outside the scope of this course, but are covered in the GIS Application Development course, GEOG 489. GIS customization with ArcObjects can be an advanced endeavor, and learning a scripting language like Python is a good way to prepare yourself by learning basic programming concepts.
The tools that you run in ModelBuilder and Python actually use ArcObjects "under the hood" to run GIS functions; however, the advantage of Python scripting with ArcGIS is that you don't need to learn all the ArcObjects logic behind the tools. Your job is just to learn the tools and how to run them in the appropriate order to accomplish your task.
This first lesson will introduce you to concepts in both model building and script writing. We’ll start by just getting familiar with how tools run in ArcGIS and how you can use those tools in the ModelBuilder interface. Then, we’ll cover some of the basics of Python and see how the tools can be run within scripts.
The ArcGIS software that you use in this course contains hundreds of tools that you can use to manipulate and analyze GIS data. Back before ArcGIS had a graphical user interface (GUI), people would access these tools by typing commands. Nowadays, you can point and click your way through a whole hierarchy of toolboxes using ArcCatalog or the Catalog window in ArcMap.
Although you may have seen them before, let’s take a quick look at the toolboxes:
Let’s examine a tool. Expand Analysis Tools > Proximity > Buffer, and double-click the Buffer tool to open it.
You've probably seen this tool in past courses, but this time, really pay attention to the components that make up the user interface. Specifically, you’re looking at a dialog with many fields. Each geoprocessing tool has required inputs and outputs. Those are indicated by the green dots. They represent the minimum amount of information you need to supply in order to run a tool. For the Buffer tool, you’re required to supply an input features location (the features that will be buffered) and a buffer distance. You’re also required to indicate an output feature class location (for the new buffered features).
Many tools also have optional parameters. You can modify these if you want, but if you don’t supply them, the tool will still run using default values. For the Buffer tool, optional parameters are the Side Type, End Type, Dissolve Type, and Dissolve Fields. Optional parameters are typically specified after required parameters.
Click the Show Help button in the lower-right corner of the tool (if it says Hide Help then you’re already viewing help). You can now click on any parameter in the dialog to see an explanation of that parameter appear in the right-hand window.
If you’re not sure what a parameter means, this is a good way to learn. For example, with the help still open, click the Side Type input box on the Buffer tool (right where it says "FULL"). The Help explains what the Side Type parameter means and lists the different options: FULL, LEFT, RIGHT, and OUTSIDE_ONLY.
If you need even more help, each tool is more expansively documented in the ArcGIS Desktop Help (with Python examples!). You could go directly to the Buffer tool help by clicking the Tool Help button in the tool dialog box, but in this course you'll often want to get to these help pages without opening the tool itself. Below are the steps for doing so.
You can access ArcGIS geoprocessing tools in several different ways:
We’ll start with the simplest of these cases, running a tool from its GUI, and work our way up to scripting.
Let’s start by opening a tool from the Catalog window and running it using its graphical user interface (GUI).
Examine the first required parameter: Input Features. Click the Browse button and browse to the path of your cities dataset C:\WCGIS\Geog485\Lesson1\us_cities.shp. Notice that once you do this, a path is automatically supplied for the Output Feature Class. The software does this for your convenience only and you can change the path if you want.
A more convenient way to supply the Input Features is to just select the cities map layer from the dropdown menu. This dropdown automatically contains all the layers in your map document. However, in this example we browsed to the path of the data because it’s conceptually similar to how we’ll provide the paths in the command line and scripting environments.
When you work with geoprocessing, you’ll frequently want to use the output of one tool as the input into another tool. For example, suppose you want to find all fire hydrants within 200 meters of a building. You would first buffer the building, then use the output buffer as a spatial constraint for selecting fire hydrants. The output from the Buffer tool would be used as an input to the Select by Location tool.
A set of tools chained together in this way is called a model. Models can be simple, consisting of just a few tools, or complex, consisting of many tools and parameters and occasionally some iterative logic. Whether big or small, the benefit of a model is that it solves a unique geographic problem that cannot be addressed by one of the “out-of-the-box” tools.
In ArcGIS, modeling can be done either through the ModelBuilder graphical user interface (GUI) or through code, using Python. To keep our terms clear, we’ll refer to anything built in ModelBuilder as a “model” and anything built through Python as a “script.” However, it’s important to remember that both things are doing modeling.
ModelBuilder is Esri’s graphical interface for making models. You can drag and drop tools from the Catalog window into the model and “connect” them, specifying the order in which they should run.
Although this is primarily a programming course, we’ll spend some time in ModelBuilder during the first lesson for two reasons:
ModelBuilder is a nice environment for exploring the ArcGIS tools, learning how tool inputs and outputs are used, and visually understanding how GIS modeling works. When you begin using Python, you will not have the same visual assistance to see how the tools you’re using are connected, but you may still want to draw your model on a whiteboard in a similar fashion to what you saw in ModelBuilder.
ModelBuilder can frequently reduce the amount of Python coding that you need to do. If your GIS problem does not require advanced conditional and iterative logic, you may be able to get your work done in ModelBuilder without writing a script. ModelBuilder also allows you to export any model to Python code, so if you get stuck implementing some tools within a script, it may be helpful to make a simple working model in ModelBuilder, then export it to Python to see how ArcGIS would construct the code. (Exporting a complex model is not recommended for beginners due to the verbose amount of code that ModelBuilder tends to create when exporting Python).
Let’s get some practice with ModelBuilder to solve a real scenario. Suppose you are working on a site selection problem where you need to select all areas that fall within 10 miles of a major highway and 10 miles of a major city. The selected area cannot lie in the ocean or outside the United States. Solving the problem requires that you make buffers around both the roads and the cities, intersect the buffers, then clip to the US outline. Instead of manually opening the Buffer tool twice, followed by the Intersect tool, then the Clip tool, you can set this up in ModelBuilder to run as one process.
Click OK to dismiss the Model Properties dialog.
You now have a blank canvas on which you can drag and drop the tools. When creating a model (and when writing Python scripts), it’s best to break your problem into manageable pieces. The simple site selection problem here can be thought of as four steps:
Let’s tackle these items one at a time, starting with buffering the cities.
Click the Buffer tool and drag it onto the ModelBuilder canvas. You’ll see a white rectangular box representing the buffer tool and a white oval representing the output buffers. These are connected with a line, showing that the Buffer tool will always produce an output dataset.
In ModelBuilder, tools are represented with boxes and variables are represented with ovals. Right now, the Buffer tool, at center, is white because you have not yet supplied the required parameters. Once you do this, the tool and the variable will fill in with color.
An important part of working with ModelBuilder is supplying clear labels for all the elements. This way, if you share your model, others can easily understand what will happen when it runs. Supplying clear labels also helps you remember what the model does, especially if you haven’t worked with the model for a while.
In ModelBuilder, right-click the us_cities.shp element (blue oval, at far left) and click Rename. Name this element "US Cities."
Right-click the us_citiesBuffer1.shp element (green oval, at far right) and click Rename. Name this “Buffered cities.” Your model should look like this.
Practice what you just learned by adding another Buffer tool to your model. This time, configure the tool so that it buffers the us_roads shapefile by 10 miles. Remember to set the Dissolve type to ALL and to add meaningful labels. Your model should now look like this.
Rename the output of the Intersect operation "Intersected buffers." If the text runs onto multiple lines, you can click and drag the edges of the element to resize it. You can also rearrange the elements on the page however you like. Because models can get large, ModelBuilder contains several navigation buttons for zooming in and zooming to the full extent of the model. Your model should now look like this:
Set meaningful labels for the remaining tools. Below is an example of how you can label and arrange the model elements.
When the model has finished running (it may take a while), examine the output in ArcMap. Zoom in to Washington state to verify that the Clip has worked on the coastal areas. The output should look similar to this.
That’s it! You’ve just used ModelBuilder to chain together several tools and solve a GIS problem.
You can double-click this model any time in the Catalog window and run it just as you would a tool. If you do this, you’ll notice that the model has no parameters; you can’t change the buffer distance or input features. The truth is, our model is useful for solving this particular site-selection problem with these particular datasets, but it’s not very flexible. In the next section of the lesson, we’ll make this model more versatile by configuring some of the variables as input and output parameters.
Most tools, models, and scripts that you create with ArcGIS have parameters. Input parameters are values with which the tool (or model or script) starts its work, and output parameters represent what the tool gives you after its work is finished.
A tool, model, or script without parameters is only good in one scenario. Consider the model you just built that used the Buffer, Intersect, and Clip tools. This model was hard-coded to use the us_cities, us_roads, and us_boundaries shapefiles and output a shapefile called suitable_land. In other words, if you wanted to run the model with other datasets, you would have to open ModelBuilder, double-click each element (US Cities, US Roads, US Boundaries, and Suitable land), and change the paths that were written directly into the model. You would have to follow a similar process if you wanted to change the buffer distances, too, since those were hard-coded to 10 miles.
Let’s modify that model to use some parameters, so that you can easily run it with different datasets and buffer distances.
Even though you "parameterized" the cities, your model still defaults to using the C:\WCGIS\Geog485\Lesson1\us_cities.shp dataset. This isn't going to make much sense if you share your model or toolbox with other people because they may not have the same us_cities shapefile, and even if they do, it probably won't be sitting at the same path on their machines.
To remove the default dataset, double-click the Cities element and delete the path, then click OK. Some of the elements in your model may turn white. This signifies that a value has to be provided before the model can successfully run.
Double-click your model Lesson 1 > Find Suitable Land With Parameters and examine the tool dialog. It should look similar to this:
People who run this model will be able to browse to any cities, roads, and boundaries datasets, and will be able to control the buffer distance. The green dots indicate parameters that must be supplied with valid values before the model can run.
The above exercise demonstrated how you can expose values as parameters using ModelBuilder. You need to decide which values you want the user to be able to change and designate those as parameters. When you write Python scripts, you'll also need to identify and expose parameters in a similar way.
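As a rough illustration of what exposing parameters can look like in a script, the sketch below reads a dataset path and a buffer distance from a list of command-line style arguments instead of hard-coding them. The function name read_parameters and the example paths are hypothetical; the default of 10 miles mirrors the value used in the model.

```python
def read_parameters(argv):
    """Pull the cities path and buffer distance out of a list of arguments.

    argv[0] is the script name, in the same layout as sys.argv. The buffer
    distance is optional and falls back to the 10 miles used in the model.
    """
    if len(argv) < 2:
        raise ValueError("usage: script.py <cities_path> [buffer_distance]")
    cities_path = argv[1]
    buffer_distance = argv[2] if len(argv) > 2 else "10 Miles"
    return cities_path, buffer_distance


# Simulated command lines; a real script would pass sys.argv instead.
print(read_parameters(["script.py", "C:/data/my_cities.shp"]))
print(read_parameters(["script.py", "C:/data/my_cities.shp", "25 Miles"]))
```

The design choice is the same one you made in ModelBuilder: anything the user should be able to change becomes an argument, and everything else stays inside the script. arcpy also provides its own function for reading tool parameters, arcpy.GetParameterAsText, which serves the same purpose when a script is run as an ArcGIS tool.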
By now you've had some practice with ModelBuilder and you're about ready to get started with Python. This page of the lesson contains some optional advanced material that you can read about ModelBuilder. This is particularly helpful if you anticipate using ModelBuilder frequently in your employment. Some of the items are common to the ArcGIS geoprocessing framework, meaning that they also apply when writing Python scripts with ArcGIS.
GIS analysis sometimes gets messy. Most of the tools that you run produce an output dataset, and when you chain many tools together those datasets start piling up on disk. Even if you're diligent about naming your datasets intuitively, it's easy to wind up with a folder full of datasets with names like buffers1, clippedbuffers1, intersectedandclippedbuffers1, raster2reclassified, etc.
In most cases, you are concerned with just the final output dataset. The intermediate data is just temporary; you only need to keep it around for as long as it takes to run the model, and then it can be deleted.
ModelBuilder can manage your intermediate data for you, placing it in a temporary directory called the scratch workspace. By default, the scratch workspace is your operating system's temp directory, but you can configure it to exist in another location.
You can force data to go into the scratch workspace by using the %SCRATCHWORKSPACE% variable in the path. For example: %SCRATCHWORKSPACE%\myOutput.shp
You can also mark any element in ModelBuilder as Intermediate and it will be deleted after the model is run. By default, all derived data is Intermediate.
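In a Python script, the same idea of sending intermediate data to a temporary location can be sketched with the standard library. This example is an approximation only: it uses the operating system's temp directory as a stand-in for the scratch workspace and does not touch any ArcGIS settings. (In arcpy, the corresponding setting is arcpy.env.scratchWorkspace.)

```python
import os
import tempfile

# Stand-in for the scratch workspace: the operating system's temp directory.
scratch_workspace = tempfile.gettempdir()

# Equivalent of %SCRATCHWORKSPACE%\myOutput.shp, built without hard-coding
# the folder or the path separator.
intermediate_path = os.path.join(scratch_workspace, "myOutput.shp")
print(intermediate_path)
```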
The following topics from Esri go into more detail on intermediate data and are important to understand as you work with the geoprocessing framework. I suggest reading them once now and returning to them occasionally throughout the course. Some of the concepts in them are easier to understand once you've worked with geoprocessing for a while.
Looping, or iteration, is the act of repeating a process. A main benefit of computers is their ability to quickly repeat tasks that would otherwise be mundane, cumbersome, or error-prone for a human to repeat and record. Looping is a key concept in computer programming and you will use it often as you write Python scripts for this course.
ModelBuilder contains a number of elements called Iterators that can do looping in various ways. The names of these iterators, such as For and While, actually mimic the types of looping that you can program in Python and other languages. In this course, we'll focus on learning iteration in Python, which may actually be just as easy as learning how to use a ModelBuilder iterator.
To take a peek at how iteration works in ModelBuilder, you can visit the ArcGIS Desktop help book for model iteration. If you're having trouble understanding looping in later lessons, ModelBuilder might be a good environment to visualize what a loop does. You can come back and visit this book as needed.
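For comparison, here is a minimal sketch of the two loop types in Python. The dataset names come from this lesson's data, but the "_buffered" suffix and the distances are made up for illustration, and nothing is actually processed.

```python
# Names of datasets to loop over.
datasets = ["us_cities", "us_roads", "us_boundaries"]

# For loop: repeat once per item in a collection.
processed = []
for name in datasets:
    processed.append(name + "_buffered")
print(processed)

# While loop: repeat for as long as a condition stays true.
distance = 10
distances_tried = []
while distance <= 30:
    distances_tried.append(distance)
    distance += 10
print(distances_tried)
```

A For loop suits cases where you know the items in advance; a While loop suits cases where you keep going until some condition changes.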
Read Zandbergen Chapter 2.1 - 2.9 to reinforce what you learned about geoprocessing and ModelBuilder.
The best way to introduce Python may be to look at a little bit of code. Let’s take the Buffer tool which you recently ran from the ArcToolbox GUI and run it in the ArcGIS Python window. This window allows you to type a simple series of Python commands without writing full permanent scripts. The Python Window is a great way to get a taste of Python.
This time, we’ll make buffers of 15 miles around the cities.
Type the following in the Python window (Don't type the >>>. These are just included to show you where the new lines begin in the Python window.)
>>> import arcpy
>>> arcpy.Buffer_analysis("us_cities", "us_cities_buffered", "15 miles", "", "", "ALL")
Zoom in and examine the buffers that were created.
You’ve just run your first bit of Python. You don’t have to understand everything about the code you wrote in this window, but here are a few important things to note.
The first line of the script import arcpy tells the Python interpreter (which was installed when you installed ArcGIS) that you’re going to work with some special scripting functions and tools included with ArcGIS. Without this line of code, Python knows nothing about ArcGIS, so you'll put it at the top of all ArcGIS-related code that you write in this class. You technically don't need this line when you work with the Python window in ArcMap because arcpy is already imported, but I wanted to show you this pattern early; you'll use it in all the scripts you write outside the Python window.
The second line of the script actually runs the tool. You can type arcpy, plus a dot, plus any tool name to run a tool in Python. Notice here that you also put an underscore followed by the name of the toolbox that includes the Buffer tool. This is necessary because some tools in different toolboxes actually have the same name (like Clip, which is a tool for clipping vectors in the Analysis toolbox or a tool for clipping rasters in the Data Management toolbox).
After you typed arcpy.Buffer_analysis, you typed all the parameters for the tool. Each parameter was separated by a comma, and the whole list of parameters was enclosed in parentheses. Get used to this pattern, since you'll follow it with every tool you run in this course.
In this code, we also supplied some optional parameters, leaving empty quotes where we wanted to take the default values, and truncating the parameter list at the final optional parameter we wanted to set.
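To see which value in the Buffer call fills which parameter, it can help to line the two up. The sketch below does this in plain Python without running the tool. The parameter names are the ones I understand the Buffer tool's documented syntax to use; treat them as an assumption and confirm them against the tool reference.

```python
# Parameter names for the Buffer tool, in the order the tool expects them.
# The first three are required; the rest are optional.
parameter_names = [
    "in_features",
    "out_feature_class",
    "buffer_distance_or_field",
    "line_side",
    "line_end_type",
    "dissolve_option",
]

# The values typed in the Python window, in the same order.
values = ["us_cities", "us_cities_buffered", "15 miles", "", "", "ALL"]

# Pair each value with the parameter it fills.
for name, value in zip(parameter_names, values):
    shown = value if value != "" else "(empty string: use the default)"
    print("{0:<26} {1}".format(name, shown))

# Tool function names follow the pattern <ToolName>_<toolbox alias>.
tool_function = "{0}_{1}".format("Buffer", "analysis")
print(tool_function)
```

The two empty strings are placeholders: they hold the positions of the side type and end type so that "ALL" lands in the dissolve position.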
How do you know the syntax, or structure, of the parameters to enter? For example, for the buffer distance, should you enter 15MILES, '15MILES', 15 Miles, or '15 Miles'? The best way to answer questions like these is to return to the Geoprocessing tool reference help topic for the Buffer tool. All of the topics in this reference section have a command line usage and example section to help you understand how to structure the parameters. Optional parameters are enclosed in braces, while the required parameters are not. From the example in this topic, you can see that the buffer distance should be specified as '15 miles'. Because this value is text, or a string, you need to surround it with quotes; in Python, either single or double quotes will work.
You might have noticed that the Python window helps you by popping up different options you can type for each parameter. This is called autocompletion, and it can be very helpful if you're trying to run a tool for the first time and you don't know exactly how to type the parameters.
There are a couple of differences between writing code in the Python window and writing code in some other program, such as Notepad or PythonWin. In the Python window, you can reference layers in the map document by their names only, instead of their file paths. Thus, we were able to type "us_cities" instead of something like "C:\\data\\us_cities.shp". We were also able to make up the name of a new layer "us_cities_buffered" and get it added to the map by default after the code ran. If you're going to use your code outside the Python window, make sure you use the full paths.
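The doubled backslashes in the path above are there because a single backslash has a special meaning inside a Python string. The following sketch shows three common ways to write a Windows path; the C:\data folder is only an example location, and whether a given tool accepts the forward-slash form is worth checking in its documentation.

```python
# Three ways to write the same Windows path as a Python string.
escaped = "C:\\data\\us_cities.shp"   # each backslash is doubled
raw = r"C:\data\us_cities.shp"        # raw string: backslashes taken literally
forward = "C:/data/us_cities.shp"     # forward slashes need no escaping

print(escaped)
print(raw)
print(escaped == raw)

# A single backslash can silently turn into a special character.
# Backslash followed by n means "newline", so this is NOT a usable path.
broken = "C:\new_cities.shp"
print(repr(broken))
```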
When you write more complex scripts, it will be helpful to use an integrated development environment (IDE), meaning a program specifically designed to help you write and test Python code. Later in this course we’ll explore the PythonWin IDE.
Earlier in this lesson you saw how tools can be chained together to solve a problem using ModelBuilder. The same can be done in Python, but it’s going to take a little groundwork to get to that point. For this reason we’ll spend the rest of Lesson 1 covering some of the basics of Python.
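As a preview of where this is heading, here is a rough sketch of the site selection model from earlier in the lesson expressed as a chain of tool calls. It is an illustration under assumptions, not the course's solution: the function names and the intermediate file names are made up, and the tool arguments follow the order I understand the Buffer, Intersect, and Clip tools to use. The part that needs ArcGIS is kept in a separate function, so the rest can be run and inspected on any machine.

```python
import os


def build_site_selection_steps(cities, roads, boundary, out_folder,
                               distance="10 Miles"):
    """Return the four tool calls as (tool function name, arguments) pairs.

    Output names other than suitable_land.shp are made up for illustration.
    """
    cities_buf = os.path.join(out_folder, "cities_buffer.shp")
    roads_buf = os.path.join(out_folder, "roads_buffer.shp")
    intersected = os.path.join(out_folder, "intersected_buffers.shp")
    suitable = os.path.join(out_folder, "suitable_land.shp")
    return [
        # 1. Buffer the cities, dissolving overlapping buffers.
        ("Buffer_analysis", [cities, cities_buf, distance, "", "", "ALL"]),
        # 2. Buffer the roads the same way.
        ("Buffer_analysis", [roads, roads_buf, distance, "", "", "ALL"]),
        # 3. Intersect the two buffer outputs.
        ("Intersect_analysis", [[cities_buf, roads_buf], intersected]),
        # 4. Clip the intersection to the boundary.
        ("Clip_analysis", [intersected, boundary, suitable]),
    ]


def run_steps(steps):
    """Run each step with arcpy. Requires a machine with ArcGIS installed."""
    import arcpy  # imported here so the rest of the sketch runs without ArcGIS
    for tool_name, arguments in steps:
        getattr(arcpy, tool_name)(*arguments)


steps = build_site_selection_steps(
    "us_cities.shp", "us_roads.shp", "us_boundaries.shp", "output")
for tool_name, arguments in steps:
    print("{0}: {1}".format(tool_name, arguments))
```

Notice how the output path of each step appears again as an input of a later step. That is the same "chaining" you drew with connector lines in ModelBuilder.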
Take a few minutes to read Zandbergen Chapter 3, a fairly short chapter where he explains the Python window and some things you can do with it.
Python is a language that is used to automate computing tasks through programs called scripts. In the introduction to this lesson, you learned that automation makes work easier, faster, and more accurate. This applies to GIS and many other areas of computer science. Learning Python will make you a more effective GIS analyst, but Python programming is a technical skill that can be beneficial to you even outside the field of GIS.
Python is a good language for beginning programming. Python is a high-level language, meaning you don’t have to understand the “nuts and bolts” of how computers work in order to use it. Python syntax (how the code statements are constructed) is relatively simple to read and understand. Finally, Python requires very little overhead to get a program up and running.
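To give a sense of how little overhead is involved, the following three lines are a complete Python program. The city name is just an example value.

```python
# A complete Python program: no setup code or boilerplate is required.
city = "State College"          # example value
message = "Hello from " + city
print(message)
```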
Python is an open-source language and there is no fee to use it or deploy programs with it. Python can run on Windows, Linux, and Unix operating systems.
In ArcGIS, Python can be used for coarse-grained programming, meaning that you can use it to easily run geoprocessing tools such as the Buffer tool that we just worked with. You could code all the buffer logic yourself, using more detailed, fine-grained programming with ArcObjects, but this would be time consuming and unnecessary in most scenarios; it’s easier just to call the Buffer tool from a Python script using one line of code.
In addition to the Esri help, which describes all of the parameters of a function and how to access them from Python, you can also get the Python syntax (the structure of the language) for a tool as follows:
1. Run the tool interactively (e.g. buffer) with your input data, output data and any other relevant parameters (e.g. distance to buffer)
2. Go to the Geoprocessing -> Results window and right click the completed tool run.
3. Pick "Copy Python Snippet"
4. Paste the code into PythonWin or the Python code window to see how you would code the same operation you just ran in Desktop in Python.
If you installed the student version of ArcGIS, you should already have Python on your computer, typically in a folder called something like C:\Python27\ArcGIS10.2\. You can write Python code at any time in Notepad or other editors and save it as a .py file, but you need to have Python installed in order for your computer to understand and run the program.
In this course we’ll be working with Python version 2.7.x. If you check out the download page for Python from its home page at www.python.org, you’ll see that there are actually higher versions of Python available. Python versions 3 and above contain some major changes that have taken some time for the Python user community and Esri to adopt.
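If you are unsure which version of Python a given environment is running, you can ask Python itself. This is a small sketch using the standard sys module; the messages it prints are my own wording.

```python
import sys

# sys.version_info holds the running interpreter's version as numbers.
major, minor = sys.version_info[0], sys.version_info[1]
print("Running Python {0}.{1}".format(major, minor))

if (major, minor) == (2, 7):
    print("This matches the version used in the course.")
else:
    print("This differs from the course version (2.7); some syntax may vary.")
```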
Python comes with a simple default editor called IDLE; however, in this course you’ll use the PythonWin integrated development environment (IDE) to help you write code. PythonWin is free, has basic debugging capabilities, and is included with ArcGIS. The only catch is that it is not installed by default with ArcGIS; you have to install it manually by following the steps below. If you do not have the DVD, you can download PythonWin here and execute the downloaded file to start the installation. Please make sure you use the "win32" version for Python 2.7 (pywin32-219.win32-py2.7.exe), not the "amd64" executables, which install the 64-bit version of PythonWin; that version is only compatible with ArcGIS for Server, not ArcGIS for Desktop. Once you have downloaded PythonWin, continue with step 6 from the list below.
On Windows Vista or Windows 7, if you see error messages during install, it’s likely that you did not run the install as an Administrator. When you launch the install, make sure you right-click and choose Run as Administrator.
Here’s a brief explanation of the main parts of PythonWin. Before you begin reading, open PythonWin so you can follow along.
When PythonWin opens, you’ll see what’s known as the Interactive Window. You can type a line of Python at the >>> prompt and it will immediately execute and print the result, if there is a printable result. The Interactive Window can be a good place to practice with Python in this course, and whenever you see some Python code next to the >>> prompt in the lesson materials, this means you can type it in the Interactive Window to follow along. In these ways, the Interactive Window is very similar to the Python window in ArcGIS.
To actually write a new script, click File > New and choose Python Script. Notice a blank page opens that looks a whole lot like Notepad. However, the nice thing about this interface is that the code is color-coded and the default font, Courier, is one typically used by programmers. Spacing and indentation, which are important in Python, are also easy to keep track of in this interface.
The Standard toolbar contains tools for loading, running, and saving scripts. This toolbar is visible by default. Notice the Undo / Redo buttons, which can be useful to you as a programmer if you start coding something and realize you’ve gone down the wrong path, or if you delete a line of code and want to get it back. Also notice the Run button, which looks like a little running person. This is a good way to test your scripts without having to double-click the file in Windows Explorer.
The Debugging toolbar contains tools for carefully reviewing your code line-by-line to help you detect errors. This toolbar is visible by clicking View > Toolbars > Debugging. The Debugging toolbar is extremely valuable to you as a programmer and you’ll learn how to use it later in this course. This toolbar is one of the main reasons to use an Integrated Development Environment (IDE) instead of writing your code in a simple text editor like Notepad.
It’s time to practice some beginning programming concepts that will help you write simple scripts in Python by the end of Lesson 1. We’ll start by looking at variables.
Remember your first introductory algebra class where you learned that a letter could represent any number, like in the statement x + 3? This may have been your first exposure to variables. (Sorry if the memory is traumatic!) In computer science, variables represent values or objects you want the computer to store in its memory for use later in the program.
Variables are frequently used to represent not only numbers, but also text and “Boolean” values (‘true’ or ‘false’). A variable might be used to store input from the program’s user, to store values returned from another program, to represent constant values, and so on.
Variables make your code readable and flexible. If you hard-code your values, meaning that you always use the literal value, your code is useful only in one particular scenario. You could manually change the values in your code to fit a different scenario, but this is tedious and exposes you to greater risk of making a mistake (suppose you forget to change a value). Variables, on the other hand, allow your code to be useful in many scenarios and are easy to parameterize, meaning you can let users change the values to whatever they need.
To see some variables in action, open PythonWin and type this in the Interactive Window:
>>> x = 2
You’ve just created, or declared, a variable, x, and set its value to 2. In statically typed programming languages, such as Java, you would be required to tell the program that you were creating a numerical variable, but Python works this out on its own when it sees the 2.
When you hit Enter, nothing happens, but the program now has this variable in memory. To prove this, type:
>>> x + 3
You see the answer of this mathematical expression, 5, appear immediately in the Interactive Window, proving that your variable was remembered and used.
You can also use the print command to write the results of operations. We’ll use this a lot when practicing and testing code.
>>> print x + 3
5
Variables can also represent words, or strings, as they are referred to by programmers. Try typing this in the Interactive Window:
>>> myTeam = "Nittany Lions"
>>> print myTeam
Nittany Lions
In this example, the quotation marks tell Python that you are declaring a string variable. Python is a powerful language for working with strings. A very simple example of string manipulation is to add, or concatenate, two strings, like this:
>>> string1 = "We are "
>>> string2 = "Penn State!"
>>> print string1 + string2
We are Penn State!
You can include a number in a string variable by putting it in quotes, but you must thereafter treat it like a string; you cannot treat it like a number. For example, this results in an error:
>>> myValue = "3"
>>> print myValue + 2
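If you do need to do arithmetic with a number that is stored as text, one option is to convert it first. The short sketch below uses Python's built-in int() and str() functions. The variable names total and label are made up for illustration, and print is written with parentheses so that the lines behave the same way in Python 2.7 and Python 3.

```python
# A number stored as text, as in the example above
myValue = "3"

# Convert the string to an integer before doing arithmetic
total = int(myValue) + 2
print(total)

# Or convert the number to a string before concatenating
label = myValue + str(2)
print(label)
```

The first print shows the sum, while the second shows the two characters joined together as text.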
In these examples you’ve seen the use of the = sign to assign the value of the variable. You can always reassign the variable. For example:
>>> x = 5
>>> x = x - 2
>>> print x
3
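Because Python works out the type from the value, reassigning a variable can also change the kind of data it holds. Here is a small sketch that uses the built-in type() function to check what a variable currently contains. The names kindBefore and kindAfter are only for illustration.

```python
x = 5
kindBefore = type(x).__name__   # name of the type x holds right now

x = "five"                      # reassign x to a string
kindAfter = type(x).__name__

print(kindBefore)
print(kindAfter)
```

Changing the type of a variable partway through a script is allowed, but it can make code harder to follow, so it is usually clearer to use a new variable name.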
When naming your variables, the following tips will help you avoid errors.
Make variable names meaningful so that others can easily read your code. This will also help you read your code and avoid making mistakes.
You’ll get plenty of experience working with variables throughout this course and will learn more in future lessons.
Read Zandbergen chapter 4.5 (Variables and naming).
The number and string variables that we worked with above represent data types that are built into Python. Variables can also represent other things, such as GIS datasets, tables, rows, and the geoprocessor that we saw earlier that can run tools. All of these things are objects that you use when you work with ArcGIS in Python.
In Python, everything is an object. All objects have:
One way to understand objects is to compare performing an operation in a procedural language (like FORTRAN) to performing the same operation in an object-oriented language. We'll pretend that we are writing a program to make a peanut butter and jelly sandwich. If we were to write the program in a procedural language, it would flow something like this:
If we were to write the program in an object-oriented language, it might look like this:
In the object-oriented example, the bulk of the steps have been eliminated. The sandwich object "knows how" to build itself, given just a few pieces of information. This is an important feature of object-oriented languages known as encapsulation.
Notice that you can define the properties of the sandwich (like the bread type) and perform methods (remember that these are actions) on the sandwich, such as adding the peanut butter and jelly.
The reason it’s so easy to "make a sandwich" in an object-oriented language is that some programmer, somewhere, already did the work to define what a sandwich is and what you can do with it. He or she did this using a class. A class defines how to create an object, the properties and methods available to that object, how the properties are set and used, and what each method does.
A class may be thought of as a blueprint for creating objects. The blueprint determines what properties and methods an object of that class will have. A common analogy is that of a car factory. A car factory produces thousands of cars of the same model that are all built on the same basic blueprint. In the same way, a class produces objects that have the same predefined properties and methods.
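To make the blueprint idea concrete, here is a minimal sketch of what a sandwich class could look like in Python. The Sandwich class, its breadType property, and its addFilling method are all hypothetical names invented for this illustration; they are not part of Python or ArcGIS.

```python
class Sandwich(object):
    """Hypothetical blueprint for sandwich objects."""

    def __init__(self, breadType):
        # Properties describe the object
        self.breadType = breadType
        self.fillings = []

    def addFilling(self, filling):
        # Methods are actions the object can perform
        self.fillings.append(filling)


# Create two separate objects from the same class (blueprint)
lunch = Sandwich("wheat")
lunch.addFilling("peanut butter")
lunch.addFilling("jelly")

snack = Sandwich("rye")

print(lunch.breadType)
print(lunch.fillings)
print(snack.fillings)
```

Both objects come from the same blueprint, yet each keeps its own property values: adding fillings to lunch leaves snack unchanged.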
In Python, classes are grouped together into modules. You import modules into your code to tell your program what objects you’ll be working with. You can write modules yourself, but most likely you'll bring them in from other parties or software packages. For example, the first line of most scripts you write in this course will be:
import arcpy
Here you're using the import keyword to tell your script that you’ll be working with the arcpy module, which is provided as part of ArcGIS. After importing this module, you can create objects that leverage ArcGIS in your scripts.
Other modules that you may import in this course are os (allows you to work with the operating system), random (allows for generation of random numbers), csv (allows for reading and writing of spreadsheet files in comma-separated value format), and math (allows you to work with advanced math operations). These modules are included with Python, but they aren't imported by default. A best practice for keeping your scripts fast is to import only the modules that you need for that particular script. For example, although it might not cause any errors in your script, you wouldn't include import arcpy in a script not requiring any ArcGIS functions.
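As a quick illustration that does not require ArcGIS, the sketch below imports two of the standard modules mentioned above and uses one function from each. The file path is a made-up example; only the file name comes from the Lesson 1 data.

```python
import math
import os

# math provides functions beyond the basic arithmetic operators
hypotenuse = math.sqrt(3 ** 2 + 4 ** 2)
print(hypotenuse)

# os.path helps pull apart file paths
# (the folder in this path is hypothetical)
fileName = os.path.basename("C:/Data/Lesson1/Precip2008Readings.shp")
print(fileName)
```

Notice that each function is called through its module name (math.sqrt, os.path.basename), which makes it easy to see where a piece of functionality comes from.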
Read Zandbergen chapter 5.8 (Classes) for more information about classes.
Another important feature of object-oriented languages is inheritance. Classes are arranged in a hierarchical relationship such that each class inherits its properties and methods from the class above it in the hierarchy (its parent class or superclass). A class also passes along its properties and methods to the class below it (its child class or subclass). A real-world analogy involves the classification of animal species. As a species, we have many characteristics that are unique to humans. However, we also inherit many characteristics from classes higher in the class hierarchy. We have some characteristics as a result of being vertebrates. We have other characteristics as a result of being mammals. To illustrate the point, think of the ability of humans to run. Our bodies respond to our command to run not because we belong to the "human" class, but because we inherit that trait from some class higher in the class hierarchy.
Back in the programming context, the lesson to be learned is that it pays to know where a class fits into the class hierarchy. Without that piece of information, you will be unaware of all of the operations available to you. This information about inheritance can often be found in informational posters called object model diagrams.
Here's an example of an object model diagram for the ArcGIS Python Geoprocessor at 10.x. Take a look at the green(ish) box titled FeatureClass Properties and notice that in the middle column, second from the top, it says Dataset Properties. This is because FeatureClass inherits all properties from Dataset. Therefore, any properties on a Dataset object, such as Extent or SpatialReference, can also be obtained if you create a FeatureClass object. Apart from all the properties it inherits from Dataset, the FeatureClass has its own specialized properties such as FeatureType and ShapeType (in the top box in the left column).
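Inheritance can be sketched in a few lines of plain Python. The two classes below are deliberately simplified, made-up stand-ins (ToyDataset and ToyFeatureClass); they are not the real ArcGIS classes and only imitate the parent/child relationship described above. The property values are example strings.

```python
class ToyDataset(object):
    """Made-up parent class; not the real ArcGIS Dataset."""

    def __init__(self, spatialReference):
        self.spatialReference = spatialReference


class ToyFeatureClass(ToyDataset):
    """Made-up child class that inherits from ToyDataset."""

    def __init__(self, spatialReference, shapeType):
        # Let the parent class set up the inherited property
        ToyDataset.__init__(self, spatialReference)
        # Add a property that only the child class has
        self.shapeType = shapeType


states = ToyFeatureClass("GCS_North_American_1983", "Polygon")

# Inherited from ToyDataset
print(states.spatialReference)
# Defined on ToyFeatureClass itself
print(states.shapeType)
```

The child class never defines spatialReference itself, yet the property is available on the states object because it is inherited from the parent class.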
Every programming language has rules about capitalization, white space, how to set apart lines of code and procedures, and so on. Here are some basic syntax rules to remember for Python:
Let’s look at a few example scripts to see how these rules are applied. The first example script is accompanied with a walkthrough video that explains what happens in each line of the code. You can also review the main points about each script after reading the code.
This first example script reports the spatial reference (coordinate system) of a feature class stored in a geodatabase:
# Opens a feature class from a geodatabase and prints the spatial reference
import arcpy

featureClass = "C:/Data/USA/USA.gdb/StateBoundaries"

# Describe the feature class and get its spatial reference
desc = arcpy.Describe(featureClass)
spatialRef = desc.spatialReference

# Print the spatial reference name
print spatialRef.Name
This may look intimidating at first, so let’s go through what’s happening in this script, line by line. Watch this video to get a visual walkthrough of the code. You'll notice a typo in the video and in the screenshot below: desc.SpatialReference should be desc.spatialReference, as per the documentation.
Again, notice that:
The best way to get familiar with a new programming language is to look at example code and practice with it yourself. See if you can modify the script above to report the spatial reference of a feature class on your computer. In my example the feature class is in a file geodatabase; you’ll need to modify the structure of the featureClass path if you are using a shapefile (for example, you'll put .shp at the end of the file name, and you won't have .gdb in your path).
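To see the difference between the two kinds of paths, the sketch below compares the geodatabase path from the example with a shapefile path. The shapefile path is hypothetical, and the check uses only Python's standard os module, not ArcGIS.

```python
import os

# Feature class inside a file geodatabase (path from the example script)
gdbFeatureClass = "C:/Data/USA/USA.gdb/StateBoundaries"

# The same data stored as a shapefile (hypothetical path)
shapefile = "C:/Data/USA/StateBoundaries.shp"

# os.path.splitext separates the extension from the rest of the path
gdbExtension = os.path.splitext(gdbFeatureClass)[1]
shpExtension = os.path.splitext(shapefile)[1]

print(repr(gdbExtension))
print(repr(shpExtension))
```

The feature class name inside the geodatabase has no extension of its own, whereas the shapefile path ends in .shp.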
Follow this pattern to try the example:
We'll take a short break and do some reading from another source. If you are new to Python scripting it can be helpful to see the concepts from another point of view.
Here’s another simple script that finds all cells over 3500 meters in an elevation raster and makes a new raster that codes all those cells as 1. Remaining values in the new raster are coded as 0. This type of “map algebra” operation is common in site selection and other GIS scenarios.
Something you may not recognize below is the expression Raster(inRaster). This function just tells ArcGIS that it needs to treat your inRaster variable as a raster dataset so that you can perform map algebra on it. If you didn't do this, the script would treat inRaster as just a literal string of characters (the path) instead of a raster dataset.
# This script uses map algebra to find values in an
# elevation raster greater than 3500 (meters).

import arcpy
from arcpy.sa import *

# Specify the input raster
inRaster = "C:/Data/Elevation/foxlake"
cutoffElevation = 3500

# Check out the Spatial Analyst extension
arcpy.CheckOutExtension("Spatial")

# Make a map algebra expression and save the resulting raster
outRaster = Raster(inRaster) > cutoffElevation
outRaster.save("C:/Data/Elevation/foxlake_hi_10")

# Check in the Spatial Analyst extension now that you're done
arcpy.CheckInExtension("Spatial")
Begin by examining this script and trying to figure out as much as you can based on what you remember from the previous scripts you’ve seen.
The main points to remember on this script are:
Now try to run the script yourself using the FoxLake digital elevation model (DEM) in your Lesson 1 data folder. If it doesn’t work the first time, verify that:
You can experiment with this script using different values in the map algebra expression (try 3000 for example).
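If the map algebra expression feels abstract, the cell-by-cell logic can be sketched in plain Python without ArcGIS. The tiny grid of elevation values below is invented for illustration, and this is only an analogy for what Raster(inRaster) > cutoffElevation computes, not a description of how Spatial Analyst works internally.

```python
# A made-up 3 x 3 grid of elevation values in meters
elevations = [
    [3400, 3510, 3600],
    [3200, 3500, 3720],
    [2900, 3100, 3501],
]
cutoffElevation = 3500

# Code each cell as 1 if it is above the cutoff, otherwise 0
coded = []
for row in elevations:
    codedRow = []
    for value in row:
        if value > cutoffElevation:
            codedRow.append(1)
        else:
            codedRow.append(0)
    coded.append(codedRow)

for codedRow in coded:
    print(codedRow)
```

Note that the cell with a value of exactly 3500 is coded as 0, because the expression tests for values greater than the cutoff, not greater than or equal to it.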
Read the sections of Chapter 5 that talk about environment variables and licenses (5.9 & 5.11) which we covered in this part of the lesson.
Think about the previous example where you ran some map algebra on an elevation raster. If you wanted to change the value of your cutoff elevation to 2500 instead of 3500, you would have to open the script itself and change the value of the cutoffElevation variable in the code.
This third example is a little different. Instead of hard-coding the values needed for the tool (in other words, literally including the values in the script) we’ll use some user input variables, or parameters. This allows people to try different values in the script without altering the code itself. Just like in ModelBuilder, parameters make your script available to a wider audience.
The simple example below just runs the Buffer tool, but it allows the user to enter the path of the input and output datasets as well as the distance of the buffer. The user-supplied parameters make their way into the script with the arcpy.GetParameterAsText() method.
Examine the script below carefully, but don't try to run it yet. You'll do that in the next part of the lesson.
# This script runs the Buffer tool. The user supplies the input
# and output paths, and the buffer distance.

import arcpy
arcpy.env.overwriteOutput = True

try:
    # Get the input parameters for the Buffer tool
    inPath = arcpy.GetParameterAsText(0)
    outPath = arcpy.GetParameterAsText(1)
    bufferDistance = arcpy.GetParameterAsText(2)

    # Run the Buffer tool
    arcpy.Buffer_analysis(inPath, outPath, bufferDistance)

    # Report a success message
    arcpy.AddMessage("All done!")

except:
    # Report an error message
    arcpy.AddError("Could not complete the buffer")

    # Report any error messages that the Buffer tool might have generated
    arcpy.AddMessage(arcpy.GetMessages())
Again, examine the above code line by line and figure out as much as you can about what the code does. If necessary, print the code and write notes next to each line. Here are some of the main points to understand:
Read the section of Chapter 5 that talks about working with tool messages (5.10) for another perspective on handling tool output.
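The try / except pattern used in the buffer script is not specific to ArcGIS. As a rough, ArcGIS-free sketch, the lines below try to convert a piece of user-style text input into a number and fall back to an error message if that fails. The variable names and message text are made up, and unlike the script above, this example catches only the ValueError that float() raises rather than every possible error.

```python
# Text input, similar to what a user might type for a buffer distance
distanceText = "five hundred"

try:
    # This line fails because the text is not a valid number
    distance = float(distanceText)
    message = "All done!"
except ValueError:
    # These lines run only when the conversion above raised an error
    distance = None
    message = "Could not read the distance"

print(message)
```

If you change distanceText to "500", the code in the except block is skipped and the success message is printed instead.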
User input variables that you retrieve through GetParameterAsText() make your script very easy to convert into a tool in ArcGIS. A few people know how to alter Python code, a few more can run a Python script and supply user input variables, but almost all ArcGIS users know how to open ArcToolbox and run a tool. To finish off this lesson, we’ll take the previous script and make it into a tool that can easily be run in ArcGIS.
Before you begin this exercise, I strongly recommend that you scan the ArcGIS help topic Adding a script tool. You likely will not understand all the parts of this topic yet, but it will give you some familiarity with script tools that will be helpful during the exercise.
Follow these steps to make a script tool:
This is a very simple example and obviously you could just run the out-of-the-box Buffer tool with similar results. Normally when you create a script tool, it will be backed with a script that runs a combination of tools and applies some logic that makes those tools uniquely useful.
There’s another benefit to this example, though. Notice the simplicity of our script tool dialog compared to the main Buffer tool:
At some point you may need to design a set of tools for beginning GIS users where only the most necessary parameters are exposed. You may also do this to enforce quality control if you know that some of the parameters must always be set to certain defaults and you want to avoid the scenario where a beginning user (or a rogue user) might change the required values. A simple script tool is effective for simplifying the tool dialog in this way.
Read Zandbergen 2.10 - 2.13 to reinforce what you learned during this lesson about scripts and script tools.
Each lesson in this course includes some simple practice exercises with Python. These are not submitted or graded, but they are highly recommended if you are new to programming or if the project initially looks challenging. Lessons 1 and 2 contain shorter exercises, while Lessons 3 and 4 contain longer, more holistic exercises. Each practice exercise has an accompanying solution that you should carefully study.
Remember to choose File > New in PythonWin to create a new script (or click the empty page icon). You can name the scripts something like Practice1, Practice2, etc. To execute a script in PythonWin, click the "running man" icon.
Suppose you're working on a project for the Nebraska Department of Agriculture and you are tasked with making some maps of precipitation in the state. Members of the department want to see which parts of the state were relatively dry and wet in the past year, classified in zones. All you have is a series of weather station readings of cumulative rainfall for 2008 that you've obtained from within Nebraska and surrounding areas. This is a shapefile of points called Precip2008Readings.shp. It is in your Lesson 1 data folder.
Precip2008Readings.shp is a fictional dataset created for this project. The locations do not correspond to actual weather stations. However, the measurements are derived from real 2008 precipitation data created by the PRISM Climate Group at Oregon State University, 2009.
You need to do several tasks in order to get this data ready for mapping:
It's very possible that you'll want to repeat the above process in order to test different IDW interpolation parameters or make similar maps with other datasets (such as next year's precipitation data). Therefore, the above series of tasks is well-suited to ModelBuilder. Your job is to create a model that can complete the above series of steps without you having to manually open four different tools.
Your model should have these (and only these) parameters:
As you build your model, you will need to configure some settings that will not be exposed as parameters. These include the clip feature, which is the state of Nebraska outline Nebraska.shp in your Lesson 1 data folder. There are many other settings such as "Z Value field" and "Input barrier polyline features" (for IDW) or "Reclass field" (for Reclassify) that should not be exposed as parameters. You should just set these values once when you build your model. If you ever ask someone else to run this model, you don't want them to be overwhelmed with choices stemming from every tool in the model; you should just expose the essential things they might want to change.
For this particular model, you should assume that any input dataset will conform to the same schema as your Precip2008Readings.shp feature class. For example, an analyst should be able to submit a similar Precip2009Readings dataset with the same fields, field names, and data types. However, he or she should not expect to provide any feature class with a different set of fields and field names, etc. As you might discover, handling all types of feature class schemas would make your model more complex than we want for this assignment.
When you double-click the model to run it, the interface should look like the following:
Running the model with the exact parameters listed above should result in the following (I have symbolized the zones in ArcMap with different colors to help distinguish them). This is one way you can check your work:
The deliverables for this project are:
Successful delivery of the above requirements is sufficient to earn 90% on the project. The remaining 10% is reserved for efforts that go "over and above" the minimum requirements. This could include (but is not limited to) meaningful labels on and around model elements, analysis of how different input values affect the output, substitution of some other interpolation method instead of IDW (for example Kriging), documentation for your model parameters that appears in the side-panel help, or demonstration of how your model was successfully run on a different input dataset. As a general rule throughout the course, full credit in the "over and above" category requires the implementation of 2-4 different ideas, with more complex ideas earning more credit.
The following tips may help you as you build your model:
The second part of Project 1 will help you get some practice with Python. At the end of Lesson 1, you saw three simple scripting examples; now your task is to write your own script. This script will create vector contour lines from a raster elevation dataset. Don't forget that the ArcGIS Desktop Help can indeed be helpful if you need to figure out the syntax for a particular command.
Earlier in the lesson you were introduced to the Fox Lake DEM in your Lesson 1 data folder. It represents elevation in the Fox Lake Quadrangle, Utah. Write a script that uses the Contour tool in the Spatial Analyst toolbox to create contour lines for the quadrangle. The contour interval should be 25 meters and the base contour should be 0. Remember that the native units of the DEM are meters, so no unit conversions are required.
Running the script should immediately create a shapefile of contour lines on disk.
Follow these guidelines when writing the script:
The deliverables for Project 1, Part II are:
To complete Lesson 1, please zip all your Project 1 deliverables (for parts I and II) into one file and submit them to the Project 1 Drop Box in Canvas. Then take the Lesson 1 Quiz if you haven't taken it already.