To warm up, let’s briefly revisit a few Python features that you already know but that have forms or details you may not have seen yet, starting with the Python “import” command. Along the way, we will also introduce a few Python constructs that may be new to you.
We highly recommend that you try out these examples yourself and experiment with them to get a better understanding. The examples work in both Python 2 and Python 3, so you can use any Python installation and IDE that you have on your computer. If you are not sure what to use, you can look ahead at the part of Section 1.5 [1] about getting Spyder, a Python 3 IDE for ArcGIS Pro, up and running, and then come back to this section.
The form of the “import” command that you definitely should already know is
import <module name>
e.g.,
import arcpy
What happens here is that the module (either a module from the standard library, a module that is part of another package you installed, or simply another .py file in your project directory) is loaded, unless it has already been loaded before, and the name of the module becomes part of the namespace of the script that contains the import command. As a result, you can now access all variables, functions, or classes defined in the imported module, by writing
<module name>.<variable or function name>
e.g.,
arcpy.Describe(…)
You can also use the import command like this instead:
import arcpy as ap
This form introduces a new alias for the module name, typically to save some typing when the module name is rather long. Instead of writing
arcpy.Describe(…)
you would now use
ap.Describe(…)
in your code.
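The alias form can be tried out without ArcGIS installed. The following minimal sketch uses the standard library module math as a stand-in for arcpy (the alias m is chosen only for illustration) and shows that the alias and the full name refer to the same module:

```python
import math          # standard-library module used here as a stand-in for arcpy
import math as m     # the module is already loaded, so this only adds the alias m

# Both names refer to the same module object
same_module = m is math

# The alias works as a prefix exactly like the full module name
root_full = math.sqrt(16)
root_alias = m.sqrt(16)

print(same_module)             # True
print(root_full, root_alias)   # 4.0 4.0
```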
Another way of using “import” is to add content of a module (again variables, functions, or classes) directly to the namespace of the importing Python script. This is done by using the form "from … import …" as in the following example:
from arcpy import Describe, Point, …
...
Describe(…)
The difference is that you can now use the imported names directly in your code without the module name (or an alias) as a prefix, as in the call Describe(…) at the end of the example code. However, be aware that if you are importing from multiple modules, this can easily lead to name conflicts if, for instance, two modules contain functions with the same name. It can also make your code a little more difficult to read since
arcpy.Describe(...)
helps you or another programmer recognize that you’re using something defined in arcpy and not in another library or the main code of your script.
You can also use
from arcpy import *
to import all variable, function and class names from a module into the namespace of your script if you don’t want to list all those you actually need. However, this can increase the likelihood of a name conflict.
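A name conflict of this kind can be reproduced with two standard library modules. The sketch below is only an illustration: math and cmath both define a function called sqrt, and the second "from … import …" silently replaces the first.

```python
from math import sqrt    # real-valued square root
from cmath import sqrt   # complex square root; silently replaces the name sqrt

# sqrt now refers to cmath.sqrt, so the result is a complex number
shadowed_result = sqrt(4)
print(shadowed_result)   # (2+0j)

# Using the module name as a prefix keeps both functions accessible
import math
import cmath

real_result = math.sqrt(4)
complex_result = cmath.sqrt(-4)
print(real_result)       # 2.0
print(complex_result)    # 2j
```

Python reports no error or warning in this situation, which is why such conflicts can be hard to track down.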
Next, let’s quickly revisit loops in Python. There are two kinds of loops in Python, the for-loop and the while-loop. You should know that the for-loop is typically used when the goal is to go through a given set or list of items or do something a certain number of times. In the first case, the for-loop typically looks like this
for item in list:
    # do something with item
while in the second case, the for-loop is often used together with the range(…) function to determine how often the loop body should be executed:
for i in range(50):
    # do something 50 times
In contrast, the while-loop has a condition that is checked before each iteration and if the condition becomes False, the loop is terminated and the code execution continues after the loop body. With this knowledge, it should be pretty clear what the following code example does:
import random

r = random.randrange(100)  # produce random number between 0 and 99
attempts = 1
while r != 11:
    attempts += 1
    r = random.randrange(100)

print('This took ' + str(attempts) + ' attempts')
What you may not yet know is that there are two additional commands, break and continue, that can be used in combination with either a for-loop or a while-loop. The break command immediately terminates the execution of the current loop, and the program continues with the code after it. If the loop is nested inside another loop, only the innermost loop containing the break is terminated. This means we can rewrite the program from above using a for-loop rather than a while-loop like this:
import random

attempts = 0
for i in range(1000):
    r = random.randrange(100)
    attempts += 1
    if r == 11:
        break  # terminate loop and continue after it

print('This took ' + str(attempts) + ' attempts')
When the random number produced in the loop body is 11, the body of the if-statement, so the break command, will be executed and the program execution immediately leaves the loop and continues with the print statement after it. Obviously, this version is not completely identical to the while based version from above because the loop will be executed at most 1000 times here.
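The remark that break only ends the innermost loop can be checked with a small nested loop. This is a minimal sketch; the names row, col, and pairs are chosen only for illustration.

```python
pairs = []
for row in range(3):           # outer loop: runs for row = 0, 1, 2
    for col in range(5):       # inner loop
        if col == 2:
            break              # leaves only the inner loop
        pairs.append((row, col))
    # after the break, execution continues here, still inside the outer loop

print(pairs)
# [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
```

The outer loop still runs three times; each time, the inner loop is cut short as soon as col reaches 2.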
If you have experience with programming languages other than Python, you may know that some languages have a "do … while" loop construct where the condition is only tested after each time the loop body has been executed so that the loop body is always executed at least once. Since we first need to create a random number before the condition can be tested, this example would actually be a little bit shorter and clearer using a do-while loop. Python does not have a do-while loop but it can be simulated using a combination of while and break:
import random

attempts = 0
while True:
    r = random.randrange(100)
    attempts += 1
    if r == 11:
        break

print('This took ' + str(attempts) + ' attempts')
A while-loop with the condition True will in principle run forever. However, since we have the if-statement with the break, the execution will be terminated as soon as the random number generator rolls an 11. While this code is not shorter than the previous while-based version, we are only creating random numbers in one place, so it can be considered a little clearer.
When a continue command is encountered within the body of a loop, the current execution of the loop body is also immediately stopped, but in contrast to the break command, the execution then continues with the next iteration of the loop body. Of course, the next iteration is only started if, in the case of a while-loop, the condition is still true, or in the case of a for-loop, there are still remaining items in the list that we are looping through. The following code goes through a list of numbers and prints out only those numbers that are divisible by 3 (without remainder).
l = [3, 7, 99, 54, 3, 11, 123, 444]

for n in l:
    if n % 3 != 0:  # test whether n is not divisible by 3 without remainder
        continue
    print(n)
This code uses the modulo operator % in the condition of the if-statement to get the remainder of the division of n by 3. If this remainder is not 0, the continue command is executed and, as a result, the program execution directly jumps back to the beginning of the loop and continues with the next number. If the condition is False (meaning the number is divisible by 3), the execution continues as normal after the if-statement and prints out the number. Hopefully, it is immediately clear that the same could have been achieved by changing the condition from != to == and having an if-block with just the print statement, so this is really just a toy example illustrating how continue works.
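The equivalence of the two formulations can be verified directly. In the following sketch, the results are collected in two lists (with_continue and without_continue, names chosen only for illustration) instead of being printed, so that they can be compared:

```python
numbers = [3, 7, 99, 54, 3, 11, 123, 444]

# Version with continue: skip numbers that are not divisible by 3
with_continue = []
for n in numbers:
    if n % 3 != 0:
        continue
    with_continue.append(n)

# Version without continue: condition changed from != to ==
without_continue = []
for n in numbers:
    if n % 3 == 0:
        without_continue.append(n)

print(with_continue)                      # [3, 99, 54, 3, 123, 444]
print(with_continue == without_continue)  # True
```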
As you saw in these few examples, there are often multiple ways in which for, while, break, continue, and if-else can be combined to achieve the same thing. While break and continue can be useful commands, they can also make code more difficult to read and understand. Therefore, they should be used sparingly and only when they lead to a simpler and more comprehensible code structure than a combination of for/while and if-else would.
You are already familiar with Python binary operators that can be used to define arbitrarily complex expressions. For instance, you can use arithmetic expressions that evaluate to a number, or boolean expressions that evaluate to either True or False. Here is an example of an arithmetic expression using the arithmetic operators - and *:
x = 25 - 2 * 3
Each binary operator takes two operand values of a particular type (all numbers in this example) and replaces them by a new value calculated from the operands. All Python operators are organized into different precedence classes, determining in which order the operators are applied when the expression is evaluated unless parentheses are used to explicitly change the order of evaluation. This operator precedence table [2] shows the classes from lowest to highest precedence. The operator * for multiplication has a higher precedence than the - operator for subtraction, so the multiplication will be performed first and the result of the overall expression assigned to variable x is 19.
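The effect of precedence, and of parentheses overriding it, can be seen by evaluating the expression both ways. The variable names without_parens and with_parens are used only for this illustration.

```python
without_parens = 25 - 2 * 3     # multiplication first: 25 - 6
with_parens = (25 - 2) * 3      # parentheses force the subtraction first: 23 * 3

print(without_parens)  # 19
print(with_parens)     # 69
```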
Here is an example for a boolean expression:
x = y > 12 and z == 3
The boolean expression on the right side of the assignment operator contains three binary operators: two comparison operators, > and ==, that take two numbers and return a boolean value, and the logical ‘and’ operator that takes two boolean values and returns a new boolean (True only if both input values are True, False otherwise). The precedence of ‘and’ is lower than that of the two comparison operators, so the ‘and’ will be evaluated last. So if y has the value 6 and z the value 3, the value assigned to variable x by this expression will be False because the comparison on the left side of the ‘and’ evaluates to False.
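The evaluation described above can be reproduced with the sample values y = 6 and z = 3. The second assignment adds parentheses only to make the order of evaluation explicit; they do not change the result.

```python
y = 6
z = 3

x = y > 12 and z == 3                # comparisons are evaluated before 'and'
x_explicit = (y > 12) and (z == 3)   # same expression with explicit grouping

print(y > 12)      # False
print(z == 3)      # True
print(x)           # False
print(x_explicit)  # False
```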
In addition to all these binary operators, Python has a ternary operator, so an operator that takes three operands as input. This operator has the format
x if c else y
x, y, and c here are the three operands while ‘if’ and ‘else’ are the keywords making up the operator and demarcating the operands. While x and y can be values or expressions of arbitrary type, the condition c is typically a boolean value or expression (Python tests whatever value c produces for truth). The operator evaluates the condition c: if c is True, the whole expression evaluates to x, else it evaluates to y. So for example in the following line of code
p = 1 if x > 12 else 0
variable p will be assigned the value 1 if x is larger than 12, else p will be assigned the value 0. Obviously what the ternary if-else operator does is very similar to what we can do with an if or if-else statement. For instance, we could have written the previous code as
p = 0
if x > 12:
    p = 1
The “x if c else y” operator is an example of a language construct that does not add anything fundamentally new to the language but enables writing things more compactly or more elegantly. That’s why such constructs are often called syntactic sugar. The nice thing about “x if c else y” is that, in contrast to the if-else statement, it is an operator that evaluates to a value and, hence, can be embedded directly within more complex expressions, as in the following example that uses the operator twice:
newValue = 25 + (10 if oldValue < 20 else 44) / 100 + (5 if useOffset else 0)
Using an if-else statement for this expression would have required at least five lines of code.
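For comparison, here is one way the same computation could be written out with if-else statements. The sample values for oldValue and useOffset and the helper variables fraction, offset, and expandedValue are assumptions made only for this sketch.

```python
oldValue = 15       # sample inputs, chosen only for illustration
useOffset = True

# Compact version using the ternary operator twice
newValue = 25 + (10 if oldValue < 20 else 44) / 100 + (5 if useOffset else 0)

# Equivalent version using if-else statements
if oldValue < 20:
    fraction = 10
else:
    fraction = 44

if useOffset:
    offset = 5
else:
    offset = 0

expandedValue = 25 + fraction / 100 + offset

print(newValue == expandedValue)  # True
```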
In GEOG 485, we used the + operator for string concatenation to produce strings from multiple components to then print them out or use them in some other way, as in the following two examples:
print('The feature class contains ' + str(n) + ' point features.')
queryString = '"' + fieldName + '" = ' + "'" + countryName + "'"
An alternative to this approach using string concatenation is to use the string method format(…). When this method is invoked for a particular string, the string content is interpreted as a template in which parts surrounded by curly brackets {…} should be replaced by the variables given as parameters to the method. Here is how the two examples from above would look in this approach:
print('The feature class contains {0} point features.'.format(n))
queryString = '"{0}" = \'{1}\''.format(fieldName, countryName)
In both examples, we have a string literal '….' and then directly call the format(…) method for this string literal to give us a new string in which the occurrences of {…} have been replaced. In the simple form {i} used here, each occurrence of this pattern will be replaced by the i-th parameter given to format(…), with counting starting at 0. In the second example, {0} will be replaced by the value of variable fieldName and {1} will be replaced by the value of variable countryName. Please note that the second example also uses \' to produce the single quotes so that the entire template can be written as a single string. The numbers within the curly brackets can also be omitted if the parameters should be inserted into the string in the order in which they appear.
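The following sketch compares the numbered and unnumbered forms and shows that numbered placeholders can also be reordered. The values assigned to fieldName and countryName are made up for illustration, and omitting the numbers requires Python 2.7 or Python 3.1 and later.

```python
fieldName = 'NAME'        # hypothetical sample values
countryName = 'Chile'

# Numbered placeholders
numbered = '"{0}" = \'{1}\''.format(fieldName, countryName)

# Numbers omitted: parameters are inserted in the order in which they appear
unnumbered = '"{}" = \'{}\''.format(fieldName, countryName)

# Numbered placeholders can be used in a different order than the parameters
swapped = '{1} is stored in field {0}'.format(fieldName, countryName)

print(numbered)     # "NAME" = 'Chile'
print(unnumbered)   # "NAME" = 'Chile'
print(swapped)      # Chile is stored in field NAME
```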
The main advantages of using format(…) are that the string can be a bit easier to produce and read, in particular in the second example, and that we don’t have to explicitly convert all non-string variables to strings with str(…). In addition, format allows us to include information about how the values of the variables should be formatted. By using {i:n}, we say that the value of the i-th parameter should be padded to a width of n characters if it is shorter than that. For strings, this will by default be done by adding spaces after the actual string content, while for numbers, spaces will be added before the actual string representation of the number. In addition, for numbers, we can also specify the number d of decimal digits that should be displayed by using the pattern {i:n.df}. The following example shows how this can be used to produce some well-formatted list output:
items = [('Maple trees', 45.232), ('Pine trees', 30.213), ('Oak trees', 24.331)]

for i in items:
    print('{0:20} {1:3.2f}%'.format(i[0], i[1]))
Output:
Maple trees          45.23%
Pine trees           30.21%
Oak trees            24.33%
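The pattern {0:20} pads the names of the tree species with spaces to a width of 20 characters. In the pattern {1:3.2f}, the .2f causes the percentage to be displayed with two digits after the decimal point, while the 3 specifies the minimum total width of the field, including the decimal point and the decimals. Since every number in this list already takes up five characters, the width of 3 has no visible effect here; the numbers line up because they all have two digits before the decimal point. To align numbers with different numbers of digits before the decimal point, a larger width such as {1:6.2f} would be needed.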
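Note that the number before the dot in a pattern such as {1:6.2f} is the minimum total width of the field, counting the decimal point and the decimals. The following sketch uses made-up values with one, two, and three digits before the decimal point to show how a width of 6 keeps the column aligned; the names values and lines are used only for illustration.

```python
values = [('Maple trees', 45.232), ('Fir trees', 5.5), ('All trees', 100.0)]

lines = []
for name, pct in values:
    # name padded to 20 characters; number right-aligned in 6 characters, 2 decimals
    lines.append('{0:20} {1:6.2f}%'.format(name, pct))

for line in lines:
    print(line)
# Maple trees           45.23%
# Fir trees              5.50%
# All trees            100.00%

# A width smaller than the formatted number adds no padding
narrow = '{0:3.2f}'.format(45.232)
print(narrow)  # 45.23
```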
The format method can do a few more things, but we are not going to go into further details here. Check out this page about formatted output [3] if you would like to learn more about this.