GEOG 489
Advanced Python Programming for GIS

2.2 List comprehension


Like the first lesson, we are going to start Lesson 2 with a bit of Python theory. From mathematics, you probably are familiar with the elegant way of defining sets based on other sets using a compact notation as in the example below:

M = { 1, 5, 9, 27, 31 }
N = { x² | x ∈ M ∧ x > 11 }

What is being said here is that the set N should contain the squares of all numbers in set M that are larger than 11. The notation uses { … } to indicate that we are defining a set, then an expression that describes the elements of the set based on some variable (x²), followed by a set of criteria specifying the values that this variable (x) can take (x ∈ M and x > 11).

This kind of compact notation has been adopted by Python for defining lists and it is called list comprehension. A list comprehension has the general form

[<new value expression using variable> for <variable> in <list> if <condition for variable>]

The fixed parts are the square brackets and the keywords for, in, and if, while the parts that need to be replaced by expressions using some variable are put into angle brackets <..>. The if and the condition following it are optional. To give a first example, here is how this notation can be used to create a list containing the squares of the numbers from 1 to 10:

squares = [ x**2 for x in range(1,11) ] 
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100] 

In case you haven’t seen this before, ** is the Python exponentiation operator: a**b means a to the power of b.
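To connect this back to the set notation from the beginning of the section, here is a minimal sketch of how the definition of N could be written as a list comprehension. It assumes that we store M as a Python list, and it already uses the optional if condition from the general form above.

```python
# The set M from the mathematical example, stored here as a Python list
M = [1, 5, 9, 27, 31]

# Squares of all elements of M that are larger than 11
N = [x**2 for x in M if x > 11]

print(N)  # [729, 961]
```

Only 27 and 31 are larger than 11, so the resulting list contains just their squares.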

When Python evaluates the list comprehension [ x**2 for x in range(1,11) ], it goes through the numbers produced by range(1,11), that is, the numbers from 1 to 10, and evaluates the expression x**2 with each of these numbers assigned to the variable x. The results are collected to form the new list produced by the entire list comprehension. We can easily extend this example to only include the squares of numbers that are even:

evenNumbersSquared = [ x**2 for x in range(1,11) if x % 2 == 0 ] 
[4, 16, 36, 64, 100]

This example makes use of the optional if condition to make sure that the new value expression is only evaluated for certain elements from the original list, namely those for which the remainder of the division by 2, computed with the Python modulo operator %, is zero. To show that this works not only with numbers, here is an example in which we use list comprehension to simply reduce a list of names to those names that start with the letter ‘M’ or the letter ‘N’:

names = [ 'Monica', 'John', 'Anne', 'Mike', 'Nancy', 'Peter', 'Frank', 'Mary' ] 
namesFiltered = [ n for n in names if n.startswith('M') or n.startswith('N') ] 
['Monica', 'Mike', 'Nancy', 'Mary']

This time, the original list is defined before the actual list comprehension rather than inside it as in the previous examples. We are also using a different variable name here (n) so that you can see that you can choose any name, but, of course, you need to use that variable name consistently in the new value expression, directly after the for, and in the condition following the if. The new value expression is simply n because we want to keep those elements from the original list that satisfy the condition unchanged. In the if condition, we call the string method startswith(…) twice and connect the two calls with the logical or operator to check whether the respective name starts with the letter ‘M’ or the letter ‘N’.
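If it helps to see what the list comprehension for the names does step by step, the following sketch spells out the same filtering with an ordinary for-loop. The variable names namesFilteredLoop and namesFilteredTuple are only chosen for this illustration. The last line shows a slightly shorter variant that relies on the fact that startswith(…) also accepts a tuple of prefixes.

```python
names = ['Monica', 'John', 'Anne', 'Mike', 'Nancy', 'Peter', 'Frank', 'Mary']

# Loop version: start with an empty list and append every name that passes the test
namesFilteredLoop = []
for n in names:
    if n.startswith('M') or n.startswith('N'):
        namesFilteredLoop.append(n)

# Comprehension version using a tuple of prefixes instead of two calls joined by or
namesFilteredTuple = [n for n in names if n.startswith(('M', 'N'))]

print(namesFilteredLoop)   # ['Monica', 'Mike', 'Nancy', 'Mary']
print(namesFilteredTuple)  # ['Monica', 'Mike', 'Nancy', 'Mary']
```

Both versions produce the same list; the comprehension just expresses the loop, the test, and the append in a single line.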

By now, you are surely getting the general idea of how list comprehension provides a compact and elegant way to produce new lists from other lists by (a) applying the same operation to the elements of the original list and (b) optionally using a condition to filter the elements from the original list before this happens. The new value expression can be arbitrarily complex, involving multiple operators as well as function calls. It is also possible to use several variables, either with each variable having its own list to iterate through, which corresponds to nested for-loops, or with a list of tuples as in the following example:

pairs = [ (21,23), (12,3), (3,11) ] 
sums = [ x + y for x,y in pairs ] 

[44, 15, 14] 

With “for x,y in pairs” we go through the list of pairs and, for each pair, x is assigned the first element of that pair and y the second element. These two values are then added together by the expression x + y and the result becomes part of the new list. We often find this form of a list comprehension used together with the built-in zip(…) function, which takes two (or more) lists as parameters and combines their corresponding elements into pairs. Let’s say we want to create a list that consists of the pairwise sums of corresponding elements from two input lists. We can do that as follows:

list1 = [ 1, 4, 32, 11 ] 
list2 = [ 3, 2, 1, 99 ] 

sums = [ x + y for x,y in zip(list1,list2) ] 
[4, 6, 33, 110]

The expression zip(list1,list2) will produce the pairs (1,3), (4,2), (32,1), (11,99) from the two input lists (in Python 3 as an iterator rather than as an actual list), and the rest works in the same way as in the previous example.
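The other variant mentioned above, in which each variable iterates through its own list like in nested for-loops, can be sketched as follows. The lists letters and numbers are made up for this illustration. The second part shows a detail of zip(…) that is worth keeping in mind: if the input lists differ in length, zip(…) stops at the end of the shorter one.

```python
letters = ['a', 'b']
numbers = [1, 2, 3]

# Two for clauses behave like nested loops: the first one is the outer loop
combos = [(letter, number) for letter in letters for number in numbers]
print(combos)
# [('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)]

# zip pairs up elements by position and stops at the shorter input
shortPairs = list(zip([1, 4, 32, 11], [3, 2]))
print(shortPairs)  # [(1, 3), (4, 2)]
```

Note the difference between the two: the nested version produces every combination of the two lists, while zip(…) only combines elements that are at the same position.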

Most of the examples of list comprehensions that you will encounter in the rest of this course will be rather simple and similar to the examples you saw in this section. We will also practice writing list comprehensions a bit in the practice exercises of this lesson. If you'd like to read more about them and see further examples, there are a lot of good tutorials and blogs out there on the web if you search for Python + list comprehension, like this List Comprehensions in Python page, for example. As a last comment, we focused on list comprehension in this section, but the same technique can also be applied to other Python containers such as dictionaries and sets. If you want to see some examples, check out the section on "Dictionary Comprehension" in this article here.
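To give a first impression of what this looks like for other containers, here is a small sketch with a dictionary comprehension and a set comprehension. The variable nameLengths is made up for this illustration; the sets M and N are the ones from the mathematical example at the beginning of this section.

```python
names = ['Monica', 'John', 'Anne']

# Dictionary comprehension: curly braces and a key: value expression
nameLengths = {n: len(n) for n in names}
print(nameLengths)  # {'Monica': 6, 'John': 4, 'Anne': 4}

# Set comprehension: curly braces and a single value expression
M = {1, 5, 9, 27, 31}
N = {x**2 for x in M if x > 11}
print(N == {729, 961})  # True
```

Since sets are unordered, the example compares N with the expected set rather than relying on the order in which the elements would be printed.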