GEOG 489
Advanced Python Programming for GIS

4.2.2 Sorting

PrintPrint

One common operation on collections is sorting the elements in a collection. Python provides a function sorted(…) to sort the elements in a collection that allows for iterating over the elements (e.g. a list, tuple, or dictionary). The result is always a list. Here are two examples:

l1 = [9,3,5,1,-2]
print(sorted(l1))

l2 = ("Maria", "Frank", "Sam", "Mike")
print(sorted(l2))
Output: 
[-2, 1, 3, 6, 9] 
['Frank', 'Maria', 'Mike', 'Sam'] 

If our collection is a list, we can also use the list method sort() instead of the function sorted(…), e.g. l1.sort() instead of sorted(l1) . Both work exactly the same.

sorted(…) and sort() by default sort the elements in ascending order based on the < comparison operator. This means numbers are sorted in increasing order and strings are sorted in lexicographical order. When we define our own classes (Section 4.6) and want to be able to sort objects of a class based on their properties, we have to define the < operator in a suitable way in the class definition.

The keyword argument ‘reverse’ can be used to sort the elements in descending order instead:

print( sorted(l2, reverse = True) )
Output:

['Sam', 'Mike', 'Maria', 'Frank']

In addition, we can use the keyword argument ‘key’ to provide a function that will be applied to the elements and they will then be sorted based on the values returned by this function. For instance, the following example uses a lambda expression for the ‘key’ parameter to sort the names from l2 based on their length (in descending order) rather than based on their lexicographical order:

print( sorted(l2, reverse = True, key = lambda x: len(x)) )
Output:

['Maria', 'Frank', 'Mike', 'Sam']

Sorting can be a somewhat time-consuming operation for larger collections. Therefore, if you mainly need to access the elements in a collection in a specific order based on their properties (e.g. always the element with the lowest value for a certain attribute), it is advantageous to use a specialized data structure that keeps the collection sorted whenever an element is added or removed. This can save a lot of time compared to frequently re-sorting the collection. An example of such a data structure is the so-called priority queue or heap. The heapq module of the Python standard library implements an algorithm for realizing a priority queue in a Python list and we are going to discuss it in the final part of this section.