GEOG 489
Advanced Python Programming for GIS

1.4.3 Local vs. global, mutable vs. immutable

PrintPrint

When making the transition from a beginner to an intermediate or advanced Python programmer, it also gets important to understand the intricacies of variables used within functions and of passing parameters to functions in detail. First of all, we can distinguish between global and local variables within a Python script. Global variables are defined outside of any function. They can be accessed from anywhere in the script and they exist and keep their values as long as the script is loaded which typically means as long as the Python interpreter into which they are loaded is running.

In contrast, local variables are defined inside a function and can only be accessed in the body of that function. Furthermore, when the body of the function has been executed, its local variables will be discarded and cannot be used anymore to access their current values. A local variable is either a parameter of that function, in which case it is assigned a value immediately when the function is called, or it is introduced in the function body by making an assignment to the name for the first time.

Here are a few examples to illustrate the concepts of global and local variables and how to use them in Python.

def doSomething(x):      # parameter x is a local variable of the function

     count = 1000 * x    # local variable count is introduced

     return count 

 

y = 10            # global variable y is introduced 

print(doSomething(y)) 

print(count)      # this will result in an error 

print(x)          # this will also result in an error

This example introduces one global variable, y, and two local variables, x and count, both part of the function doSomething(…). x is a parameter of the function, while count is introduced in the body of the function in line 3. When this function is called in line 11, the local variable x is created and assigned the value that is currently stored in global variable y, so the integer number 10. Then the body of the function is executed. In line 3, an assignment is made to variable count. Since this variable hasn’t been introduced in the function body before, a new local variable will now be created and assigned the value 10000. After executing the return statement in line 5, both x and count will be discarded. Hence, the two print statements at the end of the code would lead to errors because they try to access variables that do not exist anymore.

Now let’s change the example to the following:

def doSomething(): 

     count = 1000 * y    # global variable y is accessed here

     return count 

 

y = 10          

print(doSomething()) 

This example shows that global variable y can also be directly accessed from within the function doSomething(): When Python encounters a variable name that is neither the name of a parameter of that function nor has been introduced via an assignment previously in the body of that function, it will look for that variable among the global variables. However, the first version using a parameter instead is usually preferable because then the code in the function doesn’t depend on how you name and use variables outside of it. That makes it much easier to, for instance, re-use the same function in different projects.

So maybe you are wondering whether it is also possible to change the value of a global variable from within a function, not just read its value? One attempt to achieve this could be the following:

 def doSomething(): 

     count = 1000  

     y = 5 

     return count * y 

 

y = 10 

print(doSomething()) 

print(y)      # output will still be 10 here

However, if you run the code, you will see that last line still produces the output 10, so the global variable y hasn't been changed by the assignment in line 5. That is because the rule is that if a name is encountered on the left side of an assignment in a function, it will be considered a local variable. Since this is the first time an assignment to y is made in the body of the function, a new local variable with that name is created at that point that will overshadow the global variable with the same name until the end of the function has been reached. Instead, you explicitly have to tell Python that a variable name should be interpreted as the name of a global variable by using the keyword ‘global’, like this:

def doSomething(): 

     count = 1000 

     global y      # tells Python to treat y as the name of global variable

     y = 5         # as a result, global variable y is assigned a new value here

     return count * y 

 

y = 10 

print(doSomething()) 

print(y)       # output will now be 5 here

In line 5, we are telling Python that y in this function should refer to the global variable y. As a result, the assignment in line 7 changes the value of the global variable called y and the output of the last line will be 5. While it's good to know how these things work in Python, we again want to emphasize that accessing global variables from within functions should be avoided as much as possible. Passing values via parameters and returning values is usually preferable because it keeps different parts of the code as independent of each other as possible.

So after talking about global vs. local variables, what is the issue with mutable vs. immutable mentioned in the heading? There is an important difference in passing values to a function depending on whether the value is from a mutable or immutable data type. All values of primitive data types like numbers and boolean values in Python are immutable, meaning you cannot change any part of them. On the other hand, we have mutable data types like lists and dictionaries for which it is possible to change their parts: You can, for instance, change one of the elements in a list or what is stored under a particular key in a given dictionary without creating a completely new object.

What about strings and tuples? You may think these are mutable objects, but they are actually immutable. While you can access a single character from a string or element from a tuple, you will get an error message if you try to change it by using it on the left side of the equal sign in an assignment. Moreover, when you use a string method like replace(…) to replace all occurrences of a character by another one, the method cannot change the string object in memory for which it was called but has to construct a new string object and return that to the caller.

Why is that important to know in the context of writing functions? Because mutable and immutable data types are treated differently when provided as a parameter to functions as shown in the following two examples:

def changeIt(x): 

     x = 5   # this does not change the value assigned to y

 
y = 3 

changeIt(y) 

print(y)     # will print out 3

As we already discussed above, the parameter x is treated as a local variable in the function body. We can think of it as being assigned a copy of the value that variable y contains when the function is called. As a result, the value of the global variable y doesn’t change and the output produced by the last line is 3. But it only works like this for immutable objects, like numbers in this case! Let’s do the same thing for a list:

def changeIt(x): 

     x[0] = 5   # this will change the list y refers to

 
y = [3,5,7] 

changeIt(y)     

print(y)        # output will be [5, 5, 7]

The output [5,5,7] produced by the print statement in the last line shows that the assignment in line 3 changed the list object that is stored in global variable y. How is that possible? Well, for values of mutable data types like lists, assigning the value to function parameter x cannot be conceived as creating a copy of that value and, as a result, having the value appear twice in memory. Instead, x is set up to refer to the same list object in memory as y. Therefore, any change made with the help of either variable x or y will change the same list object in memory. When variable x is discarded when the function body has been executed, variable y will still refer to that modified list object. Maybe you have already heard the terms “call-by-value” and “call-by-reference” in the context of assigning values to function parameters in other programming languages. What happens for immutable data types in Python works like “call-by-value,” while what happens to mutable data types works like “call-by-reference.” If you feel like learning more about the details of these concepts, check out this article on Parameter Passing.

While the reasons behind these different mechanisms are very technical and related to efficiency, this means it is actually possible to write functions that take parameters of mutable type as input and modify their content. This is common practice (in particular for class objects which are also mutable) and not generally considered bad style because it is based on function parameters and the code in the function body does not have to know anything about what happens outside of the function. Nevertheless, often returning a new object as the return value of the function rather than changing a mutable parameter is preferable. This brings us to the last part of this section.