Classes

Let’s recapitulate a bit: the underlying perspective of object-oriented programming is that the domain modeled in a program consists of objects belonging to different classes. If your software models some part of the real world, you may have classes for things like buildings, vehicles, trees, etc. and then the objects (also called instances) created from these classes during run-time represent concrete individual buildings, vehicles, or trees with their specific properties. The classes in your software can also describe non real-world and often very abstract things like a feature layer or a random number generator.

Class definitions specify general properties that all objects of that class have in common, together with the things that one can do with these objects. Therefore, they can be considered blueprints for the objects. Each object at any moment during run-time is in a particular state that consists of the concrete values it has for the properties defined in its class. So, for instance, the definition of a very basic class Car may specify that all cars have the properties owner, color, currentSpeed, and lightsOn. During run-time we might then create an object for “Tom’s car” in variable carOfTom with the following values making up its state:

carOfTom.owner = "Tom" 
carOfTom.color = "blue" 
carOfTom.currentSpeed = 48   (mph) 
carOfTom.lightsOn = False

While all objects of the same class have the same properties (also called attributes or fields), their values for these properties may vary and, hence, they can be in different states. The actions that one can perform with a car or things that can happen to a car are described in the form of methods in the class definition. For instance, the class Car may specify that the current speed of cars can be changed to a new value and that lights can be turned on and off. The respective methods may be called changeCurrentSpeed(…), turnLightsOn(), and turnLightsOff(). Methods are like functions but they are explicitly invoked on an object of the class they are defined in. In Python this is done by using the name of the variable that contains the object, followed by a dot, followed by the method name:

carOfTom.changeCurrentSpeed(34) # change state of Tom’s car to current speed being 34mph 

carOfTom.turnLightsOn()# change state of Tom’s car to lights being turned on

The purpose of methods can be to update the state of the object by changing one or several of its properties as in the previous two examples. It can also be to get information about the state of the car, e.g. are the lights turned on? But it can also be something more complicated, e.g. performing a certain driving maneuver or fuel calculation.

In object-oriented programming, a program is perceived as a collection of objects that interact by calling each other’s methods. Object-oriented programming adheres to three main design principles:

Encapsulation: Definitions related to the properties and methods of any class appear in a specification that is encapsulated independently from the rest of the software code and properties are only accessible via a well-defined interface, e.g. via the defined methods.
Inheritance: Classes can be organized hierarchically with new classes being derived from previously defined classes inheriting all the characteristics of the parent class but potentially adding specialized properties or specialized behavior. For instance, our class Car could be derived from a more general class Vehicle adding properties and methods that are specific for cars.
Polymorphism: Inherited classes can change the behavior of methods by overwriting them and the code executed when such a method is invoked for an object then depends on the class of that object.

We will talk more about inheritance and polymorphism soon. All three principles aim at improving reusability and maintainability of software code. These days, most software is created by mainly combining parts that already exist because that saves time and costs and increases reliability when the re-used components have already been thoroughly tested. The idea of classes as encapsulated units within a program increases reusability because these units are then not dependent on other code and can be moved over to a different project much more easily.

For now, let’s look at how our simple class Car can be defined in Python.

 class Car(): 

     def __init__(self): 
          self.owner = 'UNKNOWN' 
          self.color = 'UNKNOWN' 
          self.currentSpeed = 0 
          self.lightsOn = False 

     def changeCurrentSpeed(self,newSpeed): 
          self.currentSpeed = newSpeed 

     def turnLightsOn(self): 
          self.lightsOn = True 

     def turnLightsOff(self): 
          self.lightsOn = False 

     def printInfo(self): 
          print('Car with owner = {0}, color = {1}, currentSpeed = {2}, lightsOn = {3}'.format(self.owner, self.color, self.currentSpeed, self.lightsOn))

Here is an explanation of the different parts of this class definition: each class definition in Python starts with the keyword ‘class’ followed by the name of the class (‘Car’) followed by parentheses that may contain names of classes that this class inherits from, but that’s something we will only see later on. The rest of the class definition is indented to the right relative to this line.

The rest of the class definition consists of definitions of the methods of the class which all look like function definitions but have the keyword ‘self’ as the first parameter, which is an indication that this is a method. The method __init__(…) is a special method called the constructor of the class. It will be called when we create a new object of that class like this:

carOfTom = Car()    # uses the __init__() method of Car to create a new Car object

In the body of the constructor, we create the properties of the class Car. Each line starting with “self.<name of property> = ...“ creates a so-called instance variable for this car object and assigns it an initial value, e.g. zero for the speed. The instance variables describing the state of an object are another type of variable in addition to global and local variables that you already know. They are part of the object and exist as long as that object exists. They can be accessed from within the class definition as “self.<name of the instance variable>” which happens later in the definitions of the other methods, namely in lines 10, 13, 16 and 19. If you want to access an instance variable from outside the class definition, you have to use <name of variable containing the object>.<name of the instance variable>, so, for instance:

print(carOfTom.lightsOn)    # will produce the output False because right now this instance variable still has its default value

The rest of the class definition consists of the methods for performing certain actions with a Car object. You can see that the already mentioned methods for changing the state of the Car object are very simple. They just assign a new value to the respective instance variable, a new speed value that is provided as a parameter in the case of changeCurrentSpeed(…) and a fixed Boolean value in the cases of turnLightsOn() and turnLightsOff(). In addition, we added a method printInfo() that prints out a string with the values of all instance variables to provide us with all information about a car’s current state. Let us now create a new instance of our Car class and then use some of its methods:

carOfSue = Car() 
carOfSue.owner = 'Sue' 
carOfSue.color = 'white' 
carOfSue.changeCurrentSpeed(41) 
carOfSue.turnLightsOn() 
carOfSue.printInfo()

Output

Car with owner = Sue, color = white, currentSpeed = 41, lightsOn = True

Since we did not define any methods to change the owner or color of the car, we are directly accessing these instance variables and assigning new values to them in lines 2 and 3. While this is okay in simple examples like this, it is recommended that you provide so-called getter and setter methods (also called accessor and mutator methods) for all instance variables that you want the user of the class to be able to read (“get”) or change (“set”). The methods allow the class to perform certain checks to make sure that the object always remains in an allowed state. How about you go ahead and for practice create a second car object for your own car (or any car you can think of) in a new variable and then print out its information?

A method can call any other method defined in the same class by using the notation “self.<name of the method>(...)”. For example, we can add the following method randomSpeed() to the definition of class Car:

def setRandomSpeed(self): 
    self.changeCurrentSpeed(random.randint(0,76))

The new method requires the “random” module to be imported at the beginning of the script. The method generates a random number and then uses the previously defined method changeCurrentSpeed(…) to actually change the corresponding instance variable. In this simple example, one could have simply changed the instance variable directly but in more complex cases changes to the state can require more code so that this approach here actually avoids having to repeat that code. Give it a try and add some lines to call this new method for one of the car objects and then print out the info again.

Lesson content developed by Jan Wallgrun and James O’Brien

Inheritance, class hierarchies, and polymorphism

We already mentioned building class hierarchies via inheritance and polymorphism as two main principles of object-oriented programming in addition to encapsulation. To introduce you to these concepts, let us start with another exercise in object-oriented modeling and writing classes in Python. Imagine that you are supposed to write a very basic GIS or vector drawing program that only deals with geometric features of three types: circles, and axis-aligned rectangles and squares. You need the ability to store and manage an arbitrary number of objects of these three kinds and be able to perform simple operations with these objects like computing their area and perimeter and moving the objects to a different position. How would you write the classes for these three kinds of geometric objects?

Let us start with the class Circle: a circle in a two-dimensional coordinate system is typically defined by three values, the x and y coordinates of the center of the circle and its radius. So these should become the properties (= instance variables) of our Circle class and for computing the area and perimeter, we will provide two methods that return the respective values. The method for moving the circle will take the values by how much the circle should be moved along the x and y axes as parameters but not return anything.

import math  

class Circle():  
    def __init__(self, x = 0.0, y = 0.0, radius = 1.0):  
        self.x = x  
        self.y = y  
        self.radius = radius  

    def computeArea(self):  
        return math.pi * self.radius ** 2 

    def computePerimeter (self):  
        return 2 * math.pi * self.radius  

    def move(self, deltaX, deltaY):  
        self.x += deltaX  
        self.y += deltaY 

    def __str__(self):  
        return 'Circle with coordinates {0}, {1} and radius {2}'.format(self.x, self.y, self.radius)

In the constructor, we have keyword arguments with default values for the three properties of a circle and we assign the values provided via these three parameters to the corresponding instance variables of our class. We import the math module of the Python standard library so that we can use the constant math.pi for the computations of the area and perimeter of a circle object based on the instance variables. Finally, we add the __str__() method to produce a string that describes a circle object with its properties. It should by now be clear how to create objects of this class and, for instance, apply the computeArea() and move(…) methods.

circle1 = Circle(10,4,3) 
print(circle1) 
print(circle1.computeArea()) 
circle1.move(3,-1) 
print(circle1)

Output
Circle with coordinates 10, 4 and radius 3 
28.274333882308138 
Circle with coordinates 13, 3 and radius 3

How about a similar class for axis-aligned rectangles? Such rectangles can be described by the x and y coordinates of one of their corners together with width and height values, so four instance variables taking numeric values in total. Here is the resulting class and a brief example of how to use it:

class Rectangle(): 
	def __init__(self, x = 0.0, y = 0.0, width = 1.0, height = 1.0): 
		self.x = x 
		self.y = y 
		self.width = width 
		self.height = height 

    def computeArea(self): 
		return self.width * self.height 

    def computePerimeter (self): 
		return 2 * (self.width + self.height) 

    def move(self, deltaX, deltaY): 
		self.x += deltaX 
		self.y += deltaY 

	def __str__(self): 
		return 'Rectangle with coordinates {0}, {1}, width {2} and height {3}'.format(self.x, self.y, self.width, self.height ) 

rectangle1 = Rectangle(10,10,3,2) 
print(rectangle1) 
print(rectangle1.computeArea()) 
rectangle1.move(2,2) 
print(rectangle1)

Output
Rectangle with coordinates 10, 10, width 3 and height 2 
6 
Rectangle with coordinates 12, 12, width 3 and height 2

There are a few things that can be observed when comparing the two classes Circle and Rectangle we just created: the constructors obviously vary because circles and rectangles need different properties to describe them and, as a result, the calls when creating new objects for the two classes also look different. All the other methods have exactly the same signature, meaning the same parameters and the same kind of return value; just the way they are implemented differs. That means the different calls for performing certain actions with the objects (computing the area, moving the object, printing information about the object) also look exactly the same; it doesn’t matter whether the variable contains an object of class Circle or of class Rectangle. If you compare the two versions of the move(…) method, you will see that these even do not differ in their implementation, they are exactly the same!

This all is a clear indication that we are dealing with two classes of objects that could be seen as different specializations of a more general class for geometric objects. Wouldn’t it be great if we could now write the rest of our toy GIS program managing a set of geometric objects without caring whether an object is a Circle or a Rectangle in the rest of our code? And, moreover, be able to easily add classes for other geometric primitives without making any changes to all the other code, and in their class definitions only describe the things in which they differ from the already defined geometry classes? This is indeed possible by arranging our geometry classes in a class hierarchy starting with an abstract class for geometric objects at the top and deriving child classes for Circle and Rectangle from this class with both adding their specialized properties and behavior. Let’s call the top-level class Geometry. The resulting very simple class hierarchy is shown in the figure below.

Flowchart with "Geometry" leading to "Circle" and "Rectangle."

Figure 4.17 Simple class hierarchy with three classes. Classes Circle and Rectangle are both derived from parent class Geometry.

Inheritance allows the programmer to define a class with general properties and behavior and derive one or more specialized subclasses from it that inherit these properties and behavior but also can modify them to add more specialized properties and realize more specialized behavior. We use the terms derived class and base class to refer to the two classes involved when one class is derived from another.

Lesson content developed by Jan Wallgrun and James O’Brien

Implementing the class hierarchy

Let’s change our example so that both Circle and Rectangle are derived from such a general class called Geometry. This class will be an abstract class in the sense that it is not intended to be used for creating objects from. Its purpose is to introduce properties and templates for methods that all geometric classes in our project have in common.

class Geometry():  

    def __init__(self, x = 0.0, y = 0.0):  
        self.x = x  
        self.y = y  

    def computeArea(self):  
        pass 

    def computePerimeter(self):  
        pass 

    def move(self, deltaX, deltaY):  
        self.x += deltaX  
        self.y += deltaY  

    def __str__(self):  
        return 'Abstract class Geometry should not be instantiated and derived classes should override this method!'

The constructor of class Geometry looks pretty normal, it just initializes the instance variables that all our geometry objects have in common, namely x and y coordinates to describe their location in our 2D coordinate system. This is followed by the definitions of the methods computeArea(), computePerimeter(), move(…), and __str__() that all geometry objects should support. For move(…), we can already provide an implementation because it is entirely based on the x and y instance variables and works in the same way for all geometry objects. That means the derived classes for Circle and Rectangle will not need to provide their own implementation. In contrast, you cannot compute an area or perimeter in a meaningful way just from the position of the object. Therefore, we used the keyword pass to indicate that we are leaving the body of the computeArea() and computePerimeter() methods intentionally empty. These methods will have to be overridden in the definitions of the derived classes with implementations of their specialized behavior. We could have done the same for __str__() but instead we return a warning message that this class should not have been instantiated.

It is worth mentioning that, in many object-oriented programming languages, the concepts of an abstract class (= a class that cannot be instantiated) and an abstract method (= a method that must be overridden in every subclass that can be instantiated) are built into the language. That means there exist special keywords to declare a class or method to be abstract and then it is impossible to create an object of that class or a subclass of it that does not provide an implementation for the abstract methods. In Python, this has been added on top of the language via a module in the standard library called abc [1] (for abstract base classes). Although we won’t be using it in this course, it is a good idea to check it out and use it if you get involved in larger Python projects. This Abstract Classes page [2] is a good source for learning more.

Here is our new definition for class Circle that is now derived from class Geometry. We also use a few commands at the end to create and use a new Circle object of this class to make sure everything is indeed working as before:

import math  

class Circle(Geometry): 

	def __init__(self, x = 0.0, y = 0.0, radius = 1.0): 
		super(Circle,self).__init__(x,y) 
		self.radius = radius 

	def computeArea(self): 
		return math.pi * self.radius ** 2 

	def computePerimeter (self): 
		return 2 * math.pi * self.radius 

	def __str__(self): 
		return 'Circle with coordinates {0}, {1} and radius {2}'.format(self.x, self.y, self.radius) 

circle1 = Circle(10, 10, 10) 
print(circle1.computeArea()) 
print(circle1.computePerimeter()) 
circle1.move(2,2) 
print(circle1)

Here are the things we needed to do in the code:

In line 3, we had to change the header of the class definition to include the name of the base class we are deriving Circle from (‘Geometry’) within the parentheses.
The constructor of Circle takes the same three parameters as before. However, it only initializes the new instance variable radius in line 7. For initializing the other two variables it calls the constructor of its base class, so the class Geometry, in line 6 with the command “super(Circle,self).__init__(x,y)”. This is saying “call the constructor of the base class of class Circle and pass the values of x and y as parameters to it”. It is typically a good idea to call the constructor of the base class as the first command in the constructor of the derived class so that all general initializations are taken care off.
Then we provide definitions of computeArea() and computePerimeter() that are specific for circles. These definitions override the “empty” definitions of the Geometry base class. This means whenever we invoke computeArea() or computePerimeter() for an object of class Circle, the code from these specialized definitions will be executed.
Note that we do not provide any definition for method move(…) in this class definition. That means when move(…) will be invoked for a Circle object, the code from the corresponding definition in its base class Geometry will be executed.
We do override the __str__() method to produce the same kind of string with information about all instance variables that we had in the previous definition. Note that this function accesses both the instance variables defined in the parent class Geometry as well as the additional one added in the definition of Circle.

The new definition of class Rectangle, now derived from Geometry, looks very much the same as that of Circle if you replace “Circle” with “Rectangle”. Only the implementations of the overridden methods look different, using the versions specific for rectangles.

class Rectangle(Geometry): 

	def __init__(self, x = 0.0, y = 0.0, width = 1.0, height = 1.0): 
		super(Rectangle, self).__init__(x,y) 
		self.width = width 
        self.height = height 

	def computeArea(self): 
		return self.width * self.height 

	def computePerimeter (self): 
		return 2 * (self.width + self.height) 

	def __str__(self): 
		return 'Rectangle with coordinates {0}, {1}, width {2} and height {3}'.format(self.x, self.y, self.width, self.height ) 

rectangle1 = Rectangle(15,20,4,5) 
print(rectangle1.computeArea()) 
print(rectangle1.computePerimeter()) 
rectangle1.move(2,2) 
print(rectangle1)

Lesson content developed by Jan Wallgrun and James O’Brien

Class attributes and static class functions

In this section we are going to look at two additional concepts that can be part of a class definition, namely class variables/attributes and static class functions. We will start with class attributes even though it is the less important one of these two concepts and won't play a role in the rest of this lesson. Static class functions, on the other hand, will be used in the walkthrough code of this lesson and also will be part of the homework assignment.

We learned in this lesson that for each instance variable defined in a class, each object of that class possesses its own copy so that different objects can have different values for a particular attribute. However, sometimes it can also be useful to have attributes that are defined only once for the class and not for each individual object of the class. For instance, if we want to count how many instances of a class (and its subclasses) have been created while the program is being executed, it would not make sense to use an instance variable with a copy in each object of the class for this. A variable existing at the class level is much better suited for implementing this counter and such variables are called class variables or class attributes. Of course, we could use a global variable for counting the instances but the approach using a class attribute is more elegant as we will see in a moment.

The best way to implement this instance counter idea is to have the code for incrementing the counter variable in the constructor of the class because that means we don’t have to add any other code and it’s guaranteed that the counter will be increased whenever the constructor is invoked to create a new instance. The definition of a class attribute in Python looks like a normal variable assignment but appears inside a class definition outside of any method, typically before the definition of the constructor. Here is what the definition of a class attribute counter for our Geometry class could look like. We are adding the attribute to the root class of our hierarchy so that we can use it to count how many geometric objects have been created in total.

class Geometry(): 
   counter = 0 

   def __init__(self, x = 0.0, y = 0.0): 
      self.x = x 
      self.y = y 
      Geometry.counter += 1 
…

The class attribute is defined in line 2 and the initial value of zero is assigned to it when the class is loaded so before the first object of this class is created. We already included a modified version of the constructor that increases the value of counter by one. Since each constructor defined in our class hierarchy calls the constructor of its base class, the counter class attribute will be increased for every geometry object created. Please note that the main difference between class attributes and instance variables in the class definition is that class attributes don’t use the prefix “self.” but the name of the class instead, so Geometry.counter in this case. Go ahead and modify your class Geometry in this way, while keeping all the rest of the code unchanged.

While instance variables can only be accessed for an object, e.g. using <variable containing the object>.<name of the instance variable>, we can access class attributes by using the name of the class, i.e. <name of the class>.<name of the class attribute>. That means you can run the code and use the statement

print(Geometry.counter)

… to get the value currently stored in this new class attribute. Since we have not created any geometry objects since making this change, the output should be 0.

Let’s now create two geometry objects of different types, for instance, a circle and a square:

Circle(10,10,10) 
Square(5,5,8)

Now run the previous print statement again and you will see that the value of the class variable is now 2. Class variables like this are suitable for storing all information related to the class, so essentially everything that does not describe the state of individual objects of the class.

Class definitions can also contain definitions of functions that are not methods, meaning they are not invoked for a specific object of that class and they do not access the state of a particular object. We will refer to such functions as static class functions. Like class attributes they will be referred to from code by using the name of the class as prefix. Class functions allow for implementing some functionality that is in some way related to the class but not the state of a particular object. They are also useful for providing auxiliary functions for the methods of the class. It is important to note that since static class functions are associated with the class but not an individual object of the class, you cannot directly refer to the instance variables in the body of a static class function like you can in the definitions of methods. However, you can refer to class attributes as you will see in a moment.

A static class function definition can be distinguished from the definition of a method by the lack of the “self” as the first parameter of the function; so it looks like a normal function definition but is located inside a class definition. To give a very simple example of a static class function, let’s add a function called printClassInfo() to class Geometry that simply produces a nice output message for our counter class attribute:

class Geometry(): 
    … 

    def printClassInfo(): 
        print( "So far, {0} geometric objects have been created".format(Geometry.counter) )

We have included the header of the class definition to illustrate how the definition of the function is embedded into the class definition. You can place the function definition at the end of the class definition, but it doesn’t really matter where you place it, you just have to make sure not to paste the code into the definition of one of the methods. To call the function you simply write:

Geometry.printClassInfo()

The exact output depends on how many objects have been created but it will be the current value of the counter class variable inserted into the text string from the function body.

Go ahead and save your completed geometry script since we'll be using it later in this lesson.

In the program that we will develop in the walkthroughs of this lesson, we will use static class functions that work somewhat similarly to the constructor in that they can create and return new objects of the class but only if certain conditions are met. We will use this idea to create event objects for certain events detected in bus GPS track data. The static functions defined in the different bus event classes (called detect()) will be called with the GPS data and only return an object of the respective event class if the conditions for this kind of bus event are fulfilled. Here is a sketch of a class definition that illustrates this idea:

class SomeEvent(): 
    ...

    # static class function that creates and returns an object of this class only if certain conditions are satisfied
    def detect(data): 
        ... # perform some tests with data provided as parameter
        if ...: # if conditions are satisfied, use constructor of SomeEvent to create an object and return that object
              return SomeEvent(...)
        else:   # else the function returns None
              return None

# calling the static class function from outside the class definition,
# the returned SomeEvent object will be stored in variable event
event = SomeEvent.detect(...)
if event: # test whether an object has been returned
    ... # do something with the new SomeEvent object

Lesson content developed by Jan Wallgrun and James O’Brien