GEOG 485:
GIS Programming and Software Development

Lesson 4 Practice Exercise B Solution


This practice exercise is a little trickier than previous exercises. If you were not able to code a solution, study the following solution carefully and make sure you know the purpose of each line of code.

The code below refers to the "winner" and "loser" of each game. When a game ends in a tie, these names simply refer to the first and second teams (and scores) listed for that game.

# Reads through a text file of soccer (football)
#  scores and reports the highest number of goals
#  in one game for each team

# ***** DEFINE FUNCTIONS *****

# This function checks if the number of goals scored
#  is higher than the team's previous max.
def checkGoals(team, goals, dictionary):
    # Check if the team has a key in the dictionary
    if team in dictionary:
        # If a key was found, check goals against team's current max
        if goals > dictionary[team]:
            dictionary[team] = goals
        else:
            pass
    # If no key found, add one with current number of goals
    else:
        dictionary[team] = goals

# ***** BEGIN SCRIPT BODY *****

import csv

# Open the text file of scores
scoresFilePath = "C:\\Users\\jed124\\Documents\\geog485\\Lesson4\\Lesson4PracticeExercises\\Lesson4PracticeExerciseB\\Scores.txt"
with open(scoresFilePath) as scoresFile:
    # Read the header line and get the important field indices
    csvReader = csv.reader(scoresFile, delimiter=" ")
    header = next(csvReader)
    
    winnerIndex = header.index("Winner")
    winnerGoalsIndex = header.index("WG")
    loserIndex = header.index("Loser")
    loserGoalsIndex = header.index("LG")
    
    # Create an empty dictionary. Each key will be a team name.
    #  Each value will be the maximum number of goals for that team.
    maxGoalsDictionary = {}
    
    for row in csvReader:
    
        # Create variables for all items of interest in the line of text    
        winner = row[winnerIndex]
        winnerGoals = int(row[winnerGoalsIndex])
        loser = row[loserIndex]
        loserGoals = int(row[loserGoalsIndex])
        
        # Check the winning number of goals against the team's max
        checkGoals(winner, winnerGoals, maxGoalsDictionary)
        
        # Also check the losing number of goals against the team's max    
        checkGoals(loser, loserGoals, maxGoalsDictionary)

    # Print the results
    for key in maxGoalsDictionary:
        print (key + ": " + str(maxGoalsDictionary[key]))

Below is a video offering some line-by-line commentary on the structure of this solution. 

Video: Solution to Lesson 4 Practice Exercise B (13:03)

Transcript of Solution to Lesson 4 Practice Exercise B

This video describes one possible solution for Lesson 4 practice exercise B where you are reading the names of soccer teams in Buenos Aires, looking at their scores, and then compiling a report about the top number of goals scored by each team over the course of the games covered by the file.

In order to maintain all this information, it's helpful to use a dictionary, which is a way of storing information in the computer's memory based on key-value pairs. So, the key in this case will be the name of the team, and the value will be the maximum number of goals found for that team as we read the file line by line. Now, this file is sort of like a comma-separated values (CSV) file, though the delimiter in this case is a space. The file does have a header, which we'll use to pull out information.
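As a quick illustration of the key-value idea described above, here is a minimal sketch of a dictionary being filled and read. The variable name max_goals and the goal counts are made up for illustration and do not come from the scores file.

```python
# Hypothetical team names and goal counts, for illustration only
max_goals = {}

max_goals["Boca"] = 2      # adds a new key-value pair
max_goals["River"] = 0     # adds a second key-value pair
max_goals["River"] = 3     # assigning to an existing key overwrites its value

has_boca = "Boca" in max_goals    # the "in" operator tests the keys
river_max = max_goals["River"]    # square brackets look up the value for a key

print(has_boca)
print(river_max)
```

Note that each key appears only once, which is why a dictionary suits a "one running maximum per team" report.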

The header is organized in terms of winner, winner goals, loser, and loser goals. Although, really, there are some ties in here. So, we might say first score and second score rather than winner and loser. For our purposes, it doesn't matter who won or lost because the maximum number of goals might have come during a loss or a tie.

So, this solution is a little more complex than some of the other practice exercises. It involves a function which you can see beginning in line 9, but I'm not going to describe the function just yet. I'll wait until we get to the point where we need the logic that's in that function. So, I'll start explaining this solution by going to line 23, where we import the Python CSV module.

Now, there's nothing in this script that uses ArcGIS or arcpy geometries or anything like that, so I don't import arcpy at all. But you will do that in Project 4, where you'll use a combination of the techniques used here along with ArcGIS geometries and really put everything together from both the practice exercises. In line 26, we set up a variable representing the path to the scores text file. And in line 27, we actually open the file. By default, I'm opening it here in read mode. The opening mode parameter is not specifically supplied here.
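To see the default opening mode in action, the sketch below writes a small made-up file to a temporary folder and then opens it without a mode argument. The file contents and the variable names are assumptions for illustration; the real script opens the Scores.txt path shown in the solution.

```python
import os
import tempfile

# Write a small made-up scores file to a temporary folder so the example is self-contained
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "Scores.txt")
    with open(sample_path, "w") as out_file:
        out_file.write("Winner WG Loser LG\nBoca 2 Independiente 1\n")

    # No mode argument is given, so Python opens the file for reading as text
    with open(sample_path) as scores_file:
        default_mode = scores_file.mode
        first_line = scores_file.readline().strip()

print(default_mode)
print(first_line)
```

The Python csv documentation also recommends passing newline="" to open() when the file will be handed to a csv reader; for a simple file like this one the default usually behaves the same.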

In line 29, we create the CSV reader object. You should be familiar with this from the other examples in the lesson and the other practice exercise. One thing that's different here is, as a second parameter, we can specify the delimiter using this type of syntax-- delimiter equals, and then the space character.
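The delimiter parameter can be tried out without the actual file. This sketch feeds the reader a made-up two-line sample through io.StringIO, which stands in for an open file; the sample text is an assumption modeled on the header names used in the solution.

```python
import csv
import io

# Made-up sample text standing in for the scores file
sample_text = "Winner WG Loser LG\nBoca 2 Independiente 1\n"

# delimiter=" " tells the reader to split each line on spaces instead of commas
reader = csv.reader(io.StringIO(sample_text), delimiter=" ")
header = next(reader)       # first row: the column names
first_row = next(reader)    # second row: the first game

print(header)
print(first_row)
```

Every item comes back as a string, including the goal counts.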

Again, this file does have a header. So in line 30, we'll read the header, and then we figure out the index positions of all of the columns in the file. That's what's going on in lines 32 through 35.

Now, we know that these columns are in a particular order.  But writing it in this way where we use the header.index method makes the script a little more flexible in case the column order had been shifted around by somebody, which could easily happen if somebody had previously opened this file in a spreadsheet program and moved things around.
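The flexibility gained from header.index can be sketched with two made-up layouts of the same game. The helper function winner_and_goals and both sample rows are hypothetical and exist only for this illustration.

```python
# Two hypothetical column orders holding the same game
header_a = ["Winner", "WG", "Loser", "LG"]
row_a = ["Boca", "2", "Independiente", "1"]

header_b = ["WG", "LG", "Winner", "Loser"]   # columns shuffled, e.g. in a spreadsheet
row_b = ["2", "1", "Boca", "Independiente"]

def winner_and_goals(header, row):
    # Look up positions by column name instead of hard-coding 0 and 1
    return row[header.index("Winner")], int(row[header.index("WG")])

result_a = winner_and_goals(header_a, row_a)
result_b = winner_and_goals(header_b, row_b)

print(result_a)
print(result_b)
```

Both calls return the same team and goal count, whereas hard-coded positions would have read the wrong columns in the second layout.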

In line 39, we're going to create a blank dictionary to keep track of each team and the maximum number of goals they've scored. We'll refer to that dictionary frequently as we read through the file.

In line 41, we begin a loop that actually starts reading data in the file below the header. And so, lines 44 through 47 are pulling out those four pieces of information-- basically, the two team names and the number of goals that each scored. Note that the int() function is used to convert the number of goals in string format to integers. Now, when we get a team name and a number of goals, we need to check it against our dictionary to see if the number of goals scored is greater than that team's maximum that we’ve encountered so far. And we need to do this check for both the winner and the loser-- or in other words, the first team and the second team listed in the row.
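The int() conversion matters because strings are compared character by character rather than numerically. A minimal sketch with made-up values:

```python
# Goal counts as they come out of the reader: strings
as_strings = "10" > "9"             # compares "1" with "9" first, so this is False

# The same values after conversion to integers
as_integers = int("10") > int("9")  # numeric comparison, so this is True

print(as_strings)
print(as_integers)
```

Without the conversion, a double-digit score could be treated as lower than a single-digit one when checking against a team's maximum.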

To avoid repeating code, this is a good case for a function: we're going to use the same logic for both teams, so why not write the code just once? So, in lines 50 and 53, you'll see that I'm invoking a function called checkGoals. And I pass in three things. I pass in the team name, I pass in the number of goals, and the dictionary.

This function is defined up here in line 9. Line 9 defines a function called checkGoals, and I create variables here for those three things that the function needs to do its job -- the team, the number of goals, and the dictionary. Line 11 performs a check to see if the team already has a key in the dictionary. If it does, then we need to look at the number of goals that have been stored for the team and check it against the score that was passed into the function to see if that maximum number of goals needs to be updated.

So in line 13, that check is occurring. And if indeed the number of goals passed into the function is greater than the maximum that we’ve run into so far, then in line 14, we update the team’s entry in the dictionary, setting it equal to what was passed into the goals variable. If the number of goals passed into the function is not greater than our maximum, then we don't want to do anything. So that's what's in line 16 where it says pass. The pass is just a keyword that means don't do anything here. And really, we could eliminate the else clause altogether if we wanted to.

Now, if the team has never been read before, and it doesn't have an entry in the dictionary, then we're going to jump down to the else clause in line 18. In line 19, we're going to add the team to the dictionary, and we're going to set its max goals to the number passed into the goals variable. No need to check against another number, since there’s nothing in the dictionary yet for that team.
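As noted above, the else clause containing pass is optional. One possible more compact variant is sketched below; it is not the course's solution, and the name check_goals_compact is hypothetical. It relies on the dictionary's get method, which returns None when the key is missing.

```python
def check_goals_compact(team, goals, dictionary):
    # get() returns None if the team has no key yet
    current_max = dictionary.get(team)
    # Store the goals for a new team, or when they beat the stored maximum
    if current_max is None or goals > current_max:
        dictionary[team] = goals

# Made-up sequence of calls for one team
maxes = {}
check_goals_compact("River", 0, maxes)   # new team: stored as 0
check_goals_compact("River", 2, maxes)   # higher: updated to 2
check_goals_compact("River", 1, maxes)   # lower: left unchanged

print(maxes)
```

The longer version in the solution spells out each branch separately, which makes it easier to follow in the debugger.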

Now, what we can do to get a better feel for how this script works is to run it with the debugging tools. So I’ve inserted a breakpoint on the first line inside the checkGoals function, and I'm going to run the script up until the first time this function gets called. And I can click on the Variables window here to follow along with what’s happening to my variables.

I can see that there are global and local variables.  The local variables are the ones defined within the checkGoals function since that's where the script execution is currently paused.  If I look at the globals list, I can find the row variable, which holds the list of values from the first line of the file.  The variables I want to focus on though are the ones defined as part of the function, the locals -- the dictionary, goals, and team variables.

So on this first call to the function we've passed in Boca as the team and 2 as the number of goals. And right now, there's nothing in our dictionary. So when evaluating line 11, we’d expect PyScripter to jump down to line 19 because the team does not have a key in the dictionary yet.  So I'll see what happens by clicking on the Step button here.  And we see that we do in fact jump down to line 19.  And if I step again, I will be adding that team to the dictionary.  I can check on that; note that it's jumped out of the function because that was the last line of the function.  It's jumped back to the main body of the script, which is another call to the checkGoals function, this time for the losing team.  While we're paused here, if I scroll down in the list of locals I can find the maxGoalsDictionary and I can see that it now has an entry for Boca with a value of 2.      

If I hit Resume again, it will run to the breakpoint again.  So now it's dealing with data for the losing team from that first match -- Independiente with 1 goal.  Again, because Independiente doesn't have a key in the dictionary, we would expect the script to jump down to line 19, and indeed that is what happens.  So when I hit Step again, it's going to add an entry to the dictionary for that second team.  

I’m going to hit the Resume button a couple more times for that second game, adding two more teams to the dictionary. Now I’m going to pause as we’ve hit a line where we have teams that we have encountered before. On this current call to the function, the team is River, and they scored 2 goals in this particular match. Looking at the dictionary, we can see that River's current maximum is 0. So, in this case we would expect on line 11 to find that yes, the team is already in the dictionary and so we'd expect it to jump down to line 13 instead of line 19.  And that's what happens.  So now we're going to check -- Is the value in the goals variable greater than their current maximum, which is 0. And indeed it is, so we’ll execute line 14, updating the dictionary with a new maximum goals for that team.

And we could continue stepping through the script, but hopefully, you get the feel for how the logic plays out now.

So as you're working on Project 4, and you're working with dictionaries in this manner, it's a good idea to keep the debugger open and watch what's happening, and you should be able to tell if your dictionary is being updated in the way that you expect.

So to finish out this script, once we have our dictionary all built, after this for loop is finished, then we're going to loop through the dictionary and print each key and each value. And that can be done using a simple for loop, like in line 56. In this case, the variable key represents a key in the dictionary. And in line 57, we print out that key, we print a colon and a space, and then we print the value associated with that key. If you want to pull a value out of a dictionary, you use square brackets, and you pass in the key name. And so, that's what we're doing there. So running this all the way through should produce a printout in the Python Interpreter of the different teams, as well as the maximum number of goals found for each.
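An alternative way to write the final loop, sketched here with a made-up dictionary, is to use the items method, which hands back each key together with its value. Sorting the pairs is an optional extra so the report comes out in alphabetical order by team.

```python
# Hypothetical finished dictionary, for illustration only
max_goals = {"River": 3, "Boca": 2, "Independiente": 1}

report_lines = []
# items() yields (key, value) pairs; sorted() orders them by team name
for team, goals in sorted(max_goals.items()):
    report_lines.append(f"{team}: {goals}")

print("\n".join(report_lines))
```

This produces the same "team: goals" lines as the solution, without the separate square-bracket lookup inside the loop.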

Credit: S. Quinn and J. Detwiler © Penn State is licensed under CC BY-NC-SA 4.0.