GEOG 489
Advanced Python Programming for GIS

1.3.4 String concatenation vs. format


In GEOG 485, we used the + operator for string concatenation to build strings from multiple components, either to print them out or to use them in some other way, as in the following two examples:

print('The feature class contains ' + str(n) + ' point features.') 

queryString = '"' + fieldName + '" = ' + "'" + countryName + "'"

An alternative to this approach using string concatenation is to use the string method format(…). When this method is invoked for a particular string, the string content is interpreted as a template in which parts surrounded by curly brackets {…} should be replaced by the variables given as parameters to the method. Here is how the two examples from above would look in this approach:

print('The feature class contains {0} point features.'.format(n))

queryString = '"{0}" = \'{1}\''.format(fieldName, countryName) 

In both examples, we have a string literal '….' and directly call the format(…) method on this string literal, which gives us a new string in which the occurrences of {…} have been replaced. In the simple form {i} used here, each occurrence of this pattern is replaced by the parameter at position i given to format(…), counting from zero. In the second example, {0} is replaced by the value of the variable fieldName and {1} is replaced by the value of the variable countryName. Please note that the second example uses \' to produce the single quotes inside the string, so that the entire template can be written as a single string literal delimited by single quotes. The numbers within the curly brackets can also be omitted if the parameters should be inserted into the string in the order in which they are given.
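As a small sketch of these numbering rules, the following snippet builds the query string once with explicit indices and once with the indices omitted, and then reuses an index within one template. The variable names field_name and country_name and their values are made up for illustration.

```python
# Hypothetical values, chosen only for this illustration.
field_name = 'NAME'
country_name = 'Austria'

# Explicit indices: {0} is the first parameter, {1} the second.
explicit = '"{0}" = \'{1}\''.format(field_name, country_name)

# Omitted indices: parameters are inserted in the order in which they are given.
implicit = '"{}" = \'{}\''.format(field_name, country_name)

# Explicit indices may be repeated or used out of order.
repeated = '{1} is looked up in field {0}; {1} is a country.'.format(field_name, country_name)

print(explicit)
print(implicit)
print(repeated)
```

Both of the first two calls should produce the same query string. Note that a single template cannot mix the two styles: combining {} and {0} in one string raises a ValueError.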

The main advantages of using format(…) are that the string can be a bit easier to write and read, as the second example in particular shows, and that we don’t have to explicitly convert all non-string variables to strings with str(…). In addition, format(…) allows us to include information about how the values of the variables should be formatted. By using {i:n}, we say that the value of the i-th variable should be padded to n characters if its string representation is shorter than that. For strings, this will by default be done by adding spaces after the actual string content, while for numbers, spaces will be added before the actual string representation of the number. In addition, for numbers, we can also specify the number d of decimal digits that should be displayed by using the pattern {i:n.df}, where n is still the minimum total width of the result. The following example shows how this can be used to produce some well-formatted list output:

items = [('Maple trees', 45.232), ('Pine trees', 30.213), ('Oak trees', 24.331)]

for i in items:
    print('{0:20} {1:3.2f}%'.format(i[0], i[1]))

Output:

Maple trees          45.23%
Pine trees           30.21%
Oak trees            24.33%

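The pattern {0:20} pads each tree species name with trailing spaces to a width of 20 characters. In the pattern {1:3.2f}, the 3 is the minimum total width of the formatted number (including the decimal point and the decimal digits), not the number of characters before the decimal point, and .2f rounds the value to two digits after the decimal point. Since every percentage here already takes up five characters (for example 45.23), the minimum width of 3 has no visible effect; the numbers line up because the names are all padded to the same width and the percentages all have the same number of digits.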
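To illustrate how the width in these patterns behaves, here is a minimal sketch that uses percentages with different numbers of digits. The sample data and the width of 6 are assumptions chosen for this illustration, not values from the lesson.

```python
# Hypothetical sample data with percentages of different magnitudes.
items = [('Maple trees', 45.232), ('Ash trees', 5.5), ('All trees', 100.0)]

# Default alignment: strings are padded on the right, numbers on the left.
name_cell = '{0:12}|'.format('Oak')
count_cell = '{0:12}|'.format(42)
print(name_cell)
print(count_cell)

# A width of 3 is smaller than every formatted number, so nothing is padded.
narrow = ['{0:3.2f}%'.format(pct) for name, pct in items]

# A width of 6 fits the longest value ('100.00'), so shorter ones are padded.
wide = ['{0:6.2f}%'.format(pct) for name, pct in items]

for (name, pct), cell in zip(items, narrow):
    print('{0:20} {1}'.format(name, cell))

print()

for (name, pct), cell in zip(items, wide):
    print('{0:20} {1}'.format(name, cell))
```

In the first table the decimal points should not line up, because the numbers have different lengths and the width of 3 adds no padding. In the second table every number occupies six characters, so the columns align.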

The format method can do a few more things, but we are not going to go into further details here. Check out this page about formatted output if you would like to learn more about this.