Chapter 3: Can I Map That? Maps to Depict Anything in Our World

Overview

Maps are both the raw material and the product of geographic information systems (GIS). All maps represent features and characteristics of locations, and that representation depends upon data relevant at a particular time. All maps are also selective; they do not show us everything about the place depicted; they show only the particular features and characteristics that their maker decided to include. Maps are often categorized into reference or thematic maps based upon the producer’s decision about what to include and the expectations about how the map will be used. The prototypical reference map depicts the location of “things” that are usually visible in the world; examples include road maps and topographic maps (depicting terrain). The U.S. Geological Survey (USGS) website below (Figure 3.1) provides examples of the standard topographic map produced today along with other example reference maps and a wide range of other information (see: nationalmap.gov/ustopo).

A topographical map for the Old Lyme Quadrangle, Conneticut

Figure 3.1: USGS National Map website with example topographic map shown.

Credit: USGS.

Thematic maps, in contrast, typically depict “themes.” They generally are more abstract, involving more processing and interpretation of data, and often depict concepts that are not directly visible; examples include maps of income, health, climate, or ecological diversity. There is no clear-cut line between reference and thematic maps, but the categories are useful to recognize because they relate directly to how the maps are intended to be used and to decisions that their cartographers have made in the process of shrinking and abstracting aspects of the world to generate the map.

For example, with a highway map (another example of a typical reference map), we expect the cartographer to take great care in accurately depicting road location, since the map’s main purpose is to act as a reference to the road network. In contrast, on a thematic map of U.S. unemployment rates focused on those rates, the base information such as state boundaries can be quite abstract without impeding our ability to understand the map. In Mapping it Out: Expository cartography for the humanities and social sciences, Mark Monmonier proposed the U.S. visibility map (Figure 3.2), adjusting the areas and shape of each state in order to help map users see all states, especially smaller states such as Rhode Island.

Figure 3.2: Visibility base map for United States (left) and electric vehicle charging stations per 100 square miles for each state (right). Note particularly how much more visible the small states on the eastern seaboard are.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Derived using a visibility map boundary file courtesy of Mark Monmonier and data from U.S. Department of Energy.

A flat sheet of paper is an imperfect but useful analog for geographic space. Notwithstanding the intricacies of spherical coordinate systems and map projections (see chapter 2.3 and 2.4), it is a fairly straightforward matter to plot points that stand for locations on the globe. Representing the attributes of locations on maps is sometimes not so straightforward, however. Abstract graphic symbols must depict, with minimum ambiguity, the quantities and qualities of the locations they represent. Over more than a century, with particular attention to thematic maps, cartographers have adopted and tested map symbolization principles through which geographic data are transformed into useful information. These principles focus on how color, size, shape, and other components of map symbols are used to represent characteristics of the geographic data depicted. As one example, the map below (Figure 3.3) uses variations in color to represent geographic differences in population change over a decade in the U.S.

Figure 3.3: Population change in the United States, by county, in percentage change from 1990 to 2000. The largest percentage decreases are shown in dark blue and are concentrated in the great plains and the largest increases are shown in dark red; these tend to be in southern and mountain west counties.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; After figure in Chapter 3, DiBiase, 2012. (Data from 1990 & 2000 decennial censuses).

The map above makes it easy to see where the U.S. population changed, by county, from 1990 to 2000 as well as where there was little change. To gain a sense of the power of thematic maps in transforming data into information, we need only to compare the map above to a list of population change rates for the more than 3,000 counties of the U.S [1]. that it is based upon. The thematic map reveals geographic patterns that would be virtually impossible to recognize from the table.

Maps of people, like the one above, are just one example of the nearly infinite variety of thematic maps that can be generated from today’s geographic data. This chapter will introduce the “cartographic process” through which maps are generated and then examine thematic maps specifically through exploration of diverse examples and introduce the most common (and a few uncommon) thematic mapping methods and how to interpret them.

Objectives

Students who successfully complete Chapter 3 should be able to:

understand the core components of the cartographic process;
understand the basic “graphic variables” of map symbolization and how they are used;
recognize thematic maps of different types, identify their purpose, and interpret maps within each type;
understand the data requirements of different thematic map types and recognize maps that depict data in inappropriate or otherwise misleading ways;
understand the implications of data categorization for what maps show (and do not show) about the phenomenon in the world that the map and data behind it represent;
select the most appropriate map type to represent a given set of data.

Chapter lead author: Jennifer M. Smith.
See citation to the full text and its precursor below; portions drawn directly from that text.

3.1 The Cartographic Process

Today, maps can be produced easily through a wide range of online tools by anyone with access to the Internet. Maps used in most activities (from urban planning, through geological exploration or environmental management, to trip planning and navigation), however, are still typically produced by professionals with expertise in mapping or in the phenomena being depicted on the maps. The academic and professional field that focuses on mapping is called “cartography.” Cartography has been defined by the International Cartographic Association as “the discipline dealing with the conception, production, dissemination and study of maps.” One useful conceptualization of cartography is as a process that links map makers, map users, the environment mapped, and the map itself. One characterization of this process is depicted in Figure 3.4 below.

The Cartographic Process as described in paragrapgh below.

Figure 3.4: The Cartographic Process.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Redesigned after lecture slide provided by Barbara Buttenfield, University of Colorado, Department of Geography.

The cartographic process is a cycle that begins with a real or imagined environment. As map makers collect data from the environment (through technology and/or remote sensing), they use their perception to detect patterns and subsequently prepare the data for map creation (i.e., they think about the data and its patterns as well as how to best visualize them on a map). Next, the map maker uses the data and attempts to signify it visually on a map (encoding), applying generalization, symbolization, and production methods that will (hopefully) lead to a depiction that can be interpreted by the map user in the way the map maker intended (its purpose). Next, the map user reads, analyzes, and interprets the map by decoding the symbols and recognizing patterns. Finally, users make decisions and take action based upon what they find in the map. Through their provision of a viewpoint on the world, maps influence our spatial behavior and spatial preferences and shape how we view the environment.

In the cartographic process as outlined above, the fundamental component in generating a map to depict the environment is itself a process – the process of map abstraction. This is the topic we discuss next.

Practice Quiz

Registered Penn State students should return now to the Chapter 3 folder in Canvas to take a self-assessment quiz about the Introduction.

You may take practice quizzes as many times as you wish. They are not scored and do not affect your grade in any way.

3.1.1 Map Abstraction

It has become possible to map the world on the head of a pin, or even a smaller space, as shown here: Art of Science: World on the Head of a Pin [2], but, most details get left out. Even to achieve a screen-sized map of the world on your computer, map abstraction is fundamental to representing entities in a legible manner. The process of map abstraction includes at least five major (interdependent) steps: (a) selection, (b) classification, (c) simplification, (d) exaggeration, and (e) symbolization (Muehrcke and Muehrcke, 1992).

3.1.1.1 Selection

Depending on a map’s purpose, cartographers (map makers) select what information to include and what information to leave out. As Phillip Muehrcke (an Emeritus Professor of Geography from the University of Wisconsin) details, the cartographer must answer four questions: Where? When? What? Why? As an example (Figure 3.5), a cartographer can create a map of San Diego (where) showing current (when) traffic patterns (what) so that an ambulance can take the fastest route to an emergency (why).

Screenshot of San Diego Real-Time Traffic Application. Shows current speeds on each road

Figure 3.5: Screenshot of San Diego Real-Time Traffic Application; to try out the map, see: Dot.ca.gov San Diego Map [3].

Credit: California Department of Transportation.

The map in Figure 3.5 shows how a cartographer selected specific highways to include along with a few other features; these other features include a very generalized representation of the terrain, a few major rivers and lakes, and an indication of the area included in each of several communities (in pastel colors). The objective is to help drivers pick efficient routes by depicting the highways and whether traffic is moving quickly (green) or stalled (red). Other information is kept to a minimum and visually pushed to the background; that extra information is included to provide context for the primary focus (the highways and traffic on them).

3.1.1.2 Classification

Classification is the grouping of things into categories, or classes. By grouping attributes into a few discernible classes, new visual patterns in the data can emerge and the map becomes more legible. In the example above, the highways are classified into those without traffic detectors (gray) and those with traffic detectors (in color) and furthermore, within the latter, into slow (red), intermediate (yellow), and fast (green) travel conditions. There are many kinds of data classification used on maps; we will focus specifically on classification of numerical map data in more detail later on in the chapter. As a preview of some of the things map readers must consider about classification, the example below shows one dataset for the rate of prostate cancer by county in Pennsylvania mapped using a different number of classes. As you can see, different patterns emerge depending upon how many classes the cartographer chooses to visualize. One must be critical when looking at maps because changing the map classification can change what appears to be true. In How to Lie With Maps, Mark Monmonier discusses how mapmakers intentionally and unintentionally lie through techniques such as map classification, among others.

Rate of Prostate Cancer in PA maps. Map on right with 5 classes is much more specific

Figure 3.6: Incidence rate of prostate cancer per 100,000 persons per county in Pennsylvania, visualized using three classes (left) and five classes (right).

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Redesigned after PA Cancer Atlas from Penn State University GeoVISTA Center.

3.1.1.3 Simplification

Cartographers also need to simplify the features on a map beyond the tasks of feature type selection and feature classification in order to make a map more intelligible. This includes choosing to delete, smooth, typify, and aggregate entities within feature types. In the process of deleting entities, imagine creating a map of cities for the United States. As illustrated in Figure 3.7, attempting to include every city in the U.S. would render the map illegible. Map makers must delete, for instance, cities below a certain population (as done in the map on the right) in order to better serve the purpose of the map. In this case, if the purpose was to show the most populous cities, a fixed population threshold produces a very appropriate result. If, however, the purpose was to show the most important cities in the region, then an arbitrary population threshold does not work since, for example, Salt Lake City is just as important to Utah as Phoenix is to Arizona.

Simplification of cities in the western United States. Right map only has 6 cities while left map has many more

Figure 3.7: Simplification of cities in the western United States by deleting cities with populations below 500,000.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Data from U.S.G.S. for cities and state boundaries from U.S. Census Bureau.

Smoothing is the act of eliminating unnecessary elements in the geometry of features, such as the superfluous details of a nation’s shoreline that can only be seen at a larger, zoomed-in regional scale. Typification depicts just the most typical components of the mapped feature. The visibility map above is a good example of typification in which the actual geographic shape of state boundaries is replaced with what might be considered a caricature that retains only key aspects of each state’s shape. Going beyond the simplification processes that act on one feature at a time, aggregation combines multiple features into one. Imagine a river composed of numerous meandering streams at a large scale (i.e., zoomed in), but when moving to a smaller scale (i.e., zooming out), the streams are merged into one larger river as it becomes impossible to maintain the detail. If you visit Google Maps [4] and zoom in to Harrisburg, Pennsylvania, you will find the Susquehanna River flowing through the middle of the capital. As you zoom out to a smaller scale, you will view the various smaller streams of the Susquehanna begin to collapse into a single blue line as the details of the river aggregate.

Try This: Practice Simplification in MapShaper

The purpose of this practice activity is to show you a visual example of simplification and smoothing of geographic features in the online MapShaper application.

Go to the MapShaper site at MapShaper.org [5].
Choose one of their sample layers (World Countries or Provinces of Thailand) tab and select OK.
Choose a simplification methods of your choice, and use the slider at the bottom of the page to increase the level of simplification of the mapped features.

I encourage you to experiment with the various methods and settings to see how simplification eliminates unnecessary elements as you move through different map scales.

3.1.1.4 Exaggeration

Deliberate exaggeration of map features is often performed in order to allow certain features to be seen. For instance, on a standard paper highway map of Pennsylvania (the fold-up kind you might have in the glove box of your car, thus about 3 feet across when unfolded), interstate highways are printed at roughly 0.035 inches in width. That sounds pretty small, right? But, if the width of the printed road relative to the map width was the same as the width of the actual highway relative to the width of Pennsylvania, it would mean that the the Interstate was nearly 2000 feet wide! This is a typical case of exaggeration to create an abstraction that is useful for travel.

3.1.1.5 Symbolization

In the final process of creating a map, the cartographer symbolizes the selected features on a map. These features can be symbolized in visually realistic ways, such as a river depicted by a winding blue line. But many depictions are much more abstract, such as a circle or star representing a city. Map symbols are constructed from more primitive “graphic variables, the elements that make up symbols. Below, we provide a brief overview of these core graphic variables; then we focus on how color in particular is used (or should be used).

3.1.1.5.1 Graphic Variables

Given the large variety of maps that exist, it might be surprising to learn that the visual appearance of all maps starts from a very small set of display primitives from which all those variations can be constructed. We call these primitives graphic variables because each represents a “graphic” (visible) feature of a map symbol that can be “varied.” While different cartographers have identified a slightly different set of primitives, most agree that there are somewhere between 7 and 12 of them from which all maps symbolization can be constructed. The most commonly cited primitives that can be varied for map symbols are: location, size, shape, orientation, texture, and three components of color – color hue (red, green, blue, etc.), color lightness (how light or dark the color is), color saturation (how pure the color hue is). By convention, each of these "graphic variables" is used to represent particular categories of data variation.

Common Graphic Variable Examples as discussed above

Figure 3.8: Common Graphic Variable Examples.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University.

3.1.1.5.2 Color Schemes

As you can see above, three of the graphic variables are components of color. Color is particularly important for map symbolization today since so many maps are seen online where color is always available and nearly always used. While most maps you will see use color to depict data (as well as in aesthetic ways), many maps do not use color in the most logical ways in relation to the data being depicted. Well designed maps use variations in the three color variables in ways that reflect the kinds of variations in the underlying data they represent. Below, we provide a few simple guidelines that will allow you to recognize maps that use color in logical as well as illogical ways. Recognizing the latter is particularly important so that you are not misled by maps you encounter.

To help cartographers (and others) select good colors for maps, Dr. Cynthia Brewer and Dr. Mark Harrower developed Color Brewer (ColorBrewer2.org [6]), a web app designed to help users pick colors based on data type, number of data classes, and mode of map presentation (i.e., printing, photocopying). The color schemes have been tested with users who have color deficiency (about 8% of the population; difficulty distinguishing red from green is the most common). The web app allows users to interact with a map template by changing colors, background, borders, and terrain. There are three main color scheme forms a user can choose from: sequential, diverging, and categorical. Each is appropriate for specific kinds of data as detailed below.

Sequential color schemes should be employed when data is arranged from a low to a high data value (e.g., data for mean annual income by county in Pennsylvania). This sequential scheme aligns colors from light (depicting low data values) to dark (depicting high data values) in a step-wise sequence. Sequential schemes can rely on only color lightness as shown below (Figure 3.9) at left or may add some color hue variation to enhance differences in categories will retaining the clear visual ordering as shown at right. As an example, Figure 3.10 uses a 4-class purple sequential scheme to depict Avian Influenza, with a focus on Eurasia.

Screenshots of sequential color schemes.

Figure 3.9: Screenshot of a single hue sequential color scheme for 5 classes (left) and a multi-hue sequential color scheme for 5 classes (right).

Credit: http://colorbrewer2.org [7].

Highly Pathogenic Avian Influenza in Humans with sequential color scheme to show impact. High impact in China, low in Nigeria

Figure 3.10: Reported H5N1 Cases (Avian Flu) Per Country from January 1, 2003 to December 31, 2008.

Credit: Created by Paulo Rapolo.

Diverging color schemes highlight an important midrange or critical value of ordered data as well as the maximum and minimum data values. Two contrasting dark hues converge in color lightness at the critical value. This is the scheme used for the population change map in Figure 3.3 above in which the critical dividing point is zero change.

Screenshot of a diverging color scheme for 5 classes. Blue, Light blue, yellow, orange, red

Figure 3.11: Screenshot of a diverging color scheme for 5 classes.

Credit: http://colorbrewer2.org [7].

Unlike the ordered data mentioned in the previous color schemes, qualitative color schemes are used to present categorical data, or data belonging to different categories. Different hues visually separate each of the different classes, or categories. The map in Figure 3.13 employs a qualitative color scheme of three different colors (red, blue, green) to represent different categories (coke, pop, and soda respectively).

Screenshot of a qualitative color scheme for 5 classes. Orange, Purple, Green, Blue, Red

Figure 3.12: Screenshot of a qualitative color scheme for 5 classes.

Credit: http://colorbrewer2.org [7].

Popular term (coke, pop, or soda) by majority for each of the contiguous states. North uses Pop, South uses Coke and the Coasts use Sode

Figure 3.13: Popular term (coke, pop, or soda) by majority for each of the contiguous states.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Data from www.popvssoda.com [8].

Practice Quiz

Registered Penn State students should return now to the Chapter 3 folder in Canvas to take a self-assessment quiz about Map Abstraction.

You may take practice quizzes as many times as you wish. They are not scored and do not affect your grade in any way.

3.2 Thematic Maps

As introduced above, unlike reference maps, thematic maps are usually made with a single purpose in mind. Often, that purpose has to do with revealing the spatial distribution of one or two attribute data sets (e.g., to help readers understand changing U.S. demographics as with the population change map). Alternatively, thematic maps can have a decision-making purpose (e.g., to help users make travel decisions as with the real-time traffic map).

In the rest of this chapter, we will explore different types of thematic maps and consider which type of map is conventionally used for different types of data and different use goals. A primary distinction here is between maps that depict categorical (qualitative) data and those that depict numerical (quantitative) data.

3.2.1 Mapping Categorical Data

As mentioned in the section on color schemes, categorical data are data that can be assigned to distinct non-numerical categories. For example, the category of a beach could not be described as two times the value of a wetland; it is different in kind rather than amount. In mapping categorical data, cartographers often focus on displaying the different categories or classes through shape or color hue. The CrimeViz map application (CrimeViz [9]) developed in the GeoVISTA Center at Penn State visualizes violent crimes reported from the District of Columbia Data Catalog (DC Data Catalog [10]). Every crime location is displayed as a circular point, where each crime category is differentiated through hue (arson: orange, homicide: purple, sexual abuse: blue). This interactive map application allows map users to explore and find new patterns across space and time.

Screenshot of the features of CrimeViz: Map Panel, Data Layers Panel and Temporal Panel. Crime locations

Figure 3.14: Screenshot of the features of CrimeViz.

Credit: http://www.geovista.psu.edu/CrimeViz/ [9].

Aside from altering color to represent different categories on a map, changing the shape of a point symbol can help map users differentiate different groups. The Ushahidi (signifying “testimony” in Swahili) website [11] developed an online crowd sourcing map application [12]. Following the election in 2008, many Kenyans believed the new president manipulated votes in his favor, which led to violence throughout the country. Users of the Ushahidi website were prompted to report acts of violence in Kenya. Their map, automatically generated from the reports, displays different types of incidents by varying the shape of the point feature (fire: all categories, push pin: specific type of violence, dove: peace efforts, people: displaced people). In addition, each subcategory of violence (represented by push pins) is contrasted by differing hues (blue: riots, orange: deaths, and so on). The tools to create this mapping application have been distributed for free around the world and are now used for a wide array of crisis mapping applications. One recent example is their application to generate maps of sexual violence in Syria (Women Under Siege: Syria Crowdmap [13]); and for those who read Japanese, the tools were applied to the Japan Earthquake and subsequent nuclear disaster [14].

Screenshot of the features of Ushahidi: shape of point symbol characterize data

Figure 3.15: Screenshot of the features of Ushahidi.

Credit: http://legacy.ushahidi.com/index.asp [15].

Categorical aspects of linear features can also be visualized on a map. In the figure below, different gas pipelines owned by various companies are depicted in different color hues. The dashed pink line in the top left of the figure represents a proposed gas line from Alaska that could send up to 4.5 billion cubic feet of natural gas a day to the conterminous United States. In this map, the cartographer uses the process of map abstraction for the purpose of displaying the current and proposed gas pipeline network. First, only necessary features (pipelines, territories and major cities) are selected for display in order to produce a clean and legible map. Next, the linear pipeline network is classified into several groups based upon distinct companies. The map is simplified by visualizing only major cities important to the gas pipeline network. The width of the pipeline is constant across the entire system, exaggerating the actual width (if the width of lines represented real-world diameter of the pipes proportionally, the real pipes would be 16 miles across). Finally, the classified/categorical data (the different pipeline companies) is symbolized by different color hues to represent the qualitative difference among the categories.

Map of the Gas Pipline Network From Canada to the United States. 3 Canadian lines to 10 US lines

Figure 3.16: Map of the Gas Pipline Network From Canada to the United States.

Credit: http://www.arcticgas.gov/Moving-Alaska-gas-from-Canada-to-the-Lower-48 [16].

The maps above focus on depiction of specific discrete entities, things that have a label we use when discussing them. Categorical maps can also represent characteristics of extended areas or territories. In this case, rather than categorizing discrete entities, we categorize the characteristics of the place, and those places may or may not have precise boundaries. A prototypical example is a land use map in which all areas of the map fall into one of a set of distinct land use categories. The most common method to depict this kind of data is to fill the area with a color or a texture. Below is an example in which land use is depicted very abstractly. All places are assigned to one of only three categories: agriculture, forest, or developed.

Map of land use in the Spring Creek Watershed in central Pennsylvania. Developed land surrounded by forest

Figure 3.17: Map of land use in the Spring Creek Watershed in central Pennsylvania. In this map, “Developed” is a broad category that includes commercial, residential, and all other land uses that are not explicitly agriculture or forest.

Credit: This map was produced in Riparia, a Center in the Dept. of Geography at Penn State focused on wetlands and watershed management: http://www.wetlands.psu.edu/ [17] (map provided by Dr. Robert Brooks).

Practice Quiz

Registered Penn State students should return now to the Chapter 3 folder in Canvas to take a self-assessment quiz about Mapping Categorical Data.

You may take practice quizzes as many times as you wish. They are not scored and do not affect your grade in any way.

3.2.2 Mapping numerical data

When data are numerical, the mapping focus is typically on representing at least relative rank order among the entities depicted, with some maps trying to represent magnitudes in a direct way. A wide array of map types has been developed over the years to represent numerical data. Here, we will introduce some of the most common map types you are likely to encounter. There is a growing number of online tools that you can use to generate these common map types yourself.

We begin by introducing one of the most common thematic map types for numerical data, the choropleth map. This is followed by a brief discussion of the U.S. Census as an important source of numerical data that is depicted on choropleth thematic maps as well as on other thematic map types. We then introduce three important additional map types you are likely to encounter frequently: proportional symbol maps, dot maps, and cartograms.

Try This: Thematic Mapping of Flu Trends

Google collects certain search terms that users input because they are key indicators of flu among users. Visit Google Flu Trends [18] and explore current flu trends around the world that have been numerically classified from minimal to intense activity and mapped. Pick a country that has flu activity. Do you see any geographic patterns within the country? How does this year compare to the past?

3.2.2.1 Choropleth mapping

Choropleth maps are among the most prevalent types of thematic maps. Choropleth maps represent quantitative data that is aggregated to areas (often called “enumeration units”). The units can be countries of the world, states of a country, school districts, or any other regional division that divides the whole territory into distinct areas. The term choropleth is derived from the Greek; khōra 'region' + plēthos 'multitude' (thus, be careful not to mix up “choro”, which has no ‘l’, with the “chloro” of chlorophyll or chlorine). Choropleth maps depict quantities aggregated to their regions by filling the entire region with a shade or color. Typically, the quantities are grouped into “classes” (representing a range in data value) and a different fill is used to depict each class (see section 3.2.6 for more on data classification). The goal of choropleth maps is to depict the geographic distribution of the data magnitudes; ideally the choice of fill will communicate the range from low data magnitudes to high magnitudes through an obvious change from light to dark as in Figure 3.18 below. Choropleth maps should use either a sequential color scheme (as below) or a diverging color scheme depending upon whether there is a meaningful break point in the data from which values diverge or the data simply range from low to high (see section 2.1.5.2 above).

Hispanic population density in the U.S. by state. High density along coasts

Figure 3.18: Hispanic population density in the U.S. by state, using a single hue sequential color scheme that depicts the range of data values from low to high with light to dark color values.

Credit: Cartography by Geoff Hatchard.

To generate eye-catching maps with easily distinguishable data classes, choropleth maps often combine color hue differences with a change in color lightness (as with the yellow, through orange, to dark red scheme depicted in Figure 3.18 above). But many maps get produced without following that cartographic rule, leading to some very colorful but misleading maps as shown in the pair below.

Figure 3.19: Misleading population maps due to color choice. On the left, the data values diverge from no change to large increases and decreases, but the fact that most of the US has increases is a lot harder to determine from this map than from Figure 3.3, and regional clusters are harder to recognize as well. On the right, the data are ordered, but the color scheme applied is not visually sequential, so geographic patterns are very hard to identify.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Data from U.S. Census Bureau.

Choropleth maps are most appropriate for representing derived quantities, as represented in Figure 3.18 above. Derived quantities relate a data value to some reference value. Examples include density, average, rate, and percent. A density is a count divided by the area of the geographic unit to which the count was aggregated (e.g., the total population divided by the number of square kilometers to produce population/square mile, as in Figure 3.18). An average is a measure of central tendency, specifically the mean value calculated as a total amount divided by the number of entities producing the amount (e.g., the average income for a county calculated by totaling the income of all people in the country and dividing by the number of people). A rate is a quantity that tells us how frequently something occurs, a value compared to a standard value (e.g., Bradford County, PA had a rate of 45.1/100,000 deaths due to colorectal cancer among women over the period of 1994-2002). A percent is the proportion of a total (and can range from 0-100%). While choropleth maps are best for these derived quantities, you will also encounter choropleth maps used for counts (e.g., the number of crimes committed, votes cast in an election, etc.). When you do, it is important to read the map with caution because big regions are likely to have high totals just because they are big.

Total population count by state and population density by state. General trend greater population smaller population per sq mile

Figure 3.20: Total population count by state (left) and population density by state (right).

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Data from U.S. Census Bureau.

3.2.2.2 Census Data

Some of the richest sources of attribute data for thematic mapping, particularly for choropleth maps, are national censuses. In the United States, a periodic count of the entire population is required by the U.S. Constitution. Article 1, Section 2, ratified in 1787, states (in the last paragraph of the section shown below) that “Representatives and direct taxes shall be apportioned among the several states which may be included within this union, according to their respective numbers ... The actual Enumeration shall be made [every] ten years, in such manner as [the Congress] shall by law direct." The U.S. Census Bureau is the government agency charged with carrying out the decennial census.

Figure 3.21: A portion of the Constitution of the United States of America (preamble and first three paragraphs of Article 1).

Credit: Obtained from: http://www.archives.gov/exhibits/charters/charters_downloads.html [19].

The results of the U.S. decennial census determine states' portions of the 435 total seats in the U.S. House of Representatives. The thematic map below (Figure 3.22) shows states that lost and gained seats as a result of the reapportionment that followed the 2000 census. This map, focused on the U.S. by state, is a variant on a choropleth map. Rather than using color fill to depict quantity, color depicts only change and its direction, red for a loss in number of Congressional seats, gray for no change, and blue for a gain in number of Congressional seats. Numbers are then used as symbols to indicate amount of change (small -1 or +1 for a change of 1 seat and larger -2 or +2 for a change of two seats). This scaling of numbers is an example of the more general application of “size” as a graphic variable to produce “proportional symbols” – the topic we cover in detail in the section on proportional symbol mapping below.

Reapportionment of the U.S. House of Representatives in 2000.

Figure 3.22: Reapportionment of the U.S. House of Representatives as a result of the 2000 census.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; After figure in Chapter 3, DiBiase, 2012. (Data from U.S. Census Bureau, generalized in MapShaper and Alaska and Hawaii boundaries from www.naturalearthdata.com [20]).

Congressional voting district boundaries must be redrawn within the states that gained and lost seats, a process called redistricting. Constitutional rules and legal precedents require that voting districts contain equal populations (within about 1 percent). In addition, districts must be drawn so as to provide equal opportunities for representation of racial and ethnic groups that have been discriminated against in the past. Further, each state is allowed to create its own parameters for meeting the equal opportunities constraint. In Pennsylvania (and other states), geographic compactness has been used as one of several factors. Article II, Section 16 of the Pennsylvania Constitution says:

§ 16. Legislative districts.

The Commonwealth shall be divided into 50 senatorial and 203 representative districts, which shall be composed of compact and contiguous territory as nearly equal in population as practicable. Each senatorial district shall elect one Senator, and each representative district one Representative. Unless absolutely necessary no county, city, incorporated town, borough, township or ward shall be divided in forming either a senatorial or representative district. (Apr. 23, 1968, P.L.App.3, Prop. No.1). Source: http://www.legis.state.pa.us/WU01/LI/LI/CT/HTM/00/00.002..HTM [21]

Whether districts determined each decade actually meet these guidelines is typically a contentious issue and often results in legal challenges. Below, the Congressional District map for PA that defines the boundaries of districts for the 112^th Congress illustrates how irregular districts can be. District 12 has a particularly interesting shape.

Congressional districts of Pennsylvania map.

Figure 3.23: Congressional districts of Pennsylvania.

Credit: The National Atlas.

Beyond the role of the census of population in determining the number of representatives per state (thus in providing the data input to reapportionment and redistricting), the Census Bureau's mandate is to provide the population data needed to support governmental operations, more broadly including decisions on allocation of federal expenditures. Its broader mission includes being "the preeminent collector and provider of timely, relevant, and quality data about the people and economy of the United States". To fulfill this mission, the Census Bureau needs to count more than just numbers of people, and it does. We will discuss this in more detail later (in section 3.3, Thinking about aggregated data: Enumeration versus samples).

3.2.2.3 Proportional Symbol Mapping

Besides reapportionment and redistricting, U.S. Census counts also affect the flow of billions of dollars of federal expenditures, including contracts and federal aid, to states and municipalities. In 2011, for example, some $486 billion of Medicaid funds were distributed according to a formula that compared state and national per capita income. $93 billion worth of highway planning and construction funds were allotted to states according to their shares of urban and rural population. And $120 billion of Unemployment Compensation was distributed from the Federal level. The thematic maps below (using historical data from 1995) illustrate the strong relationship between population counts and the distribution of federal tax dollars using proportional symbols (symbols in which the graphic variable of size is used to depict data magnitude).

Figure 3.24. Population and federal expenditures, by state, 1995.

Credit: Cartography by Thad Lenker. Data from U.S. Census Bureau, Federal Expenditures by State, http://www.census.gov/prod/2/gov/fes95rv.pdf [22].

There are two types of point features that are typically depicted with proportional symbols: features for which the data represents a geographic position directly (e.g., gallons of oil from individual oil wells), and features that are geographic areas to which data are aggregated and the data magnitudes are assigned to a representative point within the area (e.g., the geographic centroid of a state as in the examples above). In either case, the area of the symbol is scaled to represent the data magnitude, sometimes with a bit of exaggeration to adjust for a general tendency of human vision to underestimate differences in area. A variant on this direct data-to-symbol scaling groups values into categories first, then scales the symbol to represent the mean for the category, assigning a symbol to each place to represent the category range that the mean for the place falls within (see Figure 3.25 below).

Figure 3.25: Unemployment Percentages in 2000 in the United States, with each circle representing a category with the percentage range specified in the legend at the right.

Credit: Cartography by Jennifer M. Smith.

One important characteristic of proportional symbols is that they can easily be designed to represent more than one data value per location. Among the most common example is a “pie chart map” in which a circle is scaled proportionally to some total, and the size of wedges within the circle is scaled to depict a proportion of a total for two or more sub-categories. The map below uses circle size to depict population totals in each state, and the pie slices then depict the proportion of that total who identify as Hispanic compared to those who are non-Hispanic.

Rate percents of Hispanic population as percent of total population of each state

Figure 3.26: A "pie chart " map that depicts rate percents of Hispanic population as a percent of total population.

Credit: Cartography by Geoff Hatchard.

Practice Quiz

Registered Penn State students should return now to the Chapter 3 folder in Canvas to take a self-assessment quiz about Choropleth Mapping, Census Data, and Proportional Symbol Mapping.

You may take practice quizzes as many times as you wish. They are not scored and do not affect your grade in any way.

3.2.2.4 Dot Mapping

For data that represent an area, proportional symbols are a fairly extreme abstraction. They provide a very simple overview of data magnitudes geographically but hide any geographic variation that might occur inside the enumeration units to which the data are aggregated. An alternative is the dot map. Dot maps depict magnitude by frequency rather than the size of symbol and add the depiction of geographic distribution by use of the graphic variable of location. Specifically, dot maps assign one to many dots per enumeration area to represent a specific count in each area. The difference between a dot map and a simple map of point features is that each dot represents more than one entity and the locations are representative of the distribution rather than being exact locations. Specifically, dots that represent some count are placed within enumeration units to represent generally where the feature or attribute occurs.

In the example below, the dot map depicts the size of the Hispanic population by the number of dots per state. Each dot represents 100,000 people in this case, and the general geographic distribution of the Hispanic population within the state is signified by the position of the dots. Not surprisingly, dot maps can vary substantially in how well the distribution of dots on the map represents the actual distribution of the phenomena in the world. Cartographers typically use secondary sources of information to help them decide on the appropriate locations for the dots (e.g., land use maps, satellite images, or statistics collected for smaller geographic units like counties). But, the position of dots usually is based on an educated estimate of distribution rather than on any direct measurement of where the people (in this case) or automobiles or bushels of wheat (or the many other kinds of things we can count) actually are.

A dot density map that depicts count data of hispanic population in the US.

Figure 3.28: A "dot density" map that depicts count data.

Credit: Cartography by Geoff Hatchard.

3.2.2.5 Cartograms

A cartogram can be considered a special case of proportional symbol mapping. But, in this case, the “symbol” that is scaled in proportion to a data magnitude is the geographic area for which data are aggregated. Cartograms are unusual enough that they attract viewer attention, making them a popular mapping method with the media, particularly during election years. Their primary weakness (in addition to distorting geography so that no standard measurements such as distance among places are accurate), is that they cannot be interpreted correctly unless the map reader knows the actual geographic shapes of the map units so that sizes can be related to the places they represent.

The map below shows the results of the 2008 Presidential election, with a red state signifying a majority of votes for John McCain, the republican candidate, and blue states a majority for Barack Obama, the democratic candidate. This cartogram scales the areas of each shape to represent its respective total population, visually showing how the majority of the United States voted.

Figure 3.29: Cartogram of election results with red signifying a Republican majority state and blue a Democratic majority state.

Credit: Mark Newman at the University of Michigan, http://www-personal.umich.edu/~mejn/election/2008/ [23].

The following maps illustrate the power that some cartograms can have in helping users visually comprehend a phenomenon. While the map on the left depicts the majority vote results by county (with a vast majority of counties for the Republican candidate), the cartogram on the right shows the areas again depicted by population (this time with the country rather than state level data), revealing the larger number of Democratic support. The map on the left gives a distorted view (even though it does not look distorted) because a majority of counties won by the Republican candidate were low in population and many were large in area.

Election results by county.

Figure 3.30: Election results by county with red signifying a Republican majority and blue a Democratic majority (left) and cartogram skewing the counties by their respective populations (right).

Credit: Mark Newman at the University of Michigan, http://www-personal.umich.edu/~mejn/election/2008/ [23].

For more election cartogram examples, visit University of Michigan 2008 election site [23].

Try This: Practice Identifying Mapping Techniques

Visit the National Geographic Earthpulse map [24]. On the left hand side, you will find numerous check boxes for different thematic maps. Choose two thematic maps and identify at least two cartographic techniques (any that have been discussed in the chapter) the cartographer used when creating this map. For instance, in the map above (Figure 3.30), the cartographer used a qualitative color scheme (blue and red) on a choropleth map to show different categories (democratic or republication majority vote) for each U.S. state.

3.2.2.6 Numerical Data Classification

As discussed above (and in Chapter 1), all maps are abstractions. This means that they depict only selected information, but also that the information selected must be generalized due to the limits of display resolution, comparable limits of human visual acuity, and especially the limits imposed by the costs of collecting and processing detailed data. What we have not previously considered is that generalization is not only necessary, it is sometimes beneficial; it can make complex information understandable.

Consider a simple example. The graph below (Figure 3.31) shows the percent of people who prefer the term “pop” (not soda or coke) for each state. Categories along the x axis of the graph represent each of the 50 unique percentage values (two of the states had exactly the same rate). Categories along the y axis are the numbers of states associated with each rate. As you can see, it's difficult to discern a pattern in these data; it appears that there is no pattern.

Figure 3.31: Unique percentage values for people who use the term “pop” by state.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Data from http://www.popvssoda.com/ [25].

The following graph (Figure 3.32) shows exactly the same data set, only grouped into 10 classes with equal 10% ranges). It's much easier to discern patterns and outliers in the classified data than in the unclassified data. Notice that people in a large number of states (23) do not really prefer the term “pop” as they are distributed around 0 to 10 percent of users who favor that term. There are no states at the other extreme (91-100%), but a few states whose vast majority (81-90% of their population) prefer the term pop. Ignoring the many 0-10% states where pop is rarely used, the most common states are ones in which about 2/3 favor the term; looking back to Figure 3.13, these are primarily northern states, including Pennsylvania. All of these variations in the information are obscured in the unclassified data.

Figure 3.32: Classed percentages of people who use the term “pop” by state.

Credit: Jennifer M. Smith, Department of Geography, The Pennsylvania State University; Data from http://www.popvssoda.com/ [25].

As shown above, data classification is a generalization process that can make data easier to interpret. Classification into a small number of ranges, however, gives up some details in exchange for the clearer picture, and there are multiple choices of methods to classify data for mapping. If a classification scheme is chosen and applied skillfully, it can help reveal patterns and anomalies that otherwise might be obscured (as shown above). By the same token, a poorly-chosen classification scheme may hide meaningful patterns. The appearance of a thematic map, and sometimes conclusions drawn from it, may vary substantially depending on the data classification scheme used. Thus, it is important to understand the choices that might be made, whether you are creating a map or interpreting one created by someone else.

Many different systematic classification schemes have been developed. Some produce mathematically "optimal" classes for unique data sets, maximizing the difference between classes and minimizing differences within classes. Since optimizing schemes produce unique solutions, however, they are not the best choice when several maps need to be compared. For this, data classification schemes that treat every data set alike are preferred.

Figure 3.33: Portion of the ArcMap classification dialog box highlighting the schemes supported in ArcMap 8.2.

Credit: Department of Geography, The Pennsylvania State University.

Two commonly used classification schemes are quantiles and equal intervals. The following two graphs illustrate the differences.

Figure 3.34: County population change rates divided into five quantile categories.

Credit: Department of Geography, The Pennsylvania State University.

The graph above groups the Pennsylvania county population change data into five classes, each of which contains the same number of counties (in this case, approximately 20 percent of the total in each). The quantiles scheme accomplishes this by varying the width, or range, of each class. Quantile is a general label for any grouping of rank ordered data into an equal number of entities; quantiles with specific numbers of groups go by their own unique labels ("quartiles" and "quintiles," for example, are instances of quantile classifications that group data into four and five classes respectively). The figure below, then, is an example of quintiles.

Figure 3.35: County population change rates divided into five equal interval categories.

Credit: Department of Geography, The Pennsylvania State University.

In the second graph, the data range of each class is equivalent (8.5 percentage points). Consequently, the number of counties in each equal interval class varies.

The five quantile classes mapped on Pennsylvania.

Figure 3.36: The five quantile classes mapped.

Credit: Department of Geography, The Pennsylvania State University.

The five equal interval classes mapped on Pennsylvania.

Figure 3.37: The five equal interval classes mapped.

Credit: Department of Geography, The Pennsylvania State University.

As you can see, the effect of the two different classification schemes on the appearance of the two choropleth maps above is dramatic. The quantiles scheme is often preferred because it prevents the clumping of observations into a few categories shown in the equal intervals map. Conversely, the equal interval map reveals two outlier counties that are obscured in the quantiles map. Due to the potentially extreme differences in visual appearance, it is often useful to compare the maps produced by several different map classifications. Patterns that persist through changes in classification schemes are likely to be more conclusive evidence than patterns that shift. Patterns that show up with only one scheme may be important, but require special scrutiny (and an understanding of how the scheme works) to evaluate.

Practice Quiz

Registered Penn State students should return now to the Chapter 3 folder in Canvas to take a self-assessment quiz about Dot Mapping, Cartograms, and Data Classification.

You may take practice quizzes as many times as you wish. They are not scored and do not affect your grade in any way.

3.2.3 Thinking about aggregated data: Enumeration versus samples

Quantitative data of the kinds depicted by the maps detailed in the previous section come from a diverse array of sources. In the U.S., one of the most important sources is the U.S. Bureau of the Census (discussed briefly above). Here we focus in on one important distinction in data collected by the Census and by other organizations, a distinction between complete enumeration (counting every entity) and sampling.

Sixteen U.S. Marshals and 650 assistants conducted the first U.S. census in 1791. They counted some 3.9 million individuals, although as then-Secretary of State Thomas Jefferson reported to President George Washington, the official number understated the actual population by at least 2.5 percent (Roberts, 1994). By 1960, when the U.S. population had reached 179 million, it was no longer practical to have a census taker visit every household. The Census Bureau then began to distribute questionnaires by mail. Of the 116 million households to which questionnaires were sent in 2000, 72 percent responded by mail. A mostly-temporary staff of over 800,000 was needed to visit the remaining households, and to produce the final count of 281,421,906. Using statistically reliable estimates produced from exhaustive follow-up surveys, the Bureau's permanent staff determined that the final count was accurate to within 1.6 percent of the actual number (although the count was less accurate for young and minority residences than it was for older and white residents). It was the largest and most accurate census to that time. (Interestingly, Congress insists that the original enumeration or "head count" be used as the official population count, even though the estimate calculated from samples by Census Bureau statisticians is demonstrably more accurate.) As of this writing, some aspects of reporting from the decennial census of 2010 are still underway. Like 2000, the mail-in response rate was 72 percent. The official 2010 census count, by state, was delivered to the U.S. Congress on December 21, 2010 (10 days prior to the mandated deadline). The total count for the U.S. was 308,745,538, a 9.7% increase over 2000.

In the first census, in 1791, census takers asked relatively few questions. They wanted to know the numbers of free persons, slaves, and free males over age 16, as well as the sex and race of each individual. (You can view replicas of historical census survey forms at Ancestry.com [26]) As the U.S. population has grown, and as its economy and government have expanded, the amount and variety of data collected has expanded accordingly. In the 2000 census, all 116 million U.S. households were asked six population questions (names, telephone numbers, sex, age and date of birth, Hispanic origin, and race), and one housing question (whether the residence is owned or rented). In addition, a statistical sample of one in six households received a "long form" that asked 46 more questions, including detailed housing characteristics, expenses, citizenship, military service, health problems, employment status, place of work, commuting, and income. From the sampled data the Census Bureau produced estimated data on all these variables for the entire population.

In the parlance of the Census Bureau, data associated with questions asked of all households are called 100% data and data estimated from samples are called sample data. Both types of data are aggregated by various enumeration areas, including census block, block group, tract, place, county, and state (see the illustration below). Through 2000, the Census Bureau distributes the 100% data in a package called the "Summary File 1" (SF1) and the sample data as "Summary File 3" (SF3). In 2005, the Bureau launched a new project called American Community Survey that surveys a representative sample of households on an ongoing basis. Every month, one household out of every 480 in each county or equivalent area receives a survey similar to the old "long form." Annual or semi-annual estimates produced from American Community Survey samples replaced the SF3 data product in 2010.

To protect respondents' confidentiality, as well as to make the data most useful to legislators, the Census Bureau aggregates the data it collects from household surveys to several different types of geographic areas. SF1 data, for instance, are reported at the block or tract level. There were about 8.5 million census blocks in 2000. By definition, census blocks are bounded on all sides by streets, streams, or political boundaries. Census tracts are larger areas that have between 2,500 and 8,000 residents. When first delineated, tracts were relatively homogeneous with respect to population characteristics, economic status, and living conditions. A typical census tract consists of about five or six sub-areas called block groups. As the name implies, block groups are composed of several census blocks. American Community Survey estimates, like the SF3 data that preceded them, are reported at the block group level or higher. Figure 3.38 details the many geographic unit types that are used to organize data and how they relate. The unit types down the center of the diagram nest, with each higher type composed of some number of the lower type as outlined above for blocks, block groups, and census tracts.

Relationships among the various census geographies including the nation, states. traffic, and school etc.

Figure 3.38: Relationships among the various census geographies.

Credit: (U.S. Census Bureau, American FactFinder, 2005, http://factfinder.census.gov/jsp/saff/SAFFInfo.jsp?_pageId=gn7_maps [27]. An updated source for the diagram can be found at http://factfinder2.census.gov/faces/nav/jsf/pages/using_factfinder5.xhtml [28]).

Try This: Acquiring U.S. Census Data via the World Wide Web

The purpose of this practice activity is to guide you through the process of finding and acquiring 2000 census data from the U.S. Census Bureau data via the Web. Your objective is to look up the total population of each county in your home state (or an adopted state of the U.S.).

Go to the U.S. Census Bureau site at Census.gov [29].
At the Census Bureau home page, hover your mouse cursor over the Data tab and select American FactFinder. American FactFinder is the Census Bureau's primary medium for distributing census data to the public.
Click the SEARCH button, and take note of the three steps featured in the yellow rectangle. That’s what we are about in this exercise.
Click the Topics search option box. In the Select Topics overlay window, expand the People list. Next expand the Basic Count/Estimate list. Then choose Population Total. Note that a Population Total entry is placed in the Your Selections box in the upper left, and it disappears from the Basic Count/Estimate list.
Close the Select Topics window.
The list of datasets in the resulting Search Results window is for the entire United States. We want to narrow the search to county-level data for your home or adopted state.
Click the Geographies search options box. In the Select Geographies overlay window that opens, under Select a geographic type:, click County.
Next select the entry for your state from the Select a state list, and then from the Select one or more geographic areas.... list select All counties within your state> .
Last click ADD TO YOUR SELECTIONS. This will place your All Counties… choice in the Your Selections box.
Close the Select Geographies window.
The list of datasets in the Search Results window now pertains to the counties in your state. Take a few moments to review the datasets that are listed. Note that there are SF1, SF2, ACS (American Community Survey), etc., datasets, and that if you page through the list far enough you will see that data from past years is listed. We are going to focus our effort on the 2010 SF1 100% Data.
Given that our goal is to find the population of the counties in your home state, can you determine which dataset we should look at?
There is a TOTAL POPULATION entry, probably on page 2. Find it, and make certain you have located the 2010 SF1 100% Data dataset. (You can use the Narrow your search: slot above the dataset list to help narrow the search.)
Check the box for it and click View.
In the new Results window that opens, you should be able to find the population of the counties in your chosen state.
Note the row of Actions:, which includes Print and Download buttons.

I encourage you to experiment some with the American FactFinder site. Start slow, and just click the BACK TO SEARCH button, un-check the TOTAL POPULATION dataset and choose a different dataset to investigate. Registered students will need to answer a couple of quiz questions based on using this site.
Pay attention to what is in the Your Selections window. You can easily remove entries by clicking the red circle with the white X.

On the SEARCH page, with nothing in the Your Selections box, you might try typing “QT” or “GCT” in the Narrow your search: slot. QT stands for Quick Tables, which are preformatted tables that show several related themes for one or more geographic areas. GCT stands for Geographic Comparison Tables, which are the most convenient way to compare data collected for all the counties, places, or congressional districts in a state, or all the census tracts in a county.

3.2.4 Example Thematic Maps Produced at Penn State

Below you will find several thematic maps produced by graduate students or faculty in the Department of Geography at Penn State to provide an idea of the variety that exists. Thematic maps cover a virtually unlimited range of topics and goals since they can depict any “theme” that varies from place to place. Thus, the examples below and the ones to follow in the rest of the chapter provide just a hint of what is possible.

In the map below, size, or height of each column, is the key graphic variable used to represent the total number of international passenger arrivals at each airport in Canada and the United States. This is a very direct representation similar to thinking about piling up a stack of pennies, with one for every airline passenger.

International Passenger Flight Arrivals to Canada and the United States in 2007. New York and LA high

Figure 3.39: International Passenger Flight Arrivals to Canada and the United States in 2007.

Credit: Cartography by Paulo Raposo.

The Cost per gallon of Driving to or From East Lansing, Michigan.

Figure 3.40: The Cost of Driving to or From East Lansing, Michigan.

Credit: Cartography by Joshua Stevens.

Figure 3.41: A Temporal Comparison of Vehicle Emission in Los Angeles on December 1, 2011, created by Joshua Stevens. The emissions values were calculated based on current traffic speeds, the number of vehicles that had passed over a sensor within a 30-second time frame, and EPA estimates for the amount of CO2 produced by traveling vehicles. Since there are thousands of these sensors collecting data over multiple lanes and directions, the maps would be very difficult to interpret without aggregating the data. To do this, the measurements of nearby sensors were combined and averaged. Sensors were considered 'nearby' if they fell within a 1 km-wide hexagon (which is about how far a vehicle would travel based on the speeds and times used for the EPA calculations). Hexagons are a particularly convenient shape for spatial aggregation since they tessellate perfectly without gaps, allowing full spatial coverage of an area. Gaps or missing hexagons in the maps above reflect the irregular placement of the sensors and areas were no sensors were present.

Credit: Cartography by Joshua Stevens.

Global Humna-Poultry Population Co-Density. High in China and Europe

Figure 3.42: Areas of Population Co-Density for Humans and Poultry to Determine Potential Areas for Bird-Human Disease Transfer Such as Avian Flu.

Credit: Cartography by Paulo Raposo.

Figure 3.43: Map of average monthly frequency of jet contrail outbreaks over the United States during April 2000, 2001, and 2002. The brighter orange and red colors highlight areas that experience more outbreaks, specifically the Midwest in this instance.

Credit: Jase Bernhardt and Andrew Carleton.

Figure 3.44: Maps Visualizing the Travel Time to Reach a Health Facility in Niger for the Dry Season and Wet Season, created by Justine Blanford. The reason that so many fewer people are within 1-2 hours of a health facility in the wet season is that most of the roads are not paved, thus movement is much harder then.

Credit: Justine Blanford.

Try This: The Purpose of Thematic Mapping

Find a thematic map online and identify both the theme and purpose of the map.

Practice Quiz

Registered Penn State students should return now to the Chapter 3 folder in Canvas to take a self-assessment quiz about Aggregated Data.

You may take practice quizzes as many times as you wish. They are not scored and do not affect your grade in any way.

3.3 Summary

This chapter has introduced a set of key concepts that underlie how maps represent data, specifically thematic maps. Mapping as a method to represent data has been related to cartography as a professional field focused on developing the science that supports effective mapping and the practice and technology of generating maps. Emphasis has been on the many different map types, how they are created, and what types of data are best suited for each map type. The core building blocks used to create map symbolization have also been introduced.

To summarize this chapter, maps are abstractions of the real world created through systematic selection, classification, simplification, exaggeration, and symbolization. They are products of the cartographic process, a cyclic process involving data as inputs, the final map as an output, as well as the map maker and map user as the creator and consumer of the map. Thematic maps help reveal geographic patterns that are hard or impossible to find in lists of numbers that data are typically presented in. Graphic variables and color schemes form the building blocks for thematic map construction. Furthermore, different types of thematic maps are used to represent different types of data, both categorical and numerical. Count data, for instance, are conventionally portrayed with symbols that are distinct from the statistical areas they represent, because counts are independent of the sizes of those areas. Rates and densities, on the other hand, are often portrayed as choropleth maps, in which the statistical areas themselves serve as symbols whose color lightness is varied with the magnitude of the attribute data they represent. Attribute data shown on choropleth maps are usually classified. Classification schemes that facilitate comparison of map series, such as the quantiles and equal intervals schemes demonstrated in this lesson, are the most common.

Try This: The Good, The Bad, and The Ugly

Surf the Internet and find a good map, a bad map, and an ugly map. What attributes are the important ones for assigning each map to the category you put it in?

3.4 Glossary

Aggregation: The process of combining multiple features into one.

Average: A measure of central tendency, specifically the mean value calculated as a total amount divided by the number of entities producing the amount.

Cartographic Process: A cyclic process linking data from the environment as inputs, the final map as an output, as well as the map maker and map user as the creator and consumer of the map.

Cartography: The academic and professional field focused on mapping.

Choropleth Map: A map that depicts quantities aggregated to their regions (often called “enumeration units”) by filling the entire region with a shade or color.

Count: Whole numbers that represent the individual data such as people or housing units.

Delete: Systematically removing data to better serve the purpose of the map such as map legibility.

Density: A count divided by the area of the geographic unit to which the count was aggregated.

Dot Map: Maps that depict magnitude by frequency rather than size of symbol and add the depiction of geographic distribution by use of the graphic variable of location. Specifically, dot maps assign one to many dots per enumeration area to represent a specific count in each area.

Enumeration Areas: Areas or regions in which quantitative data is aggregated to (e.g., census tracts, counties, states, etc.).

Equal Interval: A data classification scheme that divides the data into equal sections (intervals).

Graphic Variables: Primitives in which map symbols are constructed. The core graphic variables include location, size, shape, orientation, texture, and three components of color – color hue (red, green, blue, etc), color lightness (how light or dark the color is), color saturation (how pure the color hue is).

Map Abstraction: The process of representing the real world in simplified form in order to generate a more legible map. It includes at least five major (interdependent) steps: (a) selection, (b) classification, (c) simplification, (d) exaggeration, and (e) symbolization.

Percent: The proportion of a total ranging from 0-100%.

Proportional Symbols: Symbols in which the graphic variable of size is used to depict data magnitude. There are two types of point features typically depicted: features where data represents a geographic position directly and features that are geographic areas to which data are aggregated and the data magnitudes are assigned to a representative point within the area.

Quantile: A general label for any grouping of rank ordered data into an equal number of entities; quantiles with specific numbers of groups go by their own unique labels ("quartiles" and "quintiles," for example, are instances of quantile classifications that group data into four and five classes respectively).

Rate: A quantity that tells us how frequently something occurs, where a value is compared to a standard value.

Reference Map: A map with a main purpose to act as a reference. The prototypical reference map depicts the location of “things” that are usually visible in the world.

Smoothing: The act of eliminating unnecessary elements in the geometry of features, such as the superfluous details of a nation’s shoreline that can only be seen at a larger, zoomed in regional scale.

Thematic Map: A map typically depicting “themes,” generally more abstract, involving more processing and interpretation of data and often representing concepts that are not directly visible; examples include maps of income, health, climate, or ecological diversity.

Typification: A depiction of the most typical components of the mapped feature.

3.5 Bibliography

Muehrcke, P. and Muehrcke, J.O. 1992: Map Use: Reading, Analysis, and Interpretation. 3^rd edition. Madison, WI: JP Publications.

Slocum, T., McMaster, R., Kessler, F. and Howard, H.H. 2009: Thematic Cartography and Visualization. Upper Saddle River, NJ: Prentice Hall.

Brewer, C. & Suchan, T., (2001). Mapping census 2000: The geography of U. S. diversity. U. S. Census Bureau, Census Special Reports, Series CENSR/01-1. Washington, D. C.: U.S. Government Printing Office.

Chrisman, N. (2002). Exploring geographic information systems. (2nd ed.). New York: John Wiley & Sons, Inc.

Monmonier, M. (1995). Drawing the line: Tales of maps and cartocontroversy. New York: Henry Holt and Company.

Roberts, S. (1994). Who we are: A portrait of America based on the latest U.S. census. New York: Times Books.

U.S. Census Bureau (n. d.). Retrieved July 19, 1999, from http://www.census.gov [30]

U.S. Census Bureau (1996). Federal expenditures by state for fiscal year 1995. Retrieved May 9, 2006, from www.census.gov/prod/2/gov/fes95rv.pdf [22]

U.S. Census Bureau (2005). American FactFinder Retrieved July, 19, 1999, from http://factfinder.census.gov [31]

U.S. Census Bureau (n. d.). American FactFinder Retrieved August 2, 2012, from http://factfinder2.census.gov/faces/nav/jsf/pages/using_factfinder5.xhtml [28]

U.S. Census Bureau (2008). A Compass for understanding and using American Community Survey data: What general users need to know. U.S. Government Printing Office, Washington DC, 2008.