GEOG 482
The Nature of Geographic Information

4. Enumerations versus Samples


Sixteen U.S. Marshals and 650 assistants conducted the first U.S. census in 1791. They counted some 3.9 million individuals, although, as then-Secretary of State Thomas Jefferson reported to President George Washington, the official number understated the actual population by at least 2.5 percent (Roberts, 1994). By 1960, when the U.S. population had reached 179 million, it was no longer practical to have a census taker visit every household. The Census Bureau then began to distribute questionnaires by mail. Of the 116 million households to which questionnaires were sent in 2000, 72 percent responded by mail. A mostly temporary staff of over 800,000 was needed to visit the remaining households and to produce the final count of 281,421,906. Using statistically reliable estimates produced from exhaustive follow-up surveys, the Bureau's permanent staff determined that the final count was accurate to within 1.6 percent of the actual number (although the count was less accurate for young and minority residents than it was for older and white residents). It was the largest and most accurate census to that time. (Interestingly, Congress insists that the original enumeration or "head count" be used as the official population count, even though the estimate calculated from samples by Census Bureau statisticians is demonstrably more accurate.)

The mail-in response rate for the 2010 census was also 72 percent. As with most 20th-century censuses, the official 2010 census count, by state, had to be delivered to the Office of the President by December 31 of the census year. Then, within one week of the opening of the next session of Congress, the President reported to the House of Representatives the apportionment population counts and the number of Representatives to which each state was entitled.

In 1791, census takers asked relatively few questions. They wanted to know the numbers of free persons, slaves, and free males over age 16, as well as the sex and race of each individual. (You can view replicas of historical census survey forms here.) As the U.S. population has grown, and as its economy and government have expanded, the amount and variety of data collected have expanded accordingly. In the 2000 census, all 116 million U.S. households were asked six population questions (names, telephone numbers, sex, age and date of birth, Hispanic origin, and race), and one housing question (whether the residence is owned or rented). In addition, a statistical sample of one in six households received a "long form" that asked 46 more questions, including detailed housing characteristics, expenses, citizenship, military service, health problems, employment status, place of work, commuting, and income. From the sampled data, the Census Bureau produced estimated data on all these variables for the entire population.
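To make the idea of estimating from a one-in-six sample concrete, here is a minimal sketch in Python. It assumes a simple expansion estimator (each sampled household stands in for six) applied to a made-up population; the Census Bureau's actual weighting procedures are more elaborate, and every name and number in the code is hypothetical.

```python
import random


def expansion_estimate(sample_values, sampling_interval):
    """Estimate a population total by letting each sampled value
    represent `sampling_interval` households."""
    return sampling_interval * sum(sample_values)


# Hypothetical population of 60,000 households:
# 1 = someone in the household commutes by car, 0 = nobody does.
rng = random.Random(482)
households = [1 if rng.random() < 0.7 else 0 for _ in range(60_000)]

# Systematic 1-in-6 sample: random start, then every sixth household.
interval = 6
start = rng.randrange(interval)
long_form_sample = households[start::interval]

true_total = sum(households)
estimated_total = expansion_estimate(long_form_sample, interval)
relative_error = abs(estimated_total - true_total) / true_total

print(f"sampled households: {len(long_form_sample)}")
print(f"true total: {true_total}, estimate: {estimated_total}")
print(f"relative error: {relative_error:.2%}")
```

The estimate will rarely match the true total exactly, which is why sample data are reported as estimates rather than counts.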

In the parlance of the Census Bureau, data associated with questions asked of all households are called 100% data and data estimated from samples are called sample data. Both types of data are available aggregated by various enumeration areas, including census block, block group, tract, place, county, and state (see the illustration below). Through 2000, the Census Bureau distributed the 100% data in a package called "Summary File 1" (SF1) and the sample data as "Summary File 3" (SF3). In 2005, the Bureau launched a new project called the American Community Survey that surveys a representative sample of households on an ongoing basis. Every month, one household out of every 480 in each county or equivalent area receives a survey similar to the old "long form." Annual or semi-annual estimates produced from American Community Survey samples replaced the SF3 data product in 2010.
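The monthly rate of one household in 480 can be compared with the old one-in-six long form using a little arithmetic. The sketch below assumes that no household is surveyed more than once during the period, so the monthly rates simply add up; that simplification and the function name are assumptions made for illustration.

```python
from fractions import Fraction

MONTHLY_RATE = Fraction(1, 480)   # from the text: one household in 480 per month
LONG_FORM_RATE = Fraction(1, 6)   # from the text: 2000 census long form


def cumulative_rate(months):
    """Share of households surveyed after `months`,
    assuming no household is sampled twice."""
    return MONTHLY_RATE * months


for label, months in [("1 year", 12), ("5 years", 60)]:
    rate = cumulative_rate(months)
    print(f"{label}: 1 in {1 / rate} households ({float(rate):.1%})")

print(f"long form: 1 in {1 / LONG_FORM_RATE} households ({float(LONG_FORM_RATE):.1%})")
```

Under that assumption the survey reaches about 1 household in 40 each year and 1 in 8 over five years, which is still a smaller share than the long form's 1 in 6.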

To protect respondents' confidentiality, as well as to make the data most useful to legislators, the Census Bureau aggregates the data it collects from household surveys to several different types of geographic areas. SF1 data, for instance, are reported at the block or tract level. There were about 8.5 million census blocks in 2000. By definition, census blocks are bounded on all sides by streets, streams, or political boundaries. Census tracts are larger areas that have between 2,500 and 8,000 residents. When first delineated, tracts were relatively homogeneous with respect to population characteristics, economic status, and living conditions. A typical census tract consists of about five or six sub-areas called block groups. As the name implies, block groups are composed of several census blocks. American Community Survey estimates, like the SF3 data that preceded them, are reported at the block group level or higher.

Figure 3.4.1 Relationships among the various census geographies.
U.S. Census Bureau, American FactFinder, 2005, http://factfinder.census.gov/
An updated source for the diagram can be found at https://www.census.gov/geo/reference/hierarchy.html.
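Because blocks nest within block groups, block groups within tracts, and tracts within counties, counts for the larger areas can be built by summing the counts of the blocks they contain. The sketch below illustrates that roll-up in Python. It assumes the common 15-digit block identifier layout (2 digits for state, 3 for county, 6 for tract, 4 for block, with the block group given by the first digit of the block number), which is not described in this chapter; the identifiers and populations are made up.

```python
from collections import defaultdict

# Number of leading digits of a block identifier that name each enclosing
# area (assumed layout: 2 state + 3 county + 6 tract + 4 block; the block
# group is the first digit of the block number).
PREFIX_LENGTH = {"state": 2, "county": 5, "tract": 11, "block group": 12}


def aggregate(block_counts, level):
    """Sum block-level counts up to the requested enumeration area."""
    totals = defaultdict(int)
    for block_id, population in block_counts.items():
        totals[block_id[:PREFIX_LENGTH[level]]] += population
    return dict(totals)


# Hypothetical block identifiers and populations, for illustration only.
blocks = {
    "990010001001000": 40,  # tract 000100, block group 1
    "990010001001001": 25,  # tract 000100, block group 1
    "990010001002000": 60,  # tract 000100, block group 2
    "990010002001000": 35,  # tract 000200, block group 1
}

for level in ("block group", "tract", "county"):
    print(level, aggregate(blocks, level))
```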

Try This!

Acquiring U.S. Census Data via the World Wide Web

The purpose of this practice activity is to guide you through the process of finding and acquiring 2010 census data from the U.S. Census Bureau via the Web. Your objective is to look up the total population of each county in your home state (or an adopted state of the U.S.).

  1. Go to the U.S. Census Bureau site at http://www.census.gov.
  2. At the Census Bureau home page, click on the Data (Tools, Developers) tab, then expand the Data Tools and Apps pick list and select American FactFinder. American FactFinder is the Census Bureau's primary medium for distributing census data to the public.
    On the American FactFinder home page, click on the American FactFinder link, the top item in the bulleted list.
  3. Expand the ADVANCED SEARCH list, and click on the SHOW ME ALL button. Take note of the three numbered steps featured on the page you are taken to. That’s what we are about to do in this exercise.
  4. Click the blue Topics search option box. In the Select Topics overlay window expand the People list. Next expand the Basic Count/Estimate list. Then choose Population Total. Note that a Population Total entry is placed in the Your Selections box in the upper left, and it disappears from the Basic Count/Estimate list.
    Close the Select Topics window.

    The list of datasets in the resulting Search Results window is for the entire United States. We want to narrow the search to county-level data for your home or adopted state.
     
  5. Click the blue Geographies search options box. In the Select Geographies overlay window that opens, make sure the List tab is selected. Under Select a geographic type:, click County - 050.
    Next, select the entry for your state from the Select a state list, and then, from the Select one or more geographic areas.... list, select All counties within <your state>.
    Last, click ADD TO YOUR SELECTIONS. This will place your All Counties… choice in the Your Selections box.
    Close the Select Geographies window.
  6. The list of datasets in the Search Results window now pertains to the counties in your state. Take a few moments to review the datasets that are listed. Note from the Dataset column that there are SF1, SF2, ACS (American Community Survey), etc., datasets, and that if you page through the list far enough you will see that data from past years is listed. We are going to focus our effort on the 2010 SF1 100% Data.
  7. Given that our goal is to find the population of the counties in your home state, can you determine which dataset we should look at?
    There is a TOTAL POPULATION entry for 2010. Find it, and make certain you have located the 2010 SF1 100% Data dataset. (You can use the Refine your search results: slot above the dataset list to help narrow the search.)
    Once you find the TOTAL POPULATION / 2010 SF1 100% Data entry, check the box for it, and then click View.
    In the new results window that opens, you should be able to find the population of the counties of your chosen state.
    Note the row of Actions:, which includes Print and Download buttons.
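The same county-level lookup can also be scripted. The sketch below is not part of the American FactFinder exercise. It assumes the Census Bureau's web data API offers a 2010 SF1 endpoint at the URL shown, that `P001001` is the total population variable, and that the response is a JSON table whose first row is a header. Treat all three as assumptions to check against the Bureau's API documentation. The sample rows are made-up placeholders so the parsing step can be tried offline; 42 is the FIPS code for Pennsylvania.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and variable name; verify before relying on them.
BASE_URL = "https://api.census.gov/data/2010/dec/sf1"
TOTAL_POPULATION = "P001001"


def county_population_url(state_fips):
    """Build a query for the total population of every county in a state."""
    params = {
        "get": f"NAME,{TOTAL_POPULATION}",
        "for": "county:*",
        "in": f"state:{state_fips}",
    }
    return f"{BASE_URL}?{urlencode(params, safe=':*,')}"


def parse_counts(rows):
    """Turn a header-plus-records table into {county name: population}."""
    header, *records = rows
    name_col = header.index("NAME")
    pop_col = header.index(TOTAL_POPULATION)
    return {record[name_col]: int(record[pop_col]) for record in records}


def fetch_county_populations(state_fips):
    """Request the table over the network (not called in this sketch)."""
    with urlopen(county_population_url(state_fips)) as response:
        return parse_counts(json.load(response))


# Made-up rows in the assumed response shape, for offline illustration only.
sample_rows = [
    ["NAME", "P001001", "state", "county"],
    ["Example County A", "1200", "99", "001"],
    ["Example County B", "3400", "99", "003"],
]

print(county_population_url("42"))
print(parse_counts(sample_rows))
```

Calling `fetch_county_populations("42")` would attempt the live request; whether it succeeds depends on the assumptions above and on network access.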
     

I encourage you to experiment with the American FactFinder site. Start slowly: click the BACK TO ADVANCED SEARCH button, un-check the TOTAL POPULATION dataset, and choose a different dataset to investigate. Registered students will need to answer a couple of quiz questions based on using this site.
Pay attention to what is in the Your Selections window. You can easily remove entries by clicking the red circle with the white X.

On the SEARCH page, with nothing in the Your Selections box, you might try typing “QT” or “GCT” in the step 1 topic or table name: slot. QT stands for Quick Tables, which are pre-made tables that show several related themes for one or more geographic areas. GCT stands for Geographic Comparison Tables, which are the most convenient way to compare data collected for all the counties, places, or congressional districts in a state, or all the census tracts in a county.

Students who register for this Penn State course gain access to assignments and instructor feedback, and earn academic credit. Information about Penn State's Online Geospatial Education programs is available at the Geospatial Education Program Office.