GEOG 486
Cartography and Visualization

Part I: An Elevation Surface - data preparation

PrintPrint

Digital data files containing values that represent elevation at regularly-spaced intervals are known as Digital Elevation Models, or DEMs. These digital files are typically in very specific formats that require specialized processing before they can be put to use in a GIS software program. Fortunately, in a robust GIS software program, this preprocessing can be performed using commands available in the GIS software.

The GTOPO30 data we are going to use in this lesson are DEM data that I have already preprocessed using ArcInfo.

Go to https://lta.cr.usgs.gov/GTOPO30 and peruse the information that is there. Note that the data are apportioned into tiles. Referring to the map at tiles link, we will be using data from the tile that is second from the west, in the northern-most row (the w140n90 tile). Follow the README link at the bottom of that page to get at a wealth of important information regarding this data, especially the projection information that is in section 3.5.

A. Download Lesson Data

Lesson6.zip (5.8 Mb) - For a review of the download/extraction steps, see Lesson 2, part 1.

The Lesson6 data folder contains four files: The file with the .csv extension is a type of text file. There is a personal geodatabase file named oregon_lesson6.mdb. There are two interchange files, ending in the .e00 extension.

Do not perform these steps - they are for your reference only.

note These are the steps I went through to retrieve the GTOPO30 data and make it usable. Those of you who have access to ArcInfo (which we now supply through the education version of the software) can give this a try when you have the opportunity.

  • Download a GTOPO30 tile. In its compressed format it is still approximately 16 Mb!
  • The file that comes down containing the data we need for this lesson is named w140n90.tar.gz. The file extensions are indicative of the data having been archived. A .tar file is a single file that actually contains a collection of files that are compressed (.gz stands for gzip [GNU zip]) on a computer using the GNU, UNIX or LINUX operating systems).
  • After checking to see if you need to modify your WinZip settings, as per instructions on the GTOPO30 site, you can use WinZip to unpack and decompress the data files. The downloaded information is contained in 8 files. The file with the .DEM extension contains the elevation values (uncompressed it is approximately 56 Mb), and the .HDR file is critical to being able to interpret the data.
    Screen capture to show the files for a tile of data from the GTOPO30 website.
    Figure 6.1.1 The files that pertain to a single tile of data from the GTOPO30 website.

    Screen captures to show the header file that goes with the W140N90.DEM file.
    Figure 6.1.2 The header file that goes with the W140N90.DEM file.
  • It is imperative, when using GIS software, that (1) the file containing the elevation values and the file containing the header information have the same root name, and that (2) they be in the same folder/directory. (The .PRJ comes in handy because we can look to it for the coordinate system information, but it plays no direct part in the processing of the data.)
  • The GTOPO30 data is stored in binary format. Our goal is to convert the binary .DEM file into a raster dataset that ArcGIS can work with.The raster dataset that we will end up with is known as a Grid. Grids are raster datasets in a format specific to Esri software. To convert the .DEM file into an Esri Grid we have to fool the GIS software into thinking that the data in the .DEM file is what is known as a Binary Interleaved by Line file. Files containing this type of data end in a .bil extension.
  • In Windows Explorer, change the .DEM extension to .bil.
  • In ArcToolbox (ArcInfo level licensing) use the Image to Grid tool to convert the .bil file to an Esri Grid. This is one of the functions that is not available in ArcView.
  • I performed an additional step in ArcInfo. For the sake of the lesson you do not need all of the elevation data contained in the GTOPO30 tile that I downloaded. So, I clipped out a portion of the Esri Grid that I created.
    A screen capture to show the clipped (red) portion of the original GTOPO30 Figure.
    Figure 6.1.3 The black and white areas show the extent of the original dataset downloaded from the GTOPO30 website. The red area shows the extent that was clipped out and supplied to you for this lesson.

    The Clip tool is in the Projections and Transformations > Raster toolbox.

B. Import a Grid dataset

In this section you will work with the gtopo30_grd.e00 file that you downloaded as part of the lesson data. This is the raster grid that I clipped from W140N90 grid (via the steps described above).

The gtopo30_grd.e00 file is a file that is in the Interchange File format. This .e00 file contains a raster dataset that is in a format specific to ArcGIS. The dataset is simply referred to as a Grid. In your travels through the ArcGIS software you will see it referred to as an Esri Grid.

A grid dataset is composed of two folders, just like a coverage dataset. After you import the contents of the .e00 file you will see, via Windows Explorer, an info folder and a folder named after the grid dataset.

Let's go ahead and import the grid.

  1. In ArcCatalog, open the ArcToolbox window and open Conversion Tools > To Coverage, then select and open Import from E00
  2. The Input file: is the gtopo30_grd.e00 you just extracted.
  3. Direct the Output dataset: to your Lesson6 folder, and name the coverage you are importing llgtopo30_grd. (The ll is for lon-lat.)

(Another quick aside: you are allowed only 13 characters in grid dataset names. And you will also find out here if there are any spaces inthe full path name of the .e00 file. The GIS does not like spaces anywhere in file path names.)

C. Let's take a look at the properties of the grid

  1. In ArcCatalog, right-click llgtopo30_grd and select Properties...
  2. Scroll down to the Spatial Reference section.
    The coordinate system of the dataset is undefined. But the information that we found on the website enables us to define the coordinate system information.
  3. Click the Edit... button and proceed to supply the coordinate system definition.
    Recall that the coordinates of the data are stored as decimal degrees (DD), in terms of the WGS84 spheroid. This implies that the projection is Geographic. You will be able to find WGS84 (or WGS 1984) in the World list.
    Before we shift your attention from the Spatial Reference properties, take note of the Extent values. How many degrees of longitude and latitude are each of the X and Y extents? ___________
  4. Now, scroll up to the Raster Information properties section and take note of the number of Columns and Rows, and the Cellsize. Then scroll down in order to see all of the Statistics properties, and take note of the Maximum (elevation) value.
  5. Let's look closely at some of the information I asked you to take note of. The values are listed below. Does it make sense that the maximum elevation value is 55,537?
    • You should see that the number of rows and columns are both 1200
    • The cellsize in both the X and Y directions is roughly 0.008333
    • The Maximum value (elevation) is 55537

You may have read in the information at the GTOPO30 website that the resolution of the data is 30-arc seconds. Based on that resolution information, and the width, in degrees, of the XY extent you noted above, you should be able to justify the number of rows and columns of data, and the cellsize dimension. (Hint: There are 60 seconds in a minute, and 60 minutes in a degree.)

Now, what about the maximum elevation value of 55537?
The elevation values are in meters. Do you know of any mountain peaks{C}that are 55,537 meters high?! That equates to over 182,000 feet! The highest peaks in the Rocky Mountains are in the range of 14-16,000 feet. So the maximum value, in meters, should be in the range of 4000 to 5000. Something is definitely amiss here.

Let's open ArcMap, take a look at the dataset, and then deal with these out-of-sight elevation values!

D. The symbolization of a continuous surface

  1. Open a new empty ArcMap document, and save it to your Lesson6 folder as gtopo30_lonlat.mxd.
  2. Add the llgtopo30_grd.
  3. Go ahead and build the pyramids when prompted.
    In the table of contents legend for the layer, you see verification of the data ranging to a maximum of 55537.
    The default symbolization of the data values in the grid dataset depicts the data in shades of gray. The highest data value is shown as white and the lowest as black, with the other values being represented by intermediate shades of gray.
    Depending upon you computer monitor, you may see some faint graytone differences in the image in your map display area. Chances are that it just looks completely black.

    A screen capture to show the black grid dataset in your map display area when first viewed.
    Figure 6.1.4 The gtopo30ll_grd grid dataset when first viewed.
    Let's get a better idea of the nature of the data.
  4. Right-click on the name of the grid in the table of contents pane and bring up the Properties.
  5. Click the Symbology tab.
  6. In the Show: pane, select Unique Values.
  7. Using the slider on the right, scroll to the bottom of the list of elevation values, labels and counts (the number of times each unique elevation value occurs). As you do so, keep your eye on the Value and Count columns.
    Yikes! Look at how many 55537 values there are. It turns out these correspond to any part of the grid that is below sea level, so there are going to be a lot of them.
    In case you did not notice, when you first went to the Symbology tab, the default display mode was Stretched, with a Type of Standard Deviation selected. After reading the following note you should have a better idea of why the land area was virtually all black.

Image/Grid value stretching

To understand why the rendering of the llgtopo30_grd is virtually all black we need to understand the concept of stretching of the image or grid values. The following discussion elaborates on this topic.

Stretching the values of a grid is commonly done to continuous, surface, or ordinal data. The result is a range of transformed values from 0 to 255. This is useful when displaying an elevation model as continuous shades of gray, or a cost surface as continuous shades of red through green.

There are two types of stretches available: linear and equal area.

Linear stretch
A linear stretch rescales the cell values from their original range to a range from 0 to 255 (which is the maximum number of colors that can be simultaneously displayed on an 8-bit device). An equal-area stretch also rescales the values to a range from 0 to 255, but distributes the values in such a way that an equal percentage of the grid is assigned to each of the 256 possible values. For example, a grid with cell values from 5 to 140 can be represented by the following histogram that shows the distribution of the cell values.

A graph shown to illustrate that cell values are indistinguishable because the original data range is often so small.
Figure 6.1.5 The original data range is often so small that when displayed, cell values are indistinguishable.

If this grid is displayed without any stretch, the resulting display would contain little contrast between different cell values because most of the data is concentrated within a narrow range of values. A linear stretch expands the cell values to a range of values from 0 to 255 to make use of the total range or sensitivity of the output device. Stretching increases the numerical distance between cell values that were initially in close proximity.

Contact the instructor if you have difficulty viewing this image
Figure 6.1.6 Using a linear stretch to expand the cell values makes individual cell values more distinguishable when displayed.

The stretching parameters for the linear stretch are calculated from the mean cell value and the standard deviation. All cell values below the mean minus two standard deviations are assigned to 0, and all cell values above the mean plus two standard deviations are assigned to 255. The cell values that fall within these two parameters are stretched along the range from 0 to 255. To illustrate the stretching process, consider a grid that contains cell values in the range from 5 to 140, with a mean of 62 and a standard deviation of 16. A linear stretch would assign a cell value of 0 to all input cell values below 30, and 255 to all input cell values above 94. All cell values between 30 and 94 would be distributed proportionally throughout the range from 0 to 255.

In the Symbology dialog window for a grid layer, the Linear Stretch is referred to as Standard Deviation stretch.

Equal-area stretch
Like a linear stretch, an equal-area stretch spreads the cell values along the range of values from 0 to 255, but the data is also redistributed so that an equal number of cells is assigned to each of the 256 possible output values.

A graph shown to illustrate the effects of equal-area stretch.
Figure 6.1.7 Equal-area stretch changes both the distribution of cell values and the number of cells assigned to each value.

The parameters for the equal-area stretch are based upon the full range and distribution of data values, and cannot be used with floating-point data.

In the Symbology dialog window for a grid layer, the Equal-area Stretch is referred to as Histogram Equalize stretch.

E. Map Algebra

The GTOPO30 dataset only contains elevation values for land areas. As the information on the website and in the header file states, the values of the rectangular raster datasets that correspond to ocean area have been encoded with values of -9999. Recall that these are raster datasets, and regardless of whether or not there are valid elevation values for each column, row member of the dataset, there will be some value associated with those locations. A -9999 value is commonly used to represent grid cell values that have no known or valid value. Unfortunately, when the -9999 is encoded as binary, ArcGIS is unable to interpret the negative sign. When the data is converted to the Esri Grid format, the binary numbers in the data file that represent the -9999 get misinterpreted as base-10 numbers equal to 55537.

In this section we will tap into the power of what is known as Map Algebra to convert the values of 55537 to what they were intended to be: null, or NODATA values.

The concept of NODATA:
It is not unusual in a grid or raster dataset to have cells that, for various reasons, we cannot assign values to. Usually it is a case of not knowing what a valid value for the cell should be. Since a value of 0 is often a valid value in the context of what ever the dataset is, it should not be assigned to such cells. To accommodate this situation, a value of NODATA is assigned to cells where we do not know what the value should be, or when we want to exclude those cells from grid-based analysis (an overlay analysis, for example). NODATA values are often referred to as null values.

Map Algebra:
Map algebra is a language specifically designed for geographic cell-based systems. The language orders the user's thoughts and provides the rules and syntax needed for the user to communicate with the computer and other users.

"Map algebra is a high-level computational language for describing cartographic modeling. This language establishes a set of conventions for data-processing control. The conventions describe how operations are specified, the data to operate on, and the order in which the operations should be processed. "

"The language has been designed for ease of use and understanding. It resembles a cross between algebra and normal prose. While the prose influence allows for intelligible statements, the algebra maintains the power of the mathematical base underlying the cell-based structure."

(I quoted the above statements from an older version of the Esri Help document. I enjoy the contrast they provide to the too often staid phrasing of documentation.)

The general form of a Map Algebra expression is: output grid = function (expression)

A new output grid is created by the operation that takes place on the right side of the equals sign, and the expression always includes an input grid.

Let's see how this is done:

  1. Cancel out of the Layer Properties window if you have not already done so.
  2. Add the Spatial Analyst tool bar to your ArcMap session.
    The llgtopo30_grd should show in the Layer window.
    If all but the Spatial Analyst dropdown menu is grayed out, you will need to:
    • Click Customize > Extensions...
    • Check the box next to Spatial Analyst, and Close.
  3. Open ArcToolbox > Spatial Analyst Tools > Map Algebra > Raster Calculator...
    The Tool Help button is the gateway to a lot of information regarding map algebra and raster dataset manipulation.
  4. In the text box in the lower portion of the Raster Calculator window, create the following map algebra expression.
    Do not type it exactly as you see it below. Instead, follow the bulleted instructions.
    SetNull("llgtopo30_grd" == 55537,"llgtopo30_grd")
    • double-click SetNull in the Conditional list of commands on the right side of the dialog box.
    • with the cursor after the first parenthesis, before the comma, double-click the llgtopo30_grd name in the Layers and variables list. It will show up in the expression with quotes.
    • click the == sign button. A double-equal sign should appear in the expression. This is what is known as a relational operator. Nothing is being set equal to anything with it. Rather, the expression X = = 5, for example, is read as, "Does X equal 5?"
    • type the 55537
    • move the cursor to the other side of the comma and again, double-click the llgtopo30_grd name in the Layers and variables list
  5. Click the folder icon next to the Output raster window and name the new raster ndgtopo30_grd and place it in your Lesson 6 directory. (The nd in the name of the output grid is for nodata).

    Compare what you see in Figure 6.1.8, below, to the expression you have entered.

    A screen capture to show the Raster Calculator window, complete with map algebra expression.
    Figure 6.1.8 The map algebra expression for implementing the SETNULL function.

    The 13-character limit is still in force for the names of new grids that you create in this way.

  6. Hit the OK button, and it will create the new grid, as long as the expression you created is valid. If you made a mistake in creating the expression, you will get a message window that says, "Failed to evaluate the calculator equation."
    By default the new image is added to your ArcMap table of contents, and is symbolized according to the Stretched method. Open the properties window for the new ndgtopo30_grd layer and experiment with changing the symbolization method.
  7. The SETNULL function
    What did you just do? You created a new grid with exactly the same number of cells, and in the same row-column configuration as the input grid specified in the expression in front of the comma within the parentheses. In the process, by virtue of the Setnull function, you replaced all of the values of 55537 with NODATA, or null values. A breakdown of the map algebra expression shows how this happened. First of all, understand that the Setnull function in one of a class of functions that works on the cells in a grided dataset one cell at a time. According to the expression in front of the comma, each cell of the specified input grid (llgtopo30_grd) is evaluated (the double = sign) to see whether or not it is equal to 55537. If the result of the evaluation is true, then a NODATA value is assigned to that grid cell position in the output grid. If, however, the result of evaluating the expression is false (the cell value is not equal to 55537), then the expression following the comma is executed. In this case, the expression following the comma is simply the name of the llgtopo30_grd grid. So, the value of the cell in llgtopo30_grd gets put into the corresponding cell in the output grid. So, as a result, the new output grid cells will contain either NODATA values or the original valid elevation values.
  8. Look at the upper limit of the data. Is it reasonable now? (Look back at the end of Section C, to be reminded of what elevation range we should expect.)
  9. Save your ArcMap document.

F. Changing the projection of a grid

In Lesson 7 you will see the need for displaying spatial vector data in projected coordinates rather than in terms of "flattened" lon-lat coordinates. If we are going to employ grided data in our display and in our analysis, we also need to be able to cast it in different map projections.

Arcview Help

Before you go through the steps to reproject the dataset, I want you to be aware of the fact that the data in a grid/raster dataset is actually resampled when you transform it into a different map projection. Rather than attempt to elaborate upon this point here, I am going to send you to the Desktop Help.

Open ArcGIS Desktop Help and click the Search tab. Type the keyword resample and choose Resample (Data Management) from the topics list. Pay attention to the section on Resampling Technique parameters, specifically the Bilinear option. It is the recommended method when changing the coordinate system of data comprising a continuous surface, like our DEM data.

  1. In ArcToolbox, expand Data Management Tools > Projections and Transformations > Raster, and start the Project Raster tool.
    • Set the Input raster to your ndgtopo30_grd grid. (It should be in your Lesson6 folder.)
    • The entry in the Output raster slot should fill with the path to your Lesson6 folder, giving the output grid some default name. Change the name of the output grid to eagtopo30_grd. (The ea is for equal area.) Make certain you are still directing the new grid to your Lesson6 folder.
    • Click on the icon at the right end of the Output coordinate system slot.
      • Expand Projected Coordinate Systems > Continental > North America.
      • Highlight USA Contiguous Albers Equal Area Conic but do not click OK yet.
        First, we need to center the projection on Oregon.
      • Right-click the highlighted USA Contiguous Albers Equal Area Conic and from the menu that pops up select Copy and Modify...
      • In the Projection area of the Projected Coordinate System Properties window, supply the following values for the projection parameters listed:
        • Central Meridian: -120.5
        • 1st Standard Parallel: 41.3
        • 2nd Standard Parallel: 45.6
        • Latitude of Origin: 44.2
      • Now, in the Geographic Coordinate System area of the same window, click the Change button.
      • Expand the World list and select WGS 1984.prj, then click the OK button.
      • Click the OK button in the Projected Coordinate System Properties window.
      • Verify that the new information is correct in the Spatial Reference Properties window, and then click OK.
    • You should be back at the Project Raster dialog window. Expand the Resampling technique list and select Bilinear.
    • Now, overwrite the value in the Output cell size slot with 800.
    • Click the OK button.
  2. Close the Project Raster window when the reprojection process completes. There should now be a eagtopo30_grd grid dataset in your Lesson6 folder.
    note Regarding the cell size and the resampling method that you specified above:
    The projection of the dataset was changing from units of decimal degrees (lon-lat) to meters (Albers Equal Area Conic). The cell side dimension of the raw data is 30 arc-seconds. The information on the GTOPO30 website mentions that this is approximately 1 kilometer. But think about that. As you go north from the equator, the grid cell sides that parallel longitude will remain about 1 km in length, but the cell sides that parallel latitude will shorten. That lessening in the length of the top and bottom cells sides is a function of the cosine of the latitude. So, a grid cell that is 30 arc-seconds on a side at the equator has top and bottom sides that are approximately 1000 meters long (1 km). But a grid cell that is 30 arc-seconds on a side at a latitude of 45 degrees has top and bottom sides that are approximately equal to 710 meters. (cos 45 = .710)
    Oregon is approximately centered on 44 degrees north latitude. So the top and bottom cell side is approximately 720 meters. Grid cells in raster datasets must be square, so I chose to make the cells of the new, projected grid 800 meters on a side, erring on the small side of an average of 720 and 1000 to hopefully better represent the actual data at higher latitudes.
    The tool also gives a choice of interpolation method. I chose to use the bilinear interpolation method. If you looked at the Help explanation regarding resampling grids that I pointed out above, you will know why.

That is it for Part I

You have just completed Part I of this project.