Having taken a GIS class or two, most of you have probably heard the terms vector and raster before. Here we will review these types of data models and talk about why you as cartographers should be interested in them.
Since the early days of GIS, researchers have been talking about and debating the relative merits of different ways of conceptualizing and representing geography digitally. In the scientific literature, this endeavor is known as creating an ontology of geographical phenomena. Ontology is a branch of metaphysics that is concerned with questions about being or existence. GIScientists have identified two main ontologies of geographic phenomena: one is object-based, the other is field-based.
An object-based ontology describes the world as a space that is filled with discrete, identifiable units (i.e., objects) that have some sort of spatial reference, usually in the form of geographic coordinates. For example, objects you might find in this space include houses, factories, roads, rivers, lakes, or pollution plumes. A field-based ontology, on the other hand, describes the world as a collection of spatial distributions of phenomena. In other words, for a particular attribute or theme (e.g., elevation or temperature), we look at every location in the space we are interested in and determine how much of that attribute, or what category of that attribute, is there. In this world-view, you might think of location as the independent variable and the attribute of interest as the dependent variable (Worboys, 1995). As GIScientists, we are usually less worried about whether some phenomenon (e.g., a mountain or lake) exists than about how best to describe that phenomenon using numbers in a computer. A vector data structure is a computer implementation of an object-based ontology, while a raster data structure is a computer implementation of a field-based ontology.
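To make the field-based view concrete, here is a minimal sketch in Python that treats location as the independent variable and temperature as the dependent variable. The function `temperature_field`, the warm spot at the origin, and the rate of cooling are all invented for illustration; they do not come from any real dataset.

```python
import math


def temperature_field(x, y):
    """Hypothetical field: temperature (deg C) drops with distance from a warm spot at (0, 0)."""
    distance = math.hypot(x, y)      # straight-line distance from the origin
    return 25.0 - 0.5 * distance     # assumed cooling of 0.5 deg C per unit of distance


# Every location we ask about returns a value; there are no discrete objects here.
print(temperature_field(0.0, 0.0))   # 25.0
print(temperature_field(3.0, 4.0))   # 22.5
```

An object-based description of the same scene would instead list discrete things (a factory, a river) and attach a temperature to each one.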
To review the basics, the data in a vector data structure are, at the most basic level, a collection of points with geographic coordinates. We can represent objects as points, lines (a collection of points), and areas (a closed collection of points) (see Figure 1.6.1). Each element in this space is discrete and homogeneous (see Figure 1.6.2). Some advantages of the vector data structure are that file sizes are generally small (although they grow with the number of features in the space), that we can define objects quite precisely (i.e., at a high spatial resolution), and that we can store multiple attributes with each object. However, because vector data structures are composed of homogeneous objects, we do not have any information about the variation within an object. This may or may not be important in the context of your problem. Take the case of temperature. Variation within an object may not be important when the object is a factory, because the temperature is likely to be the same throughout the factory. However, variation within an object might be very important in the case of a river; if the river is warmer downstream from a factory than upstream, it may be because the factory has discharged warm wastewater into the river.
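The point, line, and area model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Feature` class, the `is_closed` helper, the coordinates, and the attribute values are all hypothetical and do not reflect how any particular GIS package stores vector data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]  # (x, y), e.g. longitude and latitude


@dataclass
class Feature:
    """Hypothetical vector feature: a geometry plus any number of attributes."""
    geometry_type: str                 # "point", "line", or "area"
    coordinates: List[Coordinate]      # the points that define the geometry
    attributes: Dict[str, object] = field(default_factory=dict)


def is_closed(feature: Feature) -> bool:
    """An area is a closed collection of points: it ends where it starts."""
    coords = feature.coordinates
    return len(coords) >= 4 and coords[0] == coords[-1]


# Made-up example features.
factory = Feature("point", [(-77.86, 40.79)],
                  {"name": "Factory", "temperature_c": 20.0})
river = Feature("line", [(-77.90, 40.80), (-77.86, 40.79), (-77.82, 40.77)],
                {"name": "River", "temperature_c": 12.0})
lake = Feature("area", [(-77.95, 40.75), (-77.93, 40.75), (-77.93, 40.77),
                        (-77.95, 40.77), (-77.95, 40.75)],
               {"name": "Lake"})

print(is_closed(lake))    # True
print(is_closed(river))   # False
```

Note that `river` carries a single `temperature_c` value for its whole length. That is the homogeneity limitation in practice: this structure has nowhere to record that the water is warmer downstream of the factory unless the river is split into several features.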
The raster data structure uses a regular grid to cover space and records an attribute value for each grid cell (see Figure 1.6.3). This data structure is continuous; each location in space has a value assigned to it (although that value may be 0). Generally, raster data files are larger than vector files, because they store information about every location in space, not just those locations where there is an object related to the phenomenon the data represent. Raster file size also depends on the resolution of the data (i.e., the precision with which the phenomenon of interest can be described). Data captured at higher spatial resolutions result in larger files (see Figure 1.6.4). Although objects are not explicitly represented in raster data, humans can identify them if there are abrupt changes in the value of the represented attribute within the space.
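As a rough sketch of why resolution drives raster file size, the Python below builds two grids over the same area at different cell sizes and counts their cells. The helper names `make_raster` and `cell_count`, the 1,000 m extent, and the cell sizes are assumptions chosen for illustration; real raster formats add compression and metadata, so cell count is only a proxy for file size.

```python
def make_raster(width_m, height_m, cell_size_m, fill=0.0):
    """Build a grid (list of rows) with one attribute value per cell."""
    n_cols = int(width_m // cell_size_m)
    n_rows = int(height_m // cell_size_m)
    return [[fill for _ in range(n_cols)] for _ in range(n_rows)]


def cell_count(raster):
    """Total number of stored values, a rough proxy for file size."""
    return sum(len(row) for row in raster)


# The same 1,000 m x 1,000 m area at two spatial resolutions.
coarse = make_raster(1000, 1000, 100)   # 100 m cells -> 10 x 10 grid
fine = make_raster(1000, 1000, 50)      # 50 m cells  -> 20 x 20 grid

# Every cell holds a value, even where nothing of interest is present.
coarse[2][3] = 18.5                     # made-up temperature in one cell

print(cell_count(coarse))   # 100
print(cell_count(fine))     # 400
```

Halving the cell size quadruples the number of stored values, because the grid grows in both dimensions at once.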
So why should cartographers be concerned with how data are represented in a GIS? Aside from matters that influence the content of the map (e.g., are the input data available at the proper spatial resolution for the particular map purpose?), there are two important reasons: map output comes in two formats (vector and raster), and the format of the input data can influence how the map output looks. We will discuss each of these topics in more detail in Part VIII: Exporting a Map, and in Part VII: Print/Display Resolution.
If you are interested in investigating this subject further, I recommend the following:
- Cova, T. J. and M. F. Goodchild. (2002). "Extending geographical representation to include fields of spatial objects." International Journal of Geographical Information Science. 16(6), p. 509-532.
- Smith, B. and D. M. Mark. (2003). "Do mountains exist? Towards an ontology of landforms." Environment and Planning B: Planning and Design. 30(5), p. 411-427.
- Worboys, M. (1995). GIS: A Computing Perspective. New York: Taylor & Francis. Chapter 4, p. 145-179.