The Pitfalls of Spatial Data

Required Reading:

Read the course textbook, Chapter 1: pages 1-21.

Also read: Chapter 3, The Modifiable Areal Unit Problem, pages 29-44 in Lloyd, C. D. (2014). Exploring Spatial Scale in Geography. West Sussex, UK: Wiley Blackwell. This text is available electronically through the PSU library catalog.

Spatial Autocorrelation

The source of all the problems with applying conventional statistical methods to spatial data is spatial autocorrelation. This is a big word for a very obvious phenomenon: things that are near each other tend to be more related than things that are far apart. If this were not true, the world would be a very strange and rather scary place. For example, if land elevation were not spatially autocorrelated, huge cliffs would be everywhere. Turning the next corner, we would be as likely to face a 1000-meter cliff (up or down, take your pick!), as a piece of ground just a little higher or a little lower than where we are now. An uncorrelated, or random, landscape would be extremely disorienting.

The problem this creates for statistical analysis is that much of statistical theory is based on samples of independent observations that are not dependent on one another in any way. In geography, once we pick a study area, we are immediately dealing with a set of observations that are interrelated in all sorts of ways (in fact, that's what we are interested in understanding more about).

Having identified the problem, what can we do about it? Depending on how deeply you want to go into it, quite a lot. At the level of this course, we don't go much beyond acknowledging the problem and developing some methods for assessing the degree of autocorrelation (Lesson 4). Having said that, there are some methods that recognize the problem and take advantage of the presence of spatial autocorrelation to improve analysis. These include point pattern analysis (Lesson 3) as well as interpolation and some related methods (Lesson 6) that recognize the problem and even take advantage of the presence of spatial autocorrelation to improve the analysis.

The Ecological Fallacy

The ecological fallacy may seem obvious, but it is routinely ignored. It is always worth keeping in mind that statistical relations are meaningless unless you can explain them. Until you can develop a plausible explanation for a statistical relationship, it is unsafe to assume that it is anything more than a coincidence. Of course, as more and more statistical evidence accumulates, the urgency of finding an explanation increases, so statistics remain useful.

Required Reading:

Spatial Autocorrelation

The Ecological Fallacy

Navigation

EMS

Programs

Related Links