Kriging is a stochastic, local interpolation technique that uses information about the spatial structure of the attribute of interest (i.e., the information contained in the sample points) to estimate the value of that attribute at unknown locations. Kriging is very similar to inverse distance weighting (IDW) in that it also uses a weighted average of sample points to estimate values at unknown points; the main difference between the two methods lies in how those weights are specified. In IDW, the cartographer arbitrarily specifies a neighborhood of points that should influence the estimation, as well as the strength of the distance effect (i.e., how quickly the influence of a sample point falls off with distance). In kriging, however, we use statistics to choose the set of weights that is most likely to predict the unknown values correctly (i.e., we produce a statistically optimal set of weights).
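To make the contrast concrete, here is a minimal sketch of an IDW estimate in Python with NumPy. The function name `idw_estimate` and the parameters `power` and `k` are hypothetical names chosen for this illustration; they stand in for the two arbitrary choices described above (the distance effect and the neighborhood).

```python
import numpy as np

def idw_estimate(xy, values, target, power=2.0, k=3):
    """Inverse distance weighted estimate at `target` from the k nearest samples.

    `power` and `k` are picked by the user, not derived from the data.
    """
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    d = np.linalg.norm(xy - np.asarray(target, dtype=float), axis=1)
    # A target that coincides with a sample simply takes that sample's value.
    if np.any(d == 0):
        return float(values[np.argmin(d)])
    nearest = np.argsort(d)[:k]        # user-chosen neighborhood
    w = 1.0 / d[nearest] ** power      # user-chosen distance effect
    w /= w.sum()                       # normalize so the weights sum to 1
    return float(np.dot(w, values[nearest]))
```

Kriging keeps the same weighted-average form but replaces the hand-picked `power` and `k` with weights derived from the spatial structure of the samples.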

Kriging assumes that the variation in a surface can be broken down into three main components: a drift or overall trend, local spatial autocorrelation (i.e., the tendency of points that are close together to have similar values), and random variation (i.e., noise or measurement error). The drift can be estimated with a mathematical function that approximates the trend in the surface. Here, we will focus on understanding how kriging deals with the other two components of spatial variation: local spatial autocorrelation and random variation.
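This three-part decomposition is often written compactly as a single equation. The symbols below are not defined in this lesson and are assumed here only for illustration: $Z(s)$ is the value of the attribute at location $s$, $m(s)$ is the drift, $\varepsilon'(s)$ is the spatially autocorrelated component, and $\varepsilon''$ is the random noise.

$$
Z(s) = m(s) + \varepsilon'(s) + \varepsilon''
$$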

The first step in kriging is to use the sample data to describe the spatial variation in the surface. We can do this by considering the concept of the semivariance, which is a statistical measure of how much the attribute we are interested in varies between two points that are separated by a particular distance (see Figure 6.cg.29, below). Typically, in any given dataset, we might have only one pair of points that is exactly a particular distance apart. Unfortunately, statistically, this is not the best scenario, as we can't be sure that the semivariance for that pair is typical of all of the potential pairs of points that are that distance apart (when we consider both samples and estimated points). As with all statistics, the more samples we have, the more sure we can be about the predictions we make based on those samples. So we use the concept of a distance lag to group pairs of points that are similar (but not exactly the same) distances apart and calculate an average semivariance from those pairs.
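The grouping of point pairs into distance lags can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a production implementation: the function name and the parameters `lag_width` and `n_lags` are hypothetical, and the semivariance of a pair is taken to be half the squared difference of its two values, which is the usual definition.

```python
import numpy as np

def empirical_semivariogram(xy, values, lag_width, n_lags):
    """Average semivariance of sample pairs, grouped into distance lags."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)         # every pair of samples, once
    dist = np.linalg.norm(xy[i] - xy[j], axis=1)     # separation of each pair
    half_sq_diff = 0.5 * (values[i] - values[j]) ** 2
    lag_index = np.floor(dist / lag_width).astype(int)
    lag_centers = (np.arange(n_lags) + 0.5) * lag_width
    gamma = np.full(n_lags, np.nan)                  # NaN marks a lag with no pairs
    counts = np.zeros(n_lags, dtype=int)
    for b in range(n_lags):
        in_lag = lag_index == b
        counts[b] = in_lag.sum()
        if counts[b] > 0:
            gamma[b] = half_sq_diff[in_lag].mean()
    return lag_centers, gamma, counts
```

Returning `counts` alongside the semivariances is a deliberate choice: a lag that contains only a handful of pairs gives a less trustworthy average than one that contains many, and pairs farther apart than `lag_width * n_lags` are simply ignored.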

Once we have a description of the spatial structure of the sample data (in the form of a semivariogram), we can look for a mathematical function (i.e., equation) that best fits the semivariogram (i.e., one that minimizes the distance between the plotted semivariance values and the curve described by the function). One common type of equation that is fit to semivariograms is a spherical function (see Figure 6.cg.30, below). We can extract three important pieces of information from this spherical equation: the nugget, the sill, and the range. The nugget is the value at which the function meets the y-axis; this value is usually not at the origin (i.e., the 0,0 point) of the graph. We can interpret this value as a measure of the amount of random variation (i.e., noise) present in the data sample. This makes intuitive sense: in principle, points that are in the same location (i.e., no distance apart) should have the same value, but in practice, repeated measurements at the same place may not give equal values if there is some measurement noise. The second important value is the sill, which is the level at which the semivariance stops increasing and flattens out. The final important value is the range, which is the distance at which the semivariance reaches the sill. In other words, at distances that are greater than the range, points are no more likely to be similar than any other pair of points; Tobler's law no longer applies at these distances.
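The spherical function itself is short enough to write out. The sketch below uses one common parameterization, in which `sill` is the total height of the plateau (nugget included); the function and parameter names are assumptions made for this example.

```python
import numpy as np

def spherical_model(h, nugget, sill, range_):
    """Spherical semivariogram curve evaluated at distance(s) h.

    Rises from `nugget` at h = 0 to `sill` at h = range_, then stays flat.
    """
    h = np.asarray(h, dtype=float)
    ratio = np.clip(h / range_, 0.0, 1.0)   # distances beyond the range sit on the sill
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
```

For example, `spherical_model([0, 5, 10, 20], nugget=1.0, sill=5.0, range_=10.0)` evaluates the curve at the y-axis, halfway to the range, at the range, and beyond it. Fitting then amounts to choosing the `nugget`, `sill`, and `range_` that bring this curve closest to the averaged semivariances, for instance by least squares.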

Once we have fit an equation to the semivariogram, we can use it to calculate weights for estimating the values of our attribute of interest at unknown locations. We accomplish this by first measuring the distance between each sample point and the location of the value we want to predict, and reading the semivariance that the fitted curve predicts for that distance. Then, these semivariances are used to set up a system of linear equations whose solution is the set of weights that minimizes the expected error in the predicted values. Although the mathematics of these equations is beyond the scope of this lesson, the end result is a set of weights that is then used along with the values of the sample points to predict the values at unknown locations. Although kriging is conceptually more difficult to understand than inverse distance weighting, it allows the structure of the data, rather than the arbitrary decisions made by the cartographer in inverse distance weighting, to determine how the sample points are weighted in the interpolation.
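For readers who want to see what such a system of equations can look like, here is a minimal sketch of ordinary kriging with a spherical semivariogram, written in Python with NumPy. It is an illustration under stated assumptions, not a definitive implementation: all function and parameter names are hypothetical, the drift is assumed to be constant, every sample point is used (no search neighborhood), and the semivariance at zero distance is set to zero.

```python
import numpy as np

def spherical(h, nugget, sill, range_):
    """Spherical semivariogram model (same assumed form as a fitted curve)."""
    ratio = np.clip(np.asarray(h, dtype=float) / range_, 0.0, 1.0)
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)

def ordinary_kriging_weights(xy, target, nugget, sill, range_):
    """Solve the ordinary kriging system for the weights at one target location."""
    xy = np.asarray(xy, dtype=float)
    n = len(xy)
    # Semivariances between every pair of sample points.
    d_samples = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))            # last row/column force weights to sum to 1
    A[:n, :n] = spherical(d_samples, nugget, sill, range_)
    A[np.arange(n), np.arange(n)] = 0.0    # a point has zero semivariance with itself
    A[n, n] = 0.0
    # Semivariances between each sample point and the target location.
    d_target = np.linalg.norm(xy - np.asarray(target, dtype=float), axis=1)
    b = np.ones(n + 1)
    b[:n] = np.where(d_target == 0, 0.0, spherical(d_target, nugget, sill, range_))
    solution = np.linalg.solve(A, b)
    return solution[:n]                    # the final entry is a Lagrange multiplier

def ordinary_kriging_estimate(xy, values, target, nugget, sill, range_):
    """Weighted average of the sample values using the kriging weights."""
    w = ordinary_kriging_weights(xy, target, nugget, sill, range_)
    return float(np.dot(w, np.asarray(values, dtype=float)))
```

Note that the sample values never enter the calculation of the weights; only the distances and the fitted semivariogram do. The values are used only in the final weighted average, exactly as in inverse distance weighting.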

#### Recommended Readings

If you are interested in investigating this subject further, I recommend the following:

- Robinson, T. B. and G. Metternicht. 2003. "A comparison of inverse distance weighting and ordinary kriging for characterizing within-paddock spatial variability of soil properties in Western Australia." Cartography. 32(1).