GEOG 586
Geographic Information Analysis

Geography Department Penn State

Project 3, Part B: Descriptive Spatial Statistics

Descriptive Spatial Statistics

In Lesson 2, we highlighted some descriptive statistics that are useful for measuring geographic distributions. Additional details are provided in Table 3.3. Functions for these methods are also available in ArcGIS.

Table 3.3 Measures of geographic distributions that can be used to identify the center, shape and orientation of the pattern or how dispersed features are in space

Measures of Central Tendency
Spatial Descriptive Statistic	Description	Calculation
Central Distance	Identifies the most centrally located feature for a set of points, polygon(s) or line(s)	The point with the shortest total distance to all other points is the most central feature: $D_i = \sum_{j = 1, j \neq i}^{n} \sqrt{(x_j - x_i)^{2} + (y_j - y_i)^{2}}, \quad D_{central} = \min_{i} D_i$
Mean Center (there is also a median center called Manhattan Center)	Identifies the geographic center (or the center of concentration) for a set of features. The mean is sensitive to outliers.	Simply the mean of the X coordinates and the mean of the Y coordinates for a set of points: $\bar{X} = \frac{\sum_{i = 1}^{n} x_{i}}{n}, \quad \bar{Y} = \frac{\sum_{i = 1}^{n} y_{i}}{n}$
Weighted Mean Center	Like the mean center, but allows weighting by an attribute.	Produced by weighting each X and Y coordinate by another variable ($w_i$): $\bar{X}_w = \frac{\sum_{i = 1}^{n} w_i x_i}{\sum_{i = 1}^{n} w_i}, \quad \bar{Y}_w = \frac{\sum_{i = 1}^{n} w_i y_i}{\sum_{i = 1}^{n} w_i}$
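
The three measures of central tendency in the table can be tried out on a small example. The following is a minimal sketch in base R, not part of the lesson script: the data frame `pts` and its weight column `w` are made-up values chosen only to show how a heavily weighted point pulls the weighted mean center away from the unweighted one.

```r
# Hypothetical toy data: four point locations and a weight (e.g. an incident count).
pts <- data.frame(x = c(0, 2, 4, 10), y = c(0, 2, 0, 2), w = c(1, 1, 1, 5))

# Mean center: the mean of the x coordinates and the mean of the y coordinates.
mean_center <- c(x = mean(pts$x), y = mean(pts$y))

# Weighted mean center: each coordinate is weighted by w.
weighted_center <- c(x = weighted.mean(pts$x, pts$w),
                     y = weighted.mean(pts$y, pts$w))

# Central feature: the point with the smallest summed distance to all others.
d <- as.matrix(dist(pts[, c("x", "y")]))  # pairwise Euclidean distances
total_dist <- rowSums(d)                  # the diagonal is 0, so "self" adds nothing
central_feature <- pts[which.min(total_dist), c("x", "y")]
```

Note that the central feature is always one of the input points, whereas the mean center and weighted mean center generally are not.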

Measures of Variability
Spatial Descriptive Statistic	Description	Calculation
Standard Distance	Measures the degree to which features are concentrated or dispersed around the geometric mean center. The greater the standard distance, the more the distances vary from the average, and thus the more widely the features are dispersed around the center. Standard distance is a good single measure of the dispersion of the points around the mean center, but it doesn’t capture the shape of the distribution.	Represents the standard deviation of the distance of each point from the mean center: $SD = \sqrt{\frac{\sum_{i = 1}^{n} {(x_i - \bar{X})}^{2}}{n} + \frac{\sum_{i = 1}^{n} {(y_i - \bar{Y})}^{2}}{n}}$ where $x_i$ and $y_i$ are the coordinates of feature $i$ and $(\bar{X}, \bar{Y})$ is the mean center of all the coordinates. Weighted SD: $SD_w = \sqrt{\frac{\sum_{i = 1}^{n} w_i {(x_i - \bar{X}_w)}^{2}}{\sum_{i = 1}^{n} w_i} + \frac{\sum_{i = 1}^{n} w_i {(y_i - \bar{Y}_w)}^{2}}{\sum_{i = 1}^{n} w_i}}$ where $x_i$ and $y_i$ are the coordinates of feature $i$, $w_i$ is its weight value, and $(\bar{X}_w, \bar{Y}_w)$ is the weighted mean center.
Standard Deviational Ellipse	Captures the shape of the distribution.	Creates standard deviational ellipses to summarize the spatial characteristics of geographic features: Central tendency, Dispersion and Directional trends
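
To make the standard distance formulas concrete, here is a minimal base R sketch on made-up data. It is not part of the lesson script: the vectors `x`, `y` and `w` are hypothetical, and the weighted version assumes the squared deviations are taken from the weighted mean center and normalized by the sum of the weights.

```r
# Hypothetical toy data: the four corners of a 2 x 2 square, with unequal weights.
x <- c(0, 2, 2, 0)
y <- c(0, 0, 2, 2)
w <- c(3, 1, 1, 3)

# Unweighted standard distance around the mean center.
xm <- mean(x)
ym <- mean(y)
std_dist <- sqrt(sum((x - xm)^2 + (y - ym)^2) / length(x))

# Weighted standard distance around the weighted mean center
# (assumption: deviations are weighted by w and divided by sum(w)).
xw <- weighted.mean(x, w)
yw <- weighted.mean(y, w)
std_dist_w <- sqrt(sum(w * ((x - xw)^2 + (y - yw)^2)) / sum(w))
```

With equal weights the two values coincide; here the heavier weights on the left-hand corners shift the center toward them and change the measured dispersion.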

For this analysis, use the crime types that you selected earlier. The example here is for the homicide data.

#------MEAN CENTRE
#calculate mean centre of the crime locations
xmean <- mean(xhomicide$x)
ymean <- mean(xhomicide$y)

#------MEDIAN CENTRE
#calculate the median centre of the crime locations
xmed <- median(xhomicide$x)
ymed <- median(xhomicide$y)

#to access the variables in the shapefile, the data needs to be set to data.frame
newhom_df<-data.frame(xhomicide)
#check the definition of the variables.  
str(newhom_df)

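#If the variables you are using are defined as a factor then convert them to an integer.
#Go through as.character() first: as.integer() on a factor returns the internal level codes, not the stored values
newhom_df$FREQUENCY <- as.integer(as.character(newhom_df$FREQUENCY))
newhom_df$OBJECTID <- as.integer(as.character(newhom_df$OBJECTID))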

#wrap the x coordinates in a list; length(a[[1]]) is the number of points and sets the loop limits below
a <- list(xhomicide$x)


#------WEIGHTED MEAN CENTRE
#Calculate the weighted mean centre, using FREQUENCY as the weight
sumcount <- 0   #running sum of the weights
sumxbar <- 0    #running sum of weight * x
sumybar <- 0    #running sum of weight * y
for(i in 1:length(a[[1]])){
  sumxbar <- sumxbar + xhomicide$x[i] * newhom_df$FREQUENCY[i]
  sumybar <- sumybar + xhomicide$y[i] * newhom_df$FREQUENCY[i]
  sumcount <- sumcount + newhom_df$FREQUENCY[i]
}
xbarw <- sumxbar/sumcount
ybarw <- sumybar/sumcount


#------STANDARD DISTANCE OF CRIMES
# Compute the standard distance of the crimes
# Vectorized equivalent of the loop below:
#Std_Dist <- sqrt(sum((xhomicide$x - xmean)^2 + (xhomicide$y - ymean)^2) / length(xhomicide$x))

#Calculate the Std_Dist
d <- 0   #running sum of squared distances from the mean centre
for(i in 1:length(a[[1]])){
  sqdist <- (xhomicide$x[i] - xmean)^2 + (xhomicide$y[i] - ymean)^2
  d <- d + sqdist
}
Std_Dist <- sqrt(d / length(a[[1]]))

# make a circle of one standard distance about the mean center
bearing <- 0:360 * pi/180   #0 to 360 degrees in radians, so the drawn circle closes
cx <- xmean + Std_Dist * cos(bearing)
cy <- ymean + Std_Dist * sin(bearing)
circle <- cbind(cx, cy)


#------CENTRAL POINT
#Identify the most central point:
#Calculate the point with the shortest total distance to all other points
#distance between two points: sqrt((x2-x1)^2 + (y2-y1)^2)

sumdist2 <- Inf   #smallest total distance found so far
for(i in 1:length(a[[1]])){
  x1 <- xhomicide$x[i]
  y1 <- xhomicide$y[i]
  recno <- newhom_df$OBJECTID[i]
  #sum the distances from point i to all other points
  sumdist1 <- 0
  for(j in 1:length(a[[1]])){
    if (j != i){
      x2 <- xhomicide$x[j]
      y2 <- xhomicide$y[j]
      sumdist1 <- sumdist1 + sqrt((x2-x1)^2 + (y2-y1)^2)
    }
  }
  #keep point i if its total distance is the smallest so far
  if (sumdist1 < sumdist2){
    dist3 <- list(recno, sumdist1, x1, y1)
    sumdist2 <- sumdist1
    xdistmin <- x1
    ydistmin <- y1
  }
}

#------MAP THE RESULTS
#Plot the different centers with the crime data
plot(Sbnd)
points(xhomicide$x, xhomicide$y)
points(xmean,ymean,col="red", cex = 1.5, pch = 19) #draw the mean center
points(xmed,ymed,col="green", cex = 1.5, pch = 19) #draw the median center
points(xbarw,ybarw,col="blue", cex = 1.5, pch = 19) #draw the weighted mean center
points(dist3[[3]][1],dist3[[4]][1],col="orange", cex = 1.5, pch = 19) #draw the central point
lines(circle, col='red', lwd=2)
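
The script above draws a standard distance circle but does not compute the standard deviational ellipse listed in Table 3.3. One way to approximate it in base R is sketched below, under stated assumptions: `sd_ellipse` is a hypothetical helper (not from the lesson or any package) that takes the ellipse axes from the eigen-decomposition of the population covariance matrix of the coordinates. ArcGIS may scale the axes differently, so the result is not guaranteed to match its output exactly.

```r
# Hypothetical helper: standard deviational ellipse from the eigen-decomposition
# of the coordinate covariance matrix (an assumption, not the ArcGIS algorithm).
sd_ellipse <- function(x, y, n_vertices = 360) {
  n <- length(x)
  xy <- cbind(x - mean(x), y - mean(y))   # coordinates relative to the mean center
  S <- crossprod(xy) / n                  # population covariance matrix (divide by n)
  e <- eigen(S, symmetric = TRUE)         # eigenvalues come back in decreasing order
  axes <- sqrt(pmax(e$values, 0))         # semi-major and semi-minor axis lengths
  # angle of the major axis from the +x axis, in radians (defined up to 180 degrees)
  theta <- atan2(e$vectors[2, 1], e$vectors[1, 1])
  t <- seq(0, 2 * pi, length.out = n_vertices)
  # trace the ellipse in its own axes, then rotate and shift to the mean center
  ex <- mean(x) + axes[1] * cos(t) * cos(theta) - axes[2] * sin(t) * sin(theta)
  ey <- mean(y) + axes[1] * cos(t) * sin(theta) + axes[2] * sin(t) * cos(theta)
  list(center = c(mean(x), mean(y)), semi_axes = axes, angle = theta,
       outline = cbind(ex, ey))
}

# Toy example: points stretched along the x axis.
ell <- sd_ellipse(c(-2, -1, 0, 1, 2, 0, 0), c(0, 0, 0, 0, 0, 1, -1))
```

With the lesson data, calling `sd_ellipse(xhomicide$x, xhomicide$y)` and passing the `outline` element to `lines()` would overlay the ellipse on the existing map. In this construction the squared semi-axes sum to the squared standard distance, so the ellipse and the circle describe the same total dispersion, with the ellipse adding the directional trend.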

Deliverable

Perform point pattern analysis on any two of the available crime datasets (DUI, arson, or homicide). It is helpful to choose crimes with contrasting patterns. For your analysis, choose whatever methods seem the most useful, and present your findings in the form of maps, graphs, and accompanying commentary.