GEOG 586
Geographic Information Analysis

Project 2: Setting Up Your Analysis Environment in R

Adding the R packages that will be needed for the analysis

Many of the analyses you will run rely on functions contained in different packages. You may already have loaded many of these packages, but to document which packages we are using, let's add the relevant lines so that any package that has not been switched on will be when we run our analysis.

Note that the lines that start with #install.packages will be ignored in this instance because of the # in front of the command. R treats everything after a # symbol on a line as a comment and does not execute it. If you do need to install a package (e.g., if you hadn't already installed all of the relevant packages), then remove the # when you run your code so the line will execute.

Note that for every package you install, you must also call the library() function to tell R that you wish to access that package. Installing a package does not, by itself, give you access to the functions that are a part of that package.

#install and load packages here

#install.packages("ggplot2")
library(ggplot2)
#install.packages("spatstat")
library(spatstat)
#install.packages("leaflet")
library(leaflet)
#install.packages("dplyr")
library(dplyr)
#install.packages("doBy")
library(doBy)
#install.packages("sf")
library(sf)
#install.packages("ggmap")
library(ggmap)
#install.packages("gridExtra")
library(gridExtra)
#install.packages("sp")
library(sp)
#install.packages("RColorBrewer")
library(RColorBrewer)
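
If you would rather not edit the # symbols by hand, one option is to check which packages are missing and install only those. The following is a minimal sketch, not part of the original project script: missing_packages() and load_or_install() are hypothetical helper names, and the final call is left commented out because it needs an internet connection.

```r
# Packages used in this project
pkgs <- c("ggplot2", "spatstat", "leaflet", "dplyr", "doBy",
          "sf", "ggmap", "gridExtra", "sp", "RColorBrewer")

# Hypothetical helper: return the packages that are not yet installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install what is missing, then load everything
load_or_install <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  for (p in pkgs) {
    # character.only = TRUE is needed when the package name is held in a variable
    library(p, character.only = TRUE)
  }
}

# See what still needs installing
print(missing_packages(pkgs))

# load_or_install(pkgs)   # remove the # to install and load in one step
```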

Set the working directory

Before we start writing R code, you should make sure that R knows where to look for the shapefiles and text files. First, you need to set the working directory. In your case, the working directory is the folder where you placed the contents of the .zip file when you unzipped it. Make sure that the correct file location is specified. There are several ways you can do this. As shown below, you can assign the directory path to a variable and then reference that variable wherever and whenever you need to. Notice that there is a slash at the end of the directory name. If you don't include this, the final folder name can get merged with a filename, meaning R cannot find your file.

file_dir_crime <- "C:/Geog586_Les2_Project/crime/"
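
To see why the trailing slash matters, the sketch below builds a full filename with and without it. The variable names dir_with_slash and dir_no_slash are illustrative only; the filename is the crime file used later in this project. As an alternative, base R's file.path() adds the separator for you when the directory is stored without a trailing slash.

```r
dir_with_slash <- "C:/Geog586_Les2_Project/crime/"
dir_no_slash   <- "C:/Geog586_Les2_Project/crime"
csv_name       <- "crimeStLouis20132014b.csv"

# paste0() joins the strings exactly as given
good_path <- paste0(dir_with_slash, csv_name)
bad_path  <- paste0(dir_no_slash, csv_name)   # folder name and filename merge
print(good_path)
print(bad_path)

# file.path() inserts the "/" separator between its arguments
safe_path <- file.path(dir_no_slash, csv_name)
print(safe_path)
```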

Note that R expects "/" as the directory level indicator in file paths. Inside an R string, "\" is an escape character rather than a separator, so a path containing single backslashes will return an error or point to the wrong location. DO NOT PASTE A DIRECTORY PATH COPIED FROM FILE EXPLORER WITHOUT EDITING IT: Windows writes the path with "\", and each one must be changed to "/" (or doubled to "\\").
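
If you do want to start from a path copied out of File Explorer, the backslashes can be converted rather than retyped. The sketch below assumes R 4.0.0 or later, which supports the raw-string syntax r"(...)"; the variable names are illustrative.

```r
# A raw string (R >= 4.0.0) keeps backslashes literally,
# so a copied Windows path can be pasted between r"( and )"
copied_path <- r"(C:\Geog586_Les2_Project\crime\)"

# Replace every backslash with a forward slash
fixed_path <- gsub("\\", "/", copied_path, fixed = TRUE)
print(fixed_path)
```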

You can also call the setwd() function, which "sets" the working directory for your R session.

setwd("C:/Geog586_Les2_Project/crime/")
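
One detail worth knowing is that setwd() invisibly returns the previous working directory, not the new one, which makes it easy to switch back afterwards. A minimal sketch, using tempdir() as a stand-in for your project folder so that it runs on any machine:

```r
# setwd() returns the directory you were in *before* the change
old_dir <- setwd(tempdir())   # tempdir() stands in for your project folder

# Confirm that the working directory has changed
print(getwd())

# Switch back to where you started and confirm
setwd(old_dir)
print(getwd())
```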

Finally and most easily, you can also set the working directory through RStudio's interface: Session - Set Working Directory - Choose Directory

Call the getwd() function to confirm which directory R is currently using as the working directory.

#check and set the working directory
#if you do not set the working directory, use the full directory pathname to where you saved the file.
#R expects "/" as the directory level indicator. A single "\" is treated as an escape character and will usually return an error.
#DO NOT COPY THE DIRECTORY PATH FROM YOUR FILE EXPLORER WITHOUT EDITING IT, AS THAT FILE PATH USES "\"
#e.g. "C:/Geog586_Les2_Project/crime/"
#check the directory path that R thinks is correct with the getwd() function.
getwd()

#store the directory where the data is in a variable so that you can reference this variable later rather than typing the directory path out again.
#you will need to adjust this for your own filepath.
file_dir_crime <-"C:/Geog586_Les2_Project/crime/"

#make sure that the path is correct and that the csv files are in the expected directory
list.files(file_dir_crime)
#this is an alternative way of checking that R can see your csv files. In this version of the command, you are asking R to list only the files
#with "csv" in their names that are in the folder located at the filepath file_dir_crime.
list.files(path = file_dir_crime, pattern = "csv")

#to view the files in the other directory
file_dir_gis <-"C:/Geog586_Les2_Project/gis/"

#make sure that the path is correct and that the shp file is in the expected directory
list.files(file_dir_gis)
# and again, the version of the command that limits the list to shapefiles only
list.files(path = file_dir_gis, pattern="shp")

#You are not creating any outputs here, but you may want to think about setting up a working directory
#where you can write outputs when and if needed. First create a new directory, in this case called outputs,
#then store its path in a variable and set the working directory to point to that directory.
#(setwd() returns the previous working directory, so store the path itself rather than the result of setwd().)
wk_dir_output <- "C:/Geog586_Les2_Project/outputs/"
setwd(wk_dir_output)

When you've finished replacing the filepaths above with the relevant ones on your own computer, check the results in the RStudio console with the list.files() command to make sure that R is seeing the files in the directories. You should see output that lists the files in each of the directories. Many problems with R code come from either a library not being loaded correctly or a filepath problem, so it is worth taking some care here.
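
The checks above can also be wrapped in a small helper that stops with a clear message when a folder appears empty or the path is mistyped. This is a sketch only: check_files() is a hypothetical helper name, not a function from base R or any package, and the demonstration uses a temporary folder so that it runs without the project data.

```r
# Hypothetical helper: list files matching a pattern and stop if none are found
check_files <- function(dir_path, pattern = NULL) {
  found <- list.files(path = dir_path, pattern = pattern)
  if (length(found) == 0) {
    stop("No matching files found in: ", dir_path,
         " (check the path and the use of '/')")
  }
  found
}

# Demonstration on a temporary folder holding one empty csv file
demo_dir <- file.path(tempdir(), "crime_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "example.csv"))

# "\\.csv$" matches only names that end in .csv
print(check_files(demo_dir, pattern = "\\.csv$"))
```

In your own script, the equivalent call would be check_files(file_dir_crime, pattern = "\\.csv$").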

Loading and viewing the data file (Two Code Options)

Add the crime data file that you need and view the contents of the file. Inspect the results to make sure that you are able to read the crime data.

#Option #1: Read in crime file by pasting the path into a variable name
#concatenate file_dir_crime (set earlier to your directory) with the filename using paste()
crime_file <-paste(file_dir_crime,"crimeStLouis20132014b.csv", sep = "")
#check that you've accessed the file correctly - next line of code should return TRUE
file.exists(crime_file)

crimesstl <- read.csv(crime_file, header=TRUE,sep=",")

#Option #2: Set the file path and read in the file into a variable name
crimesstl2 <- read.csv("C:/Geog586_Les2_Project/crime/crimeStLouis20132014b.csv", header=TRUE, sep=",")

#returns the first ten rows of the crime data
head(crimesstl, n=10)

#view the variable names
names(crimesstl)

# view data as attribute tables (spreadsheet-style data viewer) opened in a new tab. 
# Your opportunity to explore the data a bit more!
View(crimesstl) #feel free to use the 'View' function whenever you need to explore data as an attribute table
View(crimesstl2)

#create a list of the unique crime types in the data set and view them, so that you can later select records by type and explore their distributions.
listcrimetype <-unique(crimesstl$crimetype)
listcrimetype
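
Once you have the list of unique types, a natural next step is to count how many records fall under each type and to pull out one type for a closer look. The sketch below uses a small made-up data frame in place of crimesstl so that it runs without the project data. The crime type labels and the xcoord and ycoord column names are invented for illustration; only the crimetype column name comes from the code above.

```r
# Toy stand-in for crimesstl (values are invented, not from the project data)
toy_crimes <- data.frame(
  crimetype = c("typeA", "typeB", "typeA", "typeC", "typeA"),
  xcoord    = c(1, 2, 3, 4, 5),
  ycoord    = c(5, 4, 3, 2, 1)
)

# Count the records for each crime type
type_counts <- table(toy_crimes$crimetype)
print(type_counts)

# Keep only the rows for one crime type
type_a <- subset(toy_crimes, crimetype == "typeA")
print(nrow(type_a))
```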