METEO 810
Weather and Climate Data Sets

Lesson 3 Activity Hints and Tricks

I know that this activity might be a bit rough for folks who are new to coding, so I am using this page to provide some tips and tricks to help you get started. They are listed by the question that you are trying to answer. As more questions come in, I hope to add to this page.

General Thoughts

  • Be careful if you open, edit, and save your data file in Excel. Excel changes the format of the date column. You can still read the file, but you will need to use a different format string. You can use the following commands to format dates properly in R (you can google "R strptime" or "R as.Date" for more information).
    # Format a column in dataframe "mydata" to be dates
    # You have to tell R the format of the dates in the file
    mydata$Date<-strptime(mydata$Date, format="%Y-%m-%d")
    
    # but if you edited the file in Excel, you might need this instead
    # (%y is a two-digit year; use %Y if the file shows four-digit years):
    mydata$Date<-strptime(mydata$Date, format="%m/%d/%y")
    
  • Make sure that you don't try to do too much in each line of code. If you keep each step separate, you can better figure out what's happening when things go awry.
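
As a rough illustration of both points above, here is a minimal sketch that parses dates one step at a time and then checks the result. The helper parse_dates and the toy vectors are hypothetical (they are not part of the activity files), and I am assuming that Excel wrote four-digit years; look at your own file before picking a format. Note that with %y, R reads two-digit years 69 to 99 as 19xx and 00 to 68 as 20xx.

```r
# Toy stand-ins for the Date column before and after an Excel save
raw_iso   <- c("1976-01-01", "1976-01-02", "1976-12-31")
raw_excel <- c("1/1/1976", "1/2/1976", "12/31/1976")

# Hypothetical helper: try the original format first, then the Excel one
parse_dates <- function(x) {
  # Step 1: anything that does not match this format becomes NA
  parsed <- as.Date(x, format = "%Y-%m-%d")
  # Step 2: find the entries that failed
  failed <- is.na(parsed)
  # Step 3: retry only those entries with the Excel-style format
  parsed[failed] <- as.Date(x[failed], format = "%m/%d/%Y")
  parsed
}

dates_iso   <- parse_dates(raw_iso)
dates_excel <- parse_dates(raw_excel)

# Always check how many dates could not be parsed at all
sum(is.na(dates_iso))
sum(is.na(dates_excel))
```

If either count is not zero, print a few of the raw values to see what format they are really in.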

Question 1: Normal Climate

  • It's best to first extract the year's worth of Normals that you are going to plot. For example:
    # Get a year's worth of dates
    dates1976<-mydata$Date[(mydata$Date >= "1976-01-01" & mydata$Date < "1977-01-01" )]
    
    # Get a year's worth of Max Temperature Normals
    maxtemps1976<-mydata$MaxTemperatureNormal[(mydata$Date >= "1976-01-01" & mydata$Date < "1977-01-01" )]
    
    #etc
    
    Then, make your plots using these new variables.
  • For Normal precipitation, you might want to create a cumulative precipitation graph (which shows average accumulation over an entire year). You can use the function cumsum(...) to perform this task.
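
The cumulative precipitation idea can be sketched as follows. The data frame here is made up, and PrecipNormal is a hypothetical column name, so substitute whatever your file actually calls the precipitation Normal. The sketch also builds the year selection once and reuses it, which saves retyping the date test for every column.

```r
# Toy data frame; PrecipNormal is a hypothetical column name
mydata <- data.frame(
  Date = seq(as.Date("1975-12-30"), as.Date("1977-01-02"), by = "day")
)
mydata$PrecipNormal <- 0.1   # pretend every day has the same Normal

# Build the year selection once, then reuse it for each column
in1976 <- mydata$Date >= as.Date("1976-01-01") &
          mydata$Date <  as.Date("1977-01-01")
dates1976  <- mydata$Date[in1976]
precip1976 <- mydata$PrecipNormal[in1976]

# Running total of Normal precipitation through the year
cumprecip1976 <- cumsum(precip1976)

# Then plot it, for example:
# plot(dates1976, cumprecip1976, type = "l",
#      xlab = "Date", ylab = "Cumulative Normal precipitation")
```

If your precipitation column contains missing values, cumsum(...) will return NA from that point on, so check for NAs first.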

Question 2: How Normal is Normal

  • You can easily compute the deviation of the mean daily temperature from normal for a given month by using the following...
    JanmeanTempdiffs<-mydata$AvgTemperature[(strftime(mydata$Date,"%m"))=="01"]-mydata$AvgTemperatureNormal[(strftime(mydata$Date,"%m"))=="01"]
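
    Here is the same calculation as a small self-contained sketch, so you can see what the pieces do. The numbers are invented, and the 5-degree threshold is only an example of one way to ask "how normal is Normal"; it is not a required part of the answer.

```r
# Toy data with the same column names used above
mydata <- data.frame(
  Date = as.Date(c("1976-01-01", "1976-01-02", "1976-02-01", "1977-01-01")),
  AvgTemperature       = c(30, 25, 40, 20),
  AvgTemperatureNormal = c(28, 28, 31, 28)
)

# TRUE for every row that falls in January (of any year)
isJan <- format(mydata$Date, "%m") == "01"

# Observed minus Normal, for January rows only
JanmeanTempdiffs <- mydata$AvgTemperature[isJan] -
                    mydata$AvgTemperatureNormal[isJan]

# A few ways to summarize the departures
mean(JanmeanTempdiffs, na.rm = TRUE)            # average departure
sd(JanmeanTempdiffs, na.rm = TRUE)              # spread of departures
mean(abs(JanmeanTempdiffs) <= 5, na.rm = TRUE)  # fraction within 5 degrees
```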
    

Question 3: Cold Winter?

  • Building on what you did in Question 2, there are lots of ways to go about this. What if we created a variable of daily temperature differences for every January day (like we did above) and then averaged them by year? Start with a variable of differences and the dates on which they occur...
    JanmeanTempdiffs<-mydata$AvgTemperature[(strftime(mydata$Date,"%m"))=="01"]-mydata$AvgTemperatureNormal[(strftime(mydata$Date,"%m"))=="01"]
    
    Jan_dates<-mydata$Date[which(strftime(mydata$Date,"%m")=="01")]
    
    Now, to average these by year, we need to use the aggregate(...) function. This function takes one column of data and applies a function to it within groups (specified by the "by" argument, which must be a list). In the case below, I create a list of years by which to do the averaging.
    # use the aggregate function to get a mean January 
    # departure by year.
    meanTempdiffs<-aggregate(JanmeanTempdiffs, 
                             by=list(strftime(Jan_dates,"%Y")), 
                             mean, na.rm=TRUE)
    colnames(meanTempdiffs) <- c("Year", "MeanDiff")
    
    Finally, we can sort the result by coldest January.
    coldest_Jans<-meanTempdiffs[order(meanTempdiffs$MeanDiff),]
    
  • I'm not saying this is the best way to do this. What defines a "cold winter"? There may be several answers.
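
To see what aggregate(...) and order(...) are doing, here is the same sequence run on a handful of made-up values (three Januaries with two days each). Only the numbers are invented; the function calls are the ones used above.

```r
# Toy January dates and departures (one value is missing on purpose)
Jan_dates <- as.Date(c("1976-01-10", "1976-01-20",
                       "1977-01-10", "1977-01-20",
                       "1978-01-10", "1978-01-20"))
JanmeanTempdiffs <- c(2, 4, -10, -6, 1, NA)

# Average the departures within each year; na.rm=TRUE is passed on to mean
meanTempdiffs <- aggregate(JanmeanTempdiffs,
                           by = list(format(Jan_dates, "%Y")),
                           mean, na.rm = TRUE)
colnames(meanTempdiffs) <- c("Year", "MeanDiff")

# Sort so that the most negative (coldest) January comes first
coldest_Jans <- meanTempdiffs[order(meanTempdiffs$MeanDiff), ]
print(coldest_Jans)
```

In this toy case 1977 ends up in the first row, because its average departure is the most negative.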

Question 4: Heat Wave

  • Much like the Cold Winter question, there are several ways to approach this. There are, however, some interesting ways of visualizing it. A histogram? An image plot, perhaps?
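
If you want to try the histogram idea, a minimal sketch might look like the following. The departures are invented and the 2-degree bins are an arbitrary choice; this is just one way to look at the data, not the expected answer.

```r
# Toy daily departures from Normal (observed minus Normal)
summer_diffs <- c(-3, -1, 0, 1, 2, 6, 7, -2, 0, 1)

# Bin the departures in 2-degree steps; plot=FALSE returns the
# counts without drawing, so you can inspect them first
h <- hist(summer_diffs, breaks = seq(-4, 8, by = 2), plot = FALSE)
h$counts

# Drop plot=FALSE to draw the histogram itself:
# hist(summer_diffs, breaks = seq(-4, 8, by = 2),
#      xlab = "Departure from Normal", main = "Toy example")
```

Days far out in the right-hand tail are the candidates for a heat wave. How many of them in a row you require is up to your definition.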

Question 5: Windy City

  • For reading in the files, you might need to add the parameter skip=2 to read.csv(...). This will skip the first two header rows of the file (you can then name the columns you want to keep yourself). I also suggest using the parameter colClasses = "character", so that you can do the data conversions yourself, which gives you more control over what happens to each column.
  • The function windRose is picky... You will need your date and time combined into a single data.frame column called "date". You will also need to make sure that the wind columns are properly labeled (by default windRose looks for "wd" and "ws"), converted to numeric, and have any missing data (999) converted to NA. Here's the algorithm...
    # read the data
    
    # keep the date, time, direction, and speed columns
    
    # create a single date column. Here's an example
    # I labeled my unformatted date/time accordingly
    wind_data$date<-as.POSIXct(strptime(paste(wind_data$date_uf,wind_data$time_uf), format = "%Y%m%d %H%M", tz="UTC"))
    
    # convert wd and ws columns to numeric and set 999 to NA
    
    # plot the wind rose
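
To make the steps above concrete, here is a minimal sketch that runs them on three made-up rows. The file layout (two header rows followed by date, time, direction, and speed) and the names date_uf and time_uf are assumptions for illustration, so adjust them to match your actual file. I am also assuming that windRose comes from the openair package, so that call is left as a comment.

```r
# Toy file contents standing in for the real file (layout is assumed)
raw_text <- "header row 1
header row 2
20200101,0000,270,5.1
20200101,0100,999,999
20200101,0200,180,3.6"

# read the data: skip the two header rows and keep everything as text
wind_data <- read.csv(text = raw_text, skip = 2, header = FALSE,
                      colClasses = "character",
                      col.names = c("date_uf", "time_uf", "wd", "ws"))

# create a single date column from the unformatted date and time
wind_data$date <- as.POSIXct(strptime(paste(wind_data$date_uf, wind_data$time_uf),
                                      format = "%Y%m%d %H%M", tz = "UTC"))

# convert wd and ws columns to numeric
wind_data$wd <- as.numeric(wind_data$wd)
wind_data$ws <- as.numeric(wind_data$ws)

# set the missing-data flag (999) to NA
wind_data$wd[which(wind_data$wd == 999)] <- NA
wind_data$ws[which(wind_data$ws == 999)] <- NA

# plot the wind rose (requires the openair package to be installed)
# openair::windRose(wind_data)
```

Reading the columns as text is what keeps a time like "0000" from being turned into the number 0, which would break the date format.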