You will use the knowledge and scripts you have gained from this lesson to assess the feasibility of using reanalysis climatology in data-sparse regions.
Let's get started.
In this lesson, we practiced retrieving data from NetCDF and GRIB files by looking at the North American Regional Reanalysis (NARR) data set. One might question the need for such a data set as a historical reference over observation-rich regions. However, over the oceans, NARR data may be the only source of climatological data available. But how accurate are the data? In this assignment, we will examine a host of observations from ocean buoys and then compare those observations with NARR data. Our goal is not necessarily to make fine-scale comparisons, but rather to determine on what scales the NARR data are an acceptable proxy for climatologies over the oceans.
Please use R to answer the questions posed below. In some cases, you will need to paste the output from R, and in other cases, you will need to provide a written analysis along with a graphic or two.
# check to see if rnoaa has been installed

# add the library and NOAA key

# Get the ISD data

# Grab the following columns
# Note: your station might not have all of these
isd_trimmed <- isd_data[c("date", "time",
                          "wind_direction", "wind_direction_quality",
                          "wind_speed", "wind_code", "wind_speed_quality",
                          "temperature", "temperature_quality",
                          "SA1_temp", "SA1_quality")]

# Throw out all values where the quality flag is not "1"
# You might find that "9" is acceptable for SA1_temp
# You can check your data first with: table(isd_trimmed$temperature_quality)
# This shows you how many of each temperature quality flags you have
isd_trimmed$temperature[isd_trimmed$temperature_quality != 1] <- NA
isd_trimmed$SA1_temp[isd_trimmed$SA1_quality != 9] <- NA

# Fix the wind values so that calm winds have speed=0
# and direction=NA

# Fix the date/time code

# Change the coded values to actual values

# make a new data frame for the windRose plot (remember it's picky)

# Put your plotting code here
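The quality-flag step in the template above can be tried out on a small made-up data frame before running it on real station data. The sketch below is only an illustration: `toy_isd` and its values are hypothetical stand-ins for `isd_trimmed`, and the flags are stored as character strings on the assumption that the real columns are too.

```r
# Toy data frame standing in for isd_trimmed (values are made up for illustration)
toy_isd <- data.frame(
  temperature         = c(12.1, 12.4, 99.9, 12.9),
  temperature_quality = c("1", "1", "9", "5"),
  stringsAsFactors    = FALSE
)

# Count how many records carry each quality flag
flag_counts <- table(toy_isd$temperature_quality)
print(flag_counts)

# Keep only values whose flag is "1"; every other value becomes NA
toy_isd$temperature[toy_isd$temperature_quality != "1"] <- NA
print(toy_isd)
```

Checking the `table(...)` output first shows how much data a given flag choice would discard, which is useful when deciding whether a flag such as "9" should be kept for a particular column.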
# load ncdf4 if we need to

# include the library

# open the two temperature CDF files

# set the location of our station

# determine the dimensions of the data set?

# Get the properly formatted date strings

# obtain the lat/lon grids

# find the grid index of closest point

# create the temperature arrays... Notice the different dim parameter
temp2m_out = array(data = NA, dim = c(1, 1, data_dims[3]))
tempsfc_out = array(data = NA, dim = c(1, 1, data_dims[3]))

# load the 2m temperature array and convert it to C

# load the sfc temperature array and convert it to C

# Make a new dataframe to hold the data

# Put any plotting codes here
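One possible way to handle the "find the grid index of closest point" step is sketched below on a tiny made-up grid. Everything here is an assumption for illustration: `lon_grid`, `lat_grid`, and the station coordinates are hypothetical, and the distance is a simple squared difference in degrees rather than a true great-circle distance. With real data, also check that the station longitude uses the same sign convention as the longitudes stored in the file.

```r
# Toy 2-D longitude/latitude grids (4 x 3); real grids would be read from the file
lon_grid <- matrix(rep(c(-75, -74, -73, -72), times = 3), nrow = 4, ncol = 3)
lat_grid <- matrix(rep(c(34, 35, 36), each = 4), nrow = 4, ncol = 3)

# Hypothetical station location
station_lat <- 35.2
station_lon <- -72.9

# Squared distance in degrees; adequate for picking the nearest node on a small grid
dist2 <- (lat_grid - station_lat)^2 + (lon_grid - station_lon)^2

# Row/column index of the smallest distance (first match if there are ties)
closest_point <- which(dist2 == min(dist2), arr.ind = TRUE)[1, ]
print(closest_point)
```

The resulting two-element index has the same shape as the `closest_point[1]` and `closest_point[2]` values that the `start` argument of `ncvar_get` expects in the templates.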
# load ncdf4 if we need to

# include the library

# open the two wind CDF files

# set the location of our station

# determine the dimensions of the data set?

# Get the properly formatted date strings

# obtain the lat/lon grids

# Find the grid index of closest point

# Create the wind arrays

# load the u-wind array
uwind_out[,,] = ncvar_get(nc1, varid = 'uwnd',
                          start = c(closest_point[1], closest_point[2], 1),
                          count = c(1, 1, data_dims[3]))

# get the v-wind array
vwind_out[,,] = ncvar_get(nc2, varid = 'vwnd',
                          start = c(closest_point[1], closest_point[2], 1),
                          count = c(1, 1, data_dims[3]))

# Calculate speed and direction from U and V components
NARR_windsp <- sqrt(uwind_out[1,1,]^2 + vwind_out[1,1,]^2)
NARR_winddir <- (270 - (atan2(vwind_out[1,1,], uwind_out[1,1,]) * (180/pi))) %% 360

# Create a new dataframe with the final data
NARR_wind_data <- data.frame(ncdates, NARR_winddir, NARR_windsp)
colnames(NARR_wind_data) <- c("timecode", "wd", "ws")

# Put the windRose(...) plotting code here
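To see what the speed and direction formulas in the wind template do, it can help to run them on a few hand-picked components. The sketch below uses made-up `u` and `v` values (not NARR data) and the same expressions as the template; the result is a meteorological direction, meaning the direction the wind blows from.

```r
# Toy u (eastward) and v (northward) wind components in m/s
u <- c(0, -5, 0, 5)
v <- c(-5, 0, 5, 0)

# Wind speed is the magnitude of the (u, v) vector
ws <- sqrt(u^2 + v^2)

# Direction the wind blows FROM, in degrees clockwise from north
wd <- (270 - atan2(v, u) * (180 / pi)) %% 360

print(data.frame(u, v, ws, wd = round(wd, 1)))
```

The four cases correspond to winds from the north, east, south, and west. Because of floating-point rounding, a northerly wind may print as a value very close to either 0 or 360, which represent the same direction.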
R has a function, merge(...), that does exactly this. I have provided a few more lines of code below that will help you merge your data together. Show a scatter plot of all_data$T2m vs all_data$temperature, or all_data$SST vs all_data$SA1_temp. What do these graphs tell you about the comparability of the model data versus the observations? Plot the error between observation and model. Is a histogram a good way to do this? How about the error as a function of model temperature (maybe a box plot)?
# First let's merge all of the NARR data together
all_data <- merge(NARR_temp_data, NARR_wind_data, by="timecode", all=FALSE)

# Now merge in the ISD data
all_data <- merge(all_data, isd_trimmed, by="timecode", all=FALSE)
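The effect of `all=FALSE` in the merge calls is easiest to see on a small example. The sketch below uses made-up frames (`toy_model` and `toy_obs` are hypothetical, not part of the assignment data) and then computes an error column, here defined as observation minus model, which is one of several reasonable sign conventions.

```r
# Toy model and observation frames sharing a POSIXct "timecode" column
times <- as.POSIXct("2010-01-01 00:00", tz = "UTC") + 3600 * 3 * (0:3)
toy_model <- data.frame(timecode = times, T2m = c(10.0, 11.0, 12.0, 13.0))
toy_obs   <- data.frame(timecode = times[2:4], temperature = c(10.5, NA, 13.4))

# all = FALSE keeps only timecodes present in both frames (an inner join)
toy_all <- merge(toy_model, toy_obs, by = "timecode", all = FALSE)

# Error defined here as observation minus model; NA observations give NA errors
toy_all$error <- toy_all$temperature - toy_all$T2m
print(toy_all)
```

From a column like this, `hist(toy_all$error)` shows the overall error distribution, while something like `boxplot(error ~ cut(T2m, breaks = 3), data = toy_all)` groups the error by model temperature range. The number of breaks is an arbitrary choice for illustration.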
Daily means can be computed with the aggregate(...) function. Use the code below as a template to compare the daily means for air temperature and sea-surface temperature. What do you observe? Is one measurement more in agreement than the other? If so, speculate on why that might be.
# sample aggregate function.
daily_temp_means <- aggregate(cbind(T2m, temperature) ~ format(all_data$timecode, "%m-%d"),
                              data=all_data, FUN=mean, na.rm=TRUE)

# Note how this function works...
# cbind(... lists the columns to compute
# ~format(... gets a list of month-date (this is the "aggregate by" variable)
# data=... this is the data frame to aggregate
# FUN=... the function to apply (you can also try "max" or "min")
# na.rm... removes any NA's from the aggregating process
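One detail of the formula form of `aggregate` is worth seeing on a small example: by default it uses `na.action = na.omit`, so any row with an NA in either column is dropped before the means are computed. The sketch below uses a made-up data frame (`toy_all` and its `day` column are hypothetical) to compare the default with `na.action = na.pass`, which keeps those rows and leaves the NA handling to `na.rm = TRUE`.

```r
# Toy merged data: two days, two times per day, one missing observation
times <- as.POSIXct(c("2010-01-01 00:00", "2010-01-01 12:00",
                      "2010-01-02 00:00", "2010-01-02 12:00"), tz = "UTC")
toy_all <- data.frame(timecode    = times,
                      T2m         = c(10, 12, 14, 16),
                      temperature = c(11, 13, NA, 15))

# The "aggregate by" variable: month-day strings
toy_all$day <- format(toy_all$timecode, "%m-%d")

# Default: rows containing any NA are removed first, so the model mean for
# the second day is based only on the time that also has an observation
means_omit <- aggregate(cbind(T2m, temperature) ~ day, data = toy_all,
                        FUN = mean, na.rm = TRUE)

# na.pass keeps every row; na.rm = TRUE then skips NAs column by column
means_pass <- aggregate(cbind(T2m, temperature) ~ day, data = toy_all,
                        FUN = mean, na.rm = TRUE, na.action = na.pass)

print(means_omit)
print(means_pass)
```

For a model-versus-observation comparison, the default behaviour is often the one wanted, because both means are then built from exactly the same set of times.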
Create a daily_windsp_means dataset. Make a scatter plot comparing model and observed daily mean wind speeds and comment on what you observe. Can you think of a reason for the difference (hint: consider where the model is computing the wind speed versus where it is observed)? How might you correct for this difference?