From Meteorology to Mitigation: Understanding Global Warming

Problem Set #2


Activity: Statistical Analysis of Atlantic Tropical Cyclones and Underlying Climate Influences

NOTE: For this assignment, you will need to record your work in a word processing document. Your work must be submitted in Word (.doc or .docx) or PDF (.pdf) format.

For this activity, you will perform an analysis of the relationship between Atlantic Tropical Cyclone (TC) counts and three potential climate-related predictors of TC activity, over the time interval 1870-2006.

Link to Linear Regression Tool


  1. First, save the Problem Set #2 Worksheet to your computer. You will use this word processing document to electronically record your work in the remaining steps.
    • Save the worksheet to your computer by right-clicking on the link above and selecting "Save link as..."
    • The worksheet is in Microsoft Word format. You can use either Word or Google Docs (free) to work on this assignment. You will submit your worksheet at the end of the activity, so it must be in Word (.doc or .docx) or PDF (.pdf) format so the instructor can open it.
    • Please show your work! When a question explicitly asks you to create plots, copy and paste the graphics and the screen output into the worksheet (e.g., by first printing the output to a PDF file and then inserting it directly), and submit them along with your discussion and conclusions.
  2. We will start out using the "unadjusted" Tropical Cyclone (TC) counts, i.e., counts not adjusted for potential historical undercount of storms in earlier decades. Please separately perform three simple ordinary linear regressions for TC counts (Target Observation = TC_no_adjust) using the following predictors: (A) sea surface temperature (SST) in the main development region (MDR) (Model Parameters = MDR), (B) Niño 3.4 (Model Parameters = niño), and (C) NAO (Model Parameters = nao). For each of the three regressions answer the following questions.
    1. What is the correlation coefficient? Is the relationship (i.e., the sign of the correlation coefficient) consistent with your expectations based on the earlier readings in the chapter regarding the factors that influence Atlantic TC counts?
    2. Is the correlation statistically significant?
    3. Does autocorrelation of the residuals appear to be a problem? Do you see any evidence of additional structure (e.g., trends) in the residuals? What can you conclude about the robustness of this linear regression?
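The quantities asked about in Question 2 (correlation coefficient, significance, and residual autocorrelation) can be illustrated with a minimal sketch. It does not reproduce the course's Linear Regression Tool. The data are synthetic stand-ins generated with made-up parameters, the helper names are hypothetical, and the critical value of about 1.98 (two-sided 5% level, 135 degrees of freedom) is an approximation.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lag1_autocorrelation(values):
    """Correlation of a series with itself shifted by one time step."""
    n = len(values)
    mean_v = sum(values) / n
    num = sum((values[i] - mean_v) * (values[i + 1] - mean_v) for i in range(n - 1))
    den = sum((v - mean_v) ** 2 for v in values)
    return num / den


# SYNTHETIC stand-in data (NOT the real series): made-up MDR anomalies and counts.
random.seed(0)
years = list(range(1870, 2007))  # 1870-2006 inclusive
mdr = [random.gauss(0.0, 0.3) for _ in years]
tc_no_adjust = [9.0 + 4.0 * m + random.gauss(0.0, 2.5) for m in mdr]

intercept, slope = fit_line(mdr, tc_no_adjust)
residuals = [t - (intercept + slope * m) for m, t in zip(mdr, tc_no_adjust)]

n = len(years)
r = pearson_r(mdr, tc_no_adjust)
# t statistic for testing r against zero, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
rho1 = lag1_autocorrelation(residuals)

print(f"r = {r:.3f}, t = {t_stat:.2f} (compare with roughly 1.98 at the 5% level)")
print(f"lag-1 autocorrelation of residuals = {rho1:.3f}")
```

A lag-1 autocorrelation close to zero suggests the residuals are roughly independent from year to year. A value well away from zero suggests the significance test above is too optimistic.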
  3. For each of the three simple ordinary linear regressions you performed in Question 2, answer the following:
    1. How much variation in the TC count data does the predictor variable account for? Recall that R² is a measure of the fraction of variation in the data that is captured by the predictor variable.
    2. Does the predictor capture the long-term upward trend in TC counts? Does it capture the year-to-year variability (finer up and down fluctuations from year to year) in TC counts? Does it capture multi-decadal variability (broader up and down fluctuations over several decades)?
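For reference, the R² statistic in Question 3 is commonly written as below. The symbols are assumptions introduced here for illustration: y_i is the observed TC count in year i, \hat{y}_i is the value fitted by the regression, and \bar{y} is the mean of the observed counts.

```latex
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

For an ordinary linear regression with a single predictor, R² equals the square of the correlation coefficient r.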
  4. Next please perform a multivariate regression using all three predictors simultaneously (Target Observation = TC_no_adjust; Model Parameters = MDR, niño, and nao).
    1. How much of the total variation in the data does your regression model capture?
    2. Does autocorrelation of the residuals appear to be a problem?
    3. Briefly discuss why the multivariate regression model is superior to the single-predictor regressions from Question 2.
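The multivariate regression in Question 4 can be sketched in the same spirit. This is one possible approach (solving the normal equations directly), not necessarily what the Linear Regression Tool does. The data and the coefficients used to generate them are synthetic, arbitrary illustration values, and all function names are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multivariate(predictors, y):
    """Least squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + b2*x2 + ..."""
    rows = [[1.0] + list(vals) for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(predictors, y, coeffs):
    """Fraction of the variance in y captured by the fitted model."""
    fitted = [coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], vals))
              for vals in zip(*predictors)]
    mean_y = sum(y) / len(y)
    sse = sum((t - f) ** 2 for t, f in zip(y, fitted))
    sst = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - sse / sst


# SYNTHETIC stand-in data (NOT the real series); generating coefficients are made up.
random.seed(1)
n_years = 137  # 1870-2006 inclusive
mdr = [random.gauss(0.0, 0.3) for _ in range(n_years)]
nino = [random.gauss(0.0, 0.8) for _ in range(n_years)]
nao = [random.gauss(0.0, 1.0) for _ in range(n_years)]
tc = [9.0 + 4.0 * a - 1.0 * b - 0.5 * c + random.gauss(0.0, 2.0)
      for a, b, c in zip(mdr, nino, nao)]

coeffs_multi = fit_multivariate([mdr, nino, nao], tc)
r2_multi = r_squared([mdr, nino, nao], tc, coeffs_multi)

coeffs_single = fit_multivariate([mdr], tc)
r2_single = r_squared([mdr], tc, coeffs_single)

print("coefficients (intercept, MDR, nino, nao):", [round(c, 2) for c in coeffs_multi])
print(f"R^2 with MDR only = {r2_single:.3f}; with all three = {r2_multi:.3f}")
```

Because the single-predictor model is a special case of the three-predictor model, the multivariate R² computed this way cannot be smaller than the single-predictor R². How much larger it is depends on the data.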
  5. Repeat parts 1 and 2 of Question 4 using (A) the "lightly adjusted" TC count series (Target Observation = TC_light_adjust), where a modest undercount bias is assumed prior to aircraft reconnaissance (i.e., an undercount of roughly 1 storm per year before 1944), and (B) the "heavily adjusted" TC count series (Target Observation = TC_heavy_adjust), where a more substantial undercount bias is assumed (i.e., roughly 3 storms per year before 1944). How, if at all, did your conclusions change?
  6. Using the regression model based on the "lightly adjusted" TC series, please make a prediction for the number of TCs that we should have seen during the 2019 Atlantic Hurricane season. Perform the following steps:
    1. Assume that:
      • MDR SSTs were roughly equal to the value of the MDR SSTs in 1997;
      • there was a mild El Niño event: take Niño 3.4 index to be 0.7;
      • NAO was neutral: use the long-term average of the NAO series.
      What are the values of MDR and nao (along with niño = 0.7) that you are to use in the multivariate equation?
    2. Write down the multivariate equation. Substitute the values of MDR, niño, and nao found in part 1 into the multivariate regression equation to obtain your prediction for the TC count. Round your estimate to the nearest whole number and call the result N. For this calculation, we will assume that the 95% confidence interval is given by N ± √N. Calculate this 95% confidence interval for your prediction.
    3. Compare your prediction to the one made by the Penn State statistical model before the start of the 2019 Atlantic Hurricane season. The Penn State model predicted 10.1 ± 3.2 storms, which corresponds to between 7 and 13 storms, with a best estimate of 10 named storms. Is your prediction consistent with the Penn State statistical model, within the statistical uncertainty?
    4. Compare your prediction to the actual observed total for 2019: 18 named storms. Is your prediction consistent with the observed counts, within the statistical uncertainty?
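The arithmetic in Question 6 can be sketched as follows. The regression coefficients and the MDR and NAO inputs below are HYPOTHETICAL placeholders, not results from the data; substitute the values from your own regression and data table. Only the Niño 3.4 value of 0.7, the Penn State forecast of 10.1 ± 3.2, and the observed total of 18 come from the assignment. Treating "consistent" as "the intervals overlap" (or "the observed value falls inside the interval") is one reasonable reading and is an assumption made here.

```python
import math


def predict_count(coeffs, mdr, nino, nao):
    """Evaluate TC = b0 + b_mdr*MDR + b_nino*nino + b_nao*nao."""
    b0, b_mdr, b_nino, b_nao = coeffs
    return b0 + b_mdr * mdr + b_nino * nino + b_nao * nao


def sqrt_interval(n):
    """Interval N - sqrt(N) to N + sqrt(N), as assumed in the assignment."""
    half_width = math.sqrt(n)
    return n - half_width, n + half_width


def intervals_overlap(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# HYPOTHETICAL placeholders: replace with your own regression output and inputs.
coeffs = (9.0, 6.0, -2.0, -1.0)  # intercept, MDR, nino, nao coefficients
mdr_1997 = 0.7                   # placeholder for the 1997 MDR SST value
nao_mean = 0.0                   # placeholder for the long-term NAO average
nino_2019 = 0.7                  # given in the assignment

raw_prediction = predict_count(coeffs, mdr_1997, nino_2019, nao_mean)
n = round(raw_prediction)
low, high = sqrt_interval(n)

penn_state = (10.1 - 3.2, 10.1 + 3.2)  # forecast quoted in the assignment
observed_2019 = 18                     # observed total quoted in the assignment

print(f"N = {n}, interval = [{low:.1f}, {high:.1f}]")
print("overlaps Penn State forecast:", intervals_overlap((low, high), penn_state))
print("observed total inside interval:", low <= observed_2019 <= high)
```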
  7. Save your word processing document as either a Microsoft Word or PDF file.

Submitting your work

Upload your file to the "Problem Set #2" assignment in Canvas by the due date indicated in the Syllabus.

Grading Rubric

The instructor will use the general grading rubric for problem sets to grade this activity.