Weather and Climate Data Sets

METEO 810: Weather and Climate Data Sets


Quick Facts about METEO 810

Anticipating weather events first requires an understanding of typical (or expected) conditions at a particular site. Such climatologies are constructed primarily from historical observations but may also include numerically derived forecasts and analyses. In this course, you will learn a variety of methods for accessing appropriate weather and climate datasets available from government and research institutions. Working with very large datasets in a computationally efficient manner will be stressed, as will consideration of factors that affect data reliability. You will be encouraged to consider numerous possibilities for presenting weather and climate data with a minimum of quantitative analysis. In addition, numerous examples and case studies will augment discussions on such topics as numerical reanalysis datasets, self-describing archives, and typical problems encountered with environmental observations. Finally, you will learn to construct a site-specific or regional climatology and to communicate a qualitative analysis of those data to others.

METEO 810 is a professional, graduate-level course offered by the Department of Meteorology and Atmospheric Science. The course is designed specifically for distance learners who are interested in learning about weather and climate data sets. Note: METEO 810 is currently listed as METEO 897 in LionPATH. The "897" course number indicates that the course had not yet been granted a permanent number at the time of its listing in LionPATH. The next offering of METEO 810 is Spring 2018.

Why is weather data important?

A recent CNBC article, “The Sexiest Job of the 21st Century: Data Analyst,” described the demand for data analytics specialists (sometimes called data scientists), who know how to manage the tsunami of information, spot patterns within it, and draw conclusions and insights, as nearing a frenzy. This is due in part to the availability of massive data sets, now accessible to companies and government organizations for the first time thanks to cheap IT storage and increasing processing power. Perhaps one of the largest sources of untapped big data is the weather. Every day, over 6 TB of observational weather data is collected by the National Centers for Environmental Prediction, which in turn produce 1.5 TB of output in the form of 15 million operational products.

Bill Pardue, CEO of Weather Analytics, estimates that “a third of U.S. commerce is sensitive to the weather” (Forbes, 2013). This has led to the growth of numerous companies (Weather Analytics, Planalytics, Weather Trends, Climate Corporation, etc.) that supply weather data, both raw data and analytics, to businesses and governmental organizations. These companies count many U.S. Fortune 500 companies among their clients.

The question therefore becomes whether education can be provided to the thousands of data analytics professionals in these companies so that they can access, analyze, and manipulate atmospheric datasets on their own. The Department of Meteorology and Atmospheric Science at Penn State believes the answer to this question is “Yes.” This program would be ideal for anyone who works with historical or forecast data in a weather-sensitive sector. Such positions might include Marketing and Sales Analysts, Statisticians, Business Intelligence Analysts, Risk Analysts, Logistics Managers, and IT professionals.

In “How to Get a Hot Job in Big Data” (InfoWorld, 2010), Michael Dsupin, CEO of tech staffing firm Talener, is quoted as saying, “Marketing and research people are becoming adept at pulling data from one system, translating it, and loading it into another system.” Our program will teach these types of individuals to add weather data streams to their existing analysis routines.

What will you learn in this course?

METEO 810 seeks to give you a better understanding of weather and climate datasets. After successfully completing this course, you will be able to:

• Identify various sources from which to collect global weather and climate data.
• Choose weather and climate data types appropriate to a desired observation or metric.
• Manipulate large datasets in order to focus on key aspects as defined by an external problem.
• Display weather and climate data in a manner that effectively communicates answers to posed questions.
• Describe both knowable and unknowable sources of error in environmental data sets (as well as suggest solutions to combat both).
• Exhibit a global perspective on the challenges and opportunities of incorporating weather and climate information into decision-making processes over a wide range of business and governmental sectors.

The lessons that make up this course are:

Lesson 1: Meteorological Data Collection Methods (time standards, remote vs. in situ data, surface observing systems, satellite observing systems, radar measurements of precipitation and large hail, upper-air observing systems, questions to consider when evaluating observational data)

Lesson 2: An Introduction to "R" (installation of R and RStudio software, introduction to vector arithmetic, basic R functions, importing data, basic R plots, histograms and box plots, contour and image plots, using custom libraries)

Lesson 3: Historical Data Sets (data portals, NCEI data sets, retrieving data with RNOAA, automated data retrieval and data availability maps, the integrated surface dataset, asking the right questions when retrieving data)

Lesson 4: Data From Numerical Models (introduction to numerical forecasting, types of numerical models, retrieving data with RNOMAD, parsing the 5-dimensional model dataset, introduction to ensembles, using the RNOMAD ensembles)

Lesson 5: Decoding NetCDF and WGRIB Formats (retrieving and processing NetCDF and WGRIB data, finding the right library, reading headers, extracting data, introduction to the NARR and other reanalysis products)

Lesson 6: Taming Unruly Data (data problems, checking for corrupted data, dealing with missing values, introduction to data transformation, introduction to smoothing techniques such as moving average, splining, and lowess, downscaling by nearest point or interpolation)

Lesson 7: Sector-Specific Data Sources (retrieving data for GIS applications, solar and wind data, Typical Meteorological Year datasets, data over the oceans, sources of foreign data)

Lesson 8: Presenting Data to Decision-Makers (asking the right questions, choosing appropriate data, display formats, presentation strategies, conveying uncertainty, handling data-related questions)

How does this course work?

As with most graduate courses, there is a considerably higher onus on you to take responsibility for your own learning. While lessons present guidance on what you need to learn, much of your actual learning will take place as you experiment with various examples presented in the text. Following through on these examples and exploring various ways to accomplish prescribed data-procurement tasks are an absolute necessity, not only to be successful on each lesson's assessment activity but to meet your own learning goals as well. We strongly recommend that all students have some experience with a programming language. In this course we will use the open-source (free) statistical programming language "R". R is fairly straightforward to learn (certainly at the level at which we will start). However, you should be familiar with the tenets (and basic skills) of computer programming. Please check out the "Pre-Enrollment" link at the top of this page for more information.