One of the first steps in any geospatial project is finding data and metadata related to your topic and study area. I like to think of this phase as detective work. You often need to search for detailed clues in many different places before you can understand the bigger picture. For example, the same data set can often be obtained from multiple agencies, in multiple formats, and in multiple geographic packages (e.g., grouped by state or county vs. seamless).
You may need to consult several different sources to find all of the information you need to use the data, such as date, scale, description of coded values, etc. You may also use different sources to pre-screen and download the data. These websites are often hyperlinked to each other, so you may bounce back and forth a few times before landing in the right spot. You may find that some interfaces and data products are much easier to work with than others. We will experiment with a few different data providers to demonstrate this concept. The keys to success are budgeting ample time, keeping detailed notes along the way, and asking the right questions before you begin your search.
The best place to start looking for geospatial data is on the web. There has been a push to democratize environmental and climate-related data, and we will take full advantage of that initiative. I have listed a few different types of websites, typical data you will find on them, and links to some example sites below. This is not meant to be an exhaustive list, but rather an overview to get you pointed in the right direction.
Most websites provide links to download raw GIS and geospatial data that you can input into spatial analyses. Shapefiles, geodatabases, GeoJSON, and rasters are typically available for download in one or more of the following options:
GIS and geospatial files from Options 2 and 3 are typically aggregated by one or more geographic units such as counties, 7.5-minute topographic quadrangles (topo quads), or watersheds. You may need to download multiple files to cover your entire study area, and then merge them into a single data set using ArcGIS. The higher-quality sites typically offer interactive maps where you can browse available GIS and geospatial data and metadata.
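To make the merge step concrete, here is a minimal sketch in plain Python rather than ArcGIS. It assumes the downloaded tiles are GeoJSON FeatureCollections that share one coordinate system; the tile names, the sample features, and the `source_tile` property are all hypothetical and serve only as illustration.

```python
import json


def merge_feature_collections(collections):
    """Combine several GeoJSON FeatureCollections into one.

    `collections` maps a tile name to its FeatureCollection. Each feature
    is tagged with the tile it came from, so provenance survives the merge.
    """
    merged = {"type": "FeatureCollection", "features": []}
    for tile_name, collection in collections.items():
        if collection.get("type") != "FeatureCollection":
            raise ValueError(f"{tile_name} is not a FeatureCollection")
        for feature in collection["features"]:
            # Copy the feature so the input dictionaries are left unchanged.
            tagged = dict(feature)
            tagged["properties"] = dict(feature.get("properties") or {})
            tagged["properties"]["source_tile"] = tile_name
            merged["features"].append(tagged)
    return merged


# Two hypothetical county downloads, each holding a single point feature.
county_a = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [-77.86, 40.79]},
        "properties": {"name": "Site 1"},
    }],
}
county_b = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [-76.88, 40.27]},
        "properties": {"name": "Site 2"},
    }],
}

merged = merge_feature_collections({"county_a": county_a, "county_b": county_b})
print(json.dumps(merged, indent=2))
```

Recording the source tile on every feature is a design choice, not a requirement. It keeps track of where each record came from, which matters later when the tiles turn out to have different dates or scales.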
Several years ago, finding information in a readable format was one of the most challenging parts of geospatial work. This is no longer the case, as most government data sets have been converted into GIS and geospatial formats accessible on the Internet. Typically, government data is available in at least two different formats: raw geospatial files (e.g., shapefiles, geodatabases, rasters) and online data services. You are likely familiar with working with raw GIS data within ArcGIS Pro or using online data services such as the ArcGIS Living Atlas.
Online data services are geospatial layers that you can connect to via the Internet. One of the major benefits of online data services is that they contain seamless versions of data. Seamless data sets combine individual data sets from different locations, scales, and time periods into one data set. This lets you view and interact with hundreds to thousands of individual data sets simultaneously. For example, you may have worked with paper versions of topographic maps in the past. Each paper map shows only a finite area (e.g., 7.5 minutes) at one scale (e.g., 1:24,000). If you wanted to view a larger area or a different scale (1:100,000 or 1:250,000), you would need to gather many different paper maps. Using a seamless map service, you only need one data product to access the information from all of these paper maps at the same time. As you zoom to different scales, the underlying data source changes automatically. For example, if you zoom out to view an entire state, the map will display scans of the 1:250,000 maps. As you zoom in closer, the images will be replaced by more and more detailed data sets (1:100,000, then 1:24,000).
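The scale-dependent switching described above can be sketched as a simple lookup. The three map series come from the example in the text, but the break points between them are illustrative assumptions, not values taken from any real map service.

```python
# (largest view-scale denominator served, map series), most detailed first.
# The break points below are assumed for illustration only.
SCALE_LEVELS = [
    (50_000, "1:24,000"),
    (175_000, "1:100,000"),
    (float("inf"), "1:250,000"),
]


def source_for_scale(view_scale_denominator):
    """Return the map series a seamless service might display.

    `view_scale_denominator` is the N in a 1:N view scale, so larger
    numbers mean the user has zoomed farther out.
    """
    if view_scale_denominator <= 0:
        raise ValueError("scale denominator must be positive")
    for max_denominator, series in SCALE_LEVELS:
        if view_scale_denominator <= max_denominator:
            return series


for denominator in (10_000, 100_000, 2_000_000):
    print(f"1:{denominator:,} view -> {source_for_scale(denominator)} source maps")
```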
While seamless data sets can be extremely valuable, they also have their drawbacks. For example, many seamless data sets were created by digitally stitching together multiple adjacent data layers that were created at different time periods. Mosaicking them together into one data set gives the impression that the metadata of the underlying data sets are uniform when they are not. You must be careful using seamless data sets if time is an important variable in your analysis. This is a concern only when the underlying data were not collected continuously, as satellite data are. Examples of continuous data include digital elevation models and products derived from remote sensing sources such as the National Land Cover Database (NLCD).
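One way to guard against this problem is to list the source date of every tile behind a mosaic before using it. The sketch below assumes you have already collected a name and a year for each tile from its metadata; the tile names and years shown are made up.

```python
from collections import defaultdict


def group_tiles_by_year(tiles):
    """Map each source year to the names of the tiles created in that year."""
    by_year = defaultdict(list)
    for tile in tiles:
        by_year[tile["year"]].append(tile["name"])
    return dict(by_year)


def is_temporally_uniform(tiles):
    """Return True only if every tile shares the same source year."""
    return len({tile["year"] for tile in tiles}) <= 1


# Hypothetical tiles behind one seamless layer.
tiles = [
    {"name": "quad_A", "year": 1998},
    {"name": "quad_B", "year": 2011},
    {"name": "quad_C", "year": 1998},
]

years = group_tiles_by_year(tiles)
uniform = is_temporally_uniform(tiles)

print(years)
if not uniform:
    print("Source dates differ; treat time-sensitive results with caution.")
```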
You can view online data services in a variety of ways. For example, you can use viewers embedded in an organization's website, view them on ArcGIS.com, or add them directly to your layout in ArcGIS Pro. Interactive mapping websites allow you to view and interact with online data services using any Internet browser. Sites will usually include a map viewer, a legend, tools to interact with your data such as zoom and identify, and tools to download subsets of data directly from the interactive map. Interactive maps allow you to customize what is displayed on the map by turning available layers on and off in the legend. They may also enable you to view the underlying attributes of each data source.
You will find that the quality and user-friendliness of online interactive map viewers vary dramatically depending on the organization and software used to create them. For example, on some websites, the identify tool only allows you to identify features within one layer at a time. You have to specify which layer is “active” in the legend to view its attributes. On other sites, you must manually refresh the map by clicking on a button every time you turn layers on and off.
Adding online data services directly to your ArcGIS session gives you many of the benefits of interactive mapping websites while providing much more flexibility to customize your map. However, depending on the type of service, your options for controlling how the data are displayed may be limited. For example, you may be unable to change certain aspects of the symbology or to use the layers as input to geoprocessing tools such as the Clip tool. Services often have scale-dependent rendering settings that you may be unable to alter. Aside from these limitations, there are many benefits to using online data services. They can save a lot of time since you don’t have to download each data set individually and set the symbology for each one. This could trim a few days from your work schedule if you use many complex data sets.
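When a service's display limitations get in the way, some services let you request the underlying features for just your study area instead. The sketch below builds such a request, assuming the layer is an ArcGIS feature service with its REST `query` operation enabled and GeoJSON output supported. The layer URL and bounding box are hypothetical, and the code only constructs the URL without contacting any server.

```python
from urllib.parse import urlencode


def build_query_url(layer_url, bbox, where="1=1", out_fields="*"):
    """Build a query URL for features that intersect a bounding box.

    `bbox` is (xmin, ymin, xmax, ymax) in longitude/latitude (WGS 84).
    """
    xmin, ymin, xmax, ymax = bbox
    params = {
        "where": where,                    # attribute filter; 1=1 keeps all rows
        "outFields": out_fields,           # which attribute fields to return
        "geometry": f"{xmin},{ymin},{xmax},{ymax}",
        "geometryType": "esriGeometryEnvelope",
        "inSR": 4326,                      # bbox coordinates are WGS 84
        "spatialRel": "esriSpatialRelIntersects",
        "f": "geojson",                    # response format
    }
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


# Hypothetical layer URL, used for illustration only.
LAYER_URL = "https://example.com/arcgis/rest/services/Hydro/FeatureServer/0"
url = build_query_url(LAYER_URL, (-78.0, 40.5, -77.5, 41.0))
print(url)
```

Not every service supports this. Tiled and image-based services return pre-rendered pictures rather than features, so check the service's description page before relying on a query.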
Interactive mapping websites are a great way to get to know your study area and check the availability of several data sets simultaneously, but they may lack tools for robust spatial analysis. Connecting to map services or the ArcGIS Living Atlas within ArcGIS is an easy way to create base maps, combine data from multiple sources, or integrate your own data layers with publicly available data. Since the data come pre-symbolized, you can save a lot of time setting up your map. Working with raw data gives you the most flexibility for interacting with your data within ArcGIS. However, there is typically a steep learning curve in figuring out which attributes to use for symbolizing your map and for your analysis. This can become a very time-intensive exercise. It is best to download only the data sets that you need to modify or input into an analysis project and rely on online data services for the remaining data.
Once you have located and acquired your data, your job is only just beginning. Your input data will likely come from several different sources, have a variety of data formats and extents, cover a range of time periods, and include many different attributes. You need to be aware of these properties before you start to work with your data. A lot of this information is not immediately obvious just by looking at the files. You will need to locate metadata documents to figure out many of the details. You will find that the quality of the metadata needed to understand and work with data varies depending on the source. Oftentimes, official FGDC metadata files are not packaged with the data. It is also possible that the metadata will be packaged with the data but not in a format recognized by ArcGIS (e.g., PDF or Word document). This means you won’t be able to view the metadata in ArcGIS. If metadata files are not packaged with the raw data, you can usually find the information you need somewhere on the source website, by doing a general Internet search, or by contacting the agency or organization that created the data. You may need to visit several different websites to find all of the information you need to answer all of the questions below. Sometimes, one of the most time-consuming parts of an analysis project is figuring out what different fields and attribute values mean (e.g., coded or abbreviated values).
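Once the metadata tells you what the coded values mean, a lookup table makes them readable. The sketch below uses a small subset of NLCD land cover class codes as the example. The `decode` helper and the sample values are hypothetical, and any code missing from the table is flagged rather than guessed.

```python
# A small subset of NLCD land cover class codes, as listed in the NLCD legend.
NLCD_CLASSES = {
    11: "Open Water",
    21: "Developed, Open Space",
    41: "Deciduous Forest",
    42: "Evergreen Forest",
    81: "Pasture/Hay",
    82: "Cultivated Crops",
}


def decode(values, lookup, unknown="UNKNOWN CODE"):
    """Translate coded attribute values into readable labels.

    Codes missing from `lookup` are replaced with `unknown` so they stand
    out and can be checked against the metadata.
    """
    return [lookup.get(value, unknown) for value in values]


# Hypothetical attribute values read from a data set.
raw_values = [11, 41, 82, 99]
labels = decode(raw_values, NLCD_CLASSES)

for code, label in zip(raw_values, labels):
    print(code, "->", label)
```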
Links
[1] https://www.usgs.gov/core-science-systems/national-geospatial-program
[2] http://datagateway.nrcs.usda.gov/
[3] http://www.epa.gov/geospatial/
[4] https://earthexplorer.usgs.gov/
[5] https://www.usgs.gov/products/data
[6] https://www.usgs.gov/national-hydrography/national-hydrography-dataset
[7] https://waterdata.usgs.gov/usa/nwis/nwis
[8] http://dwtkns.com/srtm30m/
[9] https://glovis.usgs.gov/
[10] https://neo.gsfc.nasa.gov/
[11] https://sedac.ciesin.columbia.edu/
[12] http://www.pasda.psu.edu/
[13] http://www.glo.texas.gov/land/land-management/gis/
[14] https://www.opendataphilly.org/
[15] https://egis-lacounty.hub.arcgis.com/
[16] https://guides.libraries.psu.edu/c.php?g=376207&p=5296031
[17] http://guides.libraries.psu.edu/c.php?g=376207&p=5296082
[18] https://www.colorado.edu/libraries/libraries/earth-sciences-map-library/map-library-collection
[19] https://opentopography.org/
[20] https://www.indexdatabase.de/
[21] https://asf.alaska.edu/
[22] https://scihub.copernicus.eu/
[23] https://data.humdata.org/
[24] https://www.un.org/geospatial/mapsgeo
[25] https://ladsweb.modaps.eosdis.nasa.gov/view-data/
[26] https://guides.libraries.psu.edu/c.php?g=376207&p=5296088
[27] http://DIVA-GIS Free Spatial Data
[28] https://www.nrsc.gov.in/EOP_irsdata_Objective_New
[29] https://earth.esa.int/eogateway/catalog
[30] https://gis.ducks.org/
[31] http://www.naturalearthdata.com/
[32] https://audubon.maps.arcgis.com/home/index.html
[33] https://geospatial.tnc.org/
[34] http://geospatial.tnc.org/pages/data
[35] https://livingatlas.arcgis.com
[36] https://opendata.arcgis.com/about
[37] https://nwt.lternet.edu/other-niwot-datasets
[38] http://www.datacommons.psu.edu/default.html
[39] https://www.chesapeakeconservancy.org/conservation-innovation-center/high-resolution-data/lulc-data-project-2022/