The word metadata, taken literally, means data about data. Although the term metadata is relatively new, people who work with information have collected and used metadata for quite some time. One example of a non-digital form of metadata is a library card catalogue that documents particular attributes (e.g., the title, year of publication, and author) of each publication contained in the library. Today, metadata are usually found in digital form, and markup languages such as XML (Extensible Markup Language) are commonly used to encode metadata (i.e., the language can be used to describe a particular dataset).
Within geographic information science, there are four important reasons to create and use metadata (Longley et al. 2001):
- Organize content for searching for data in an archive
- Aid users in deciding if the data characteristics are appropriate for their intended use of the data
- Effectively handle data (i.e., import data into a particular software program, transform data from one format into another or combine multiple data sets)
- Describe the dataset’s contents
Metadata often contain information that helps users answer questions about a dataset such as:
- How were the data compiled and created?
- For what purpose were the data created?
- When were the data created?
- What is the spatial and/or temporal extent of the data?
- Who collected the data?
- How accurate are the data and what level of uncertainty is associated with the data?
- Are there limitations on access to the data (e.g., copyright or other restrictions)?
- What is the content of the data (e.g., definition of classes or categories used in the data)?
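The questions above can be treated as a checklist for a metadata record. The following is a minimal sketch, not a standard: the field names (`lineage`, `purpose`, `extent`, and so on), the function `unanswered_questions`, and every value in `example_record` are hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical field names, one per question in the list above.
QUESTION_FIELDS: Dict[str, str] = {
    "lineage": "How were the data compiled and created?",
    "purpose": "For what purpose were the data created?",
    "date_created": "When were the data created?",
    "extent": "What is the spatial and/or temporal extent of the data?",
    "originator": "Who collected the data?",
    "accuracy": "How accurate are the data and what level of uncertainty is associated with the data?",
    "access_constraints": "Are there limitations on access to the data?",
    "content_definitions": "What is the content of the data?",
}


def unanswered_questions(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the questions for which the record has no non-empty value."""
    missing = []
    for field, question in QUESTION_FIELDS.items():
        value = record.get(field)
        if value is None or not value.strip():
            missing.append(question)
    return missing


# Placeholder record: every value here is invented for illustration.
example_record = {
    "lineage": "Digitized from paper field sheets (placeholder)",
    "purpose": "Illustration only (placeholder)",
    "date_created": "2001-01-01",
    "originator": "Example agency (placeholder)",
    "access_constraints": "",  # present but empty, so still unanswered
}

for question in unanswered_questions(example_record):
    print("Unanswered:", question)
```

A check like this only confirms that a field has been filled in; it says nothing about whether the answer is accurate or detailed enough for a given use.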
One challenge that metadata creators often face is that different implementations of metadata may work more or less effectively for each of the purposes listed above. This has led to the development of a number of different metadata standards (i.e., descriptions of what the metadata for a particular dataset should contain). For example, in the United States, the Federal Geographic Data Committee (FGDC) has created a metadata standard called CSDGM (Content Standard for Digital Geospatial Metadata). This standard is quite comprehensive, and tries to cover all of the properties of geospatial datasets that may potentially be important. As a result, metadata that conform to this standard are quite time-consuming and expensive to prepare. Other organizations, such as the Dublin Core Metadata Initiative, have created standards that are based on the minimum amount of information that is necessary for describing a dataset. This type of metadata standard is sometimes also called light metadata. While a parsimonious standard such as the Dublin Core standard might be sufficient for searching for data in an archive, it probably does not include enough detail to allow users to determine whether a dataset's characteristics are appropriate for their intended use once a potentially relevant dataset is identified. You can see the difference in the level of detail included in each of these standards by comparing metadata developed for one particular dataset with each standard.
FGDC standard metadata document. Pennsylvania groundwater monitoring points.
Dublin Core standard metadata document. Pennsylvania groundwater monitoring points.
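To give a sense of how small a light metadata record can be, the sketch below builds a Dublin Core style record as XML using Python's standard library. The fifteen element names and the namespace URI are those of the Dublin Core Metadata Element Set; the wrapper element `metadata`, the function `build_dc_record`, and all of the values are assumptions made for illustration and are not taken from the Pennsylvania groundwater dataset.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_dc_record(values):
    """Build an XML element holding one dc:* child per supplied value."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, text in values.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"{name!r} is not a Dublin Core element")
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = text
    return root


# Placeholder values, invented for illustration only.
light_record = build_dc_record({
    "title": "Example groundwater monitoring points",
    "creator": "Example agency",
    "date": "2001-01-01",
    "coverage": "Example region",
    "rights": "Example access statement",
})

dc_xml = ET.tostring(light_record, encoding="unicode")
print(dc_xml)
```

A record like this is enough to support a search by title, creator, date, or coverage, but it has no dedicated place for positional accuracy or class definitions, which is the kind of detail a comprehensive standard such as CSDGM is designed to capture.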
Although we do not discuss the content of particular standards in detail here, Kim (1999) compares some of the major standards in use worldwide and is a good starting point if you want to learn more about them.
If you are interested in investigating this subject further, I recommend the following:
- Kim, Tschangho John. 1999. "Metadata for geo-spatial data sharing: A comparative analysis." The Annals of Regional Science. 33: 171-81.
- Best Practices in Creating Metadata. The Inter-university Consortium for Political and Social Research. http://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/chapter3docs.html
- Guide to Writing Readme Style Metadata. http://data.research.cornell.edu/content/readme
- Spatial Metadata Samples. National Spatial Data Infrastructure. http://homepages.together.net/~bspatial/duck/samples.htm