GEOG 488
Acquiring and Integrating Geospatial Data

Creating Metadata


Module 2: State College Borough Water Authority (cont.)

Part I: Creating Metadata

A. Download the Lesson Data

My story:
If you're like me and you hear the word metadata, you say, "I know, I know. They're daaaaata about daaaaata." I was at the ESRI User conference this year and went to a session about metadata. The presenter said, "For those of you who are properly documenting your data - both of you..." He was speaking to a decent sized crowd and got a good laugh. It does seem to be a running joke that everyone pays a lot of lip service to metadata but then doesn't take the time to create, complete, or update them. In this lesson, we'll briefly go over the importance of metadata (as if you haven't heard it enough). You will soon learn to appreciate good metadata when you start to combine data from other people, or colleagues that have left.

We will use one metadata editor to fill in a few fields about one of the Centre County layers provided in Lesson 1 and another editor to complete the metadata for the shapefile you created in Lesson 1 from the CAD data. I had a conversation with the engineer for the water authority and explained that in order for the data he shared to be really worthwhile, I needed to be able to provide as much documentation about them as possible. I have to admit, it is a very hard task to track down metadata. The engineer for the water authority didn't have any kind of digital metadata, but was able to answer most of the questions in order for us to document the data. Other datasets I've acquired have been quite a different story. I won't name names, but one person I talked to regarding a dataset wrote, "You really think people produce metadata? That costs money and then I would be over budget and then I would need to bill to non-billable time and then I don't even want to know what would happen." This was said in a joking manner, but it is a real issue. What most people do not seem to understand is that over the life of a GIS the cost of metadata is very small compared to the costs of misused data. Once the GIS cycle moves beyond the first adopters and the corporate memory is lost, undocumented data become of dubious quality and open the corporation to possible liabilities.

Registered students: download the Lesson 2 data (lesson2files.zip) from ANGEL to a new folder (e.g., C:\MGIS\GEOG488\Lesson2).

B. Read an Overview about Metadata in ArcGIS

Metadata is the official language for the geospatial community. Unfortunately, it is common to acquire data that aren't documented properly, which therefore makes it hard to communicate with that data provider. Another common occurrence is to create your own data and scrimp on documentation because you know when the data were created, who created them, etc. and you don't want to take the time to document that information. This makes it very hard to use or share these data. Often, the key components necessary for documentation can be tracked down, but that is often as time consuming as the acquisition. I hope that this lesson will show that the initial investment in tracking down or creating metadata is worth the time it takes to be a good data steward; one who can communicate well with others in the community.

Source: ArcGIS Desktop Help

Good documentation protects your investment in the resources you have created or purchased. Many details about your data such as source, publication date, quality, and spatial reference are important if you plan to make decisions based on them. When creating your own data, documenting them enables you to share with others or contribute to a portal, such as geodata.gov. Whether internal to your organization or available on the Internet, portals let others search, find, and access the GIS resources they need.

Standards:
Following a well-known metadata standard is a good idea because tools already exist with which you can create metadata. If you plan to publish your metadata to a large audience, following a standard will also make it easier for people from different communities, industries, and countries to understand the documentation because the standard acts as a dictionary, defining terminology and the expected values. The many metadata content standards are now converging on ISO 19139, which was adopted in 2007. This move should help eliminate some of the barriers to sharing data worldwide. There is now a way to crosswalk the FGDC's Content Standard for Digital Geospatial Metadata (CSDGM) to ISO 19139. This standard aims to provide a complete description of a data source, and to do so in XML (Extensible Markup Language) so that it is machine readable.
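To make "machine readable" concrete, here is a minimal sketch of a program reading a CSDGM-style XML record. The record is made up for illustration (the organization name, date, and keywords are not from the lesson data), and only the short element names (idinfo, citeinfo, themekey, and so on) come from the CSDGM itself.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up CSDGM-style record; all values are illustrative only.
SAMPLE_FGDC = """<metadata>
  <idinfo>
    <citation><citeinfo>
      <origin>Example County GIS Office</origin>
      <pubdate>20100115</pubdate>
      <title>Example parcels layer</title>
    </citeinfo></citation>
    <descript><abstract>Parcel boundaries (sample text).</abstract></descript>
    <keywords><theme>
      <themekt>None</themekt>
      <themekey>parcels</themekey>
      <themekey>cadastral</themekey>
    </theme></keywords>
  </idinfo>
</metadata>"""


def summarize(xml_text):
    """Pull a few identification fields out of a CSDGM-style record."""
    root = ET.fromstring(xml_text)  # root is the <metadata> element
    return {
        "title": root.findtext("idinfo/citation/citeinfo/title"),
        "originator": root.findtext("idinfo/citation/citeinfo/origin"),
        "published": root.findtext("idinfo/citation/citeinfo/pubdate"),
        "keywords": [k.text for k in root.iter("themekey")],
    }


summary = summarize(SAMPLE_FGDC)
print(summary)
```

Because the standard fixes the element names, the same few lines would work on any record that follows it, which is the point of agreeing on a standard in the first place.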

Because so many standards exist in legacy data, metadata in ArcCatalog isn't required to meet any specific standard. However, your organization may be required to follow a particular standard. For example, U.S. Federal Government funded projects are mandated to provide metadata following the FGDC standard, which is now adopting the ISO metadata format.

Documentation:
As with any project, you should have a plan outlining what content your metadata should include. Look over the metadata standard to get an idea of the content it suggests. Then, decide what pieces of documentation are most important to your organization based on how you intend to use it. If you plan to publish your metadata so that people can find your resources by doing a search, you need to consider what information needs to be present to do different types of searches. Typically, people will search by keywords describing the theme or subject of the resource, by the type of resource, by how current it is, and depending on the portal, by the publisher of the resource. Keywords are more useful if they are derived from a thesaurus; otherwise, you might be searching for "roads" while the data you're looking for is described as "streets". If the spatial extent of the resource is present, then people can find it if they include an area of interest in their search; for most GIS portals, the extent must be provided in decimal degrees. If a data source's coordinate system has been defined, ArcCatalog will automatically record its extent in decimal degrees. If an item, such as a geoprocessing tool, isn't specific to a place, defining its extent as covering the entire globe will ensure that it will be found by any spatial search.
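The extent requirement can be sketched in a few lines. This is only an illustration of the idea, not how any particular portal works: the function names and the sample coordinates are assumptions, and the overlap test ignores boxes that cross the antimeridian.

```python
# An extent covering the whole globe, for items that are not tied to a place.
GLOBAL_EXTENT = {"west": -180.0, "east": 180.0, "south": -90.0, "north": 90.0}


def is_valid_dd_extent(extent):
    """Return True if the extent looks like a decimal-degree bounding box."""
    return (
        -180.0 <= extent["west"] <= 180.0
        and -180.0 <= extent["east"] <= 180.0
        and -90.0 <= extent["south"] <= extent["north"] <= 90.0
    )


def intersects(a, b):
    """Return True if two bounding boxes overlap (no antimeridian handling)."""
    return (
        a["west"] <= b["east"]
        and b["west"] <= a["east"]
        and a["south"] <= b["north"]
        and b["south"] <= a["north"]
    )


# Hypothetical area of interest, roughly central Pennsylvania.
search_area = {"west": -78.4, "east": -77.1, "south": 40.6, "north": 41.3}

# Hypothetical extent left in projected units (feet or meters); it fails the check.
projected = {"west": 1890000.0, "east": 2010000.0,
             "south": 190000.0, "north": 310000.0}

print(is_valid_dd_extent(search_area))
print(is_valid_dd_extent(projected))
print(intersects(GLOBAL_EXTENT, search_area))
```

The last line shows why a global extent works for something like a geoprocessing tool: it overlaps every possible area of interest.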

Once someone has found information about a resource, it needs to be determined if this information will work for his or her application. Several pieces of information may be used to determine if the resource is appropriate. Include a description, its age, its cost, and any legal information specifying how the resource may be used. For data sources, you should also include its accuracy, its scale or resolution, and descriptions of its attributes. Information must also be provided about how to get the resource if someone wants to use it. Some pieces of information may also be required by the GIS catalog portal to which you will be publishing your metadata. These requirements are unrelated to whatever standards the portal supports. For example, if the portal lets people search using some predefined queries, you may be required to provide information in your metadata for those queries to work. The metadata librarian who runs the portal may give you a list of keywords and a thesaurus, then require at least one of the keywords in your metadata to be derived from that list.

Templates:
There are ways to reduce the documentation effort without sacrificing its quality. An ideal solution is to create a metadata template containing documentation that is the same for a group of resources. A template can be a standalone XML document that you import before adding more documentation; this effectively lets you copy information from one document to another. A template could include contact information, the publication date, and legal restrictions, but it should not include properties that will be added and maintained automatically by ArcCatalog. With ArcCatalog maintaining the item's properties and a template for adding repetitive documentation, the metadata author is left to focus on documentation that is specific to the individual resource, such as the quality of the data.

Creating a template from scratch can be difficult because it is an abstract process. An easier way to approach this is to document a resource as best you can, then use the tools in the Metadata section of the ArcToolbox Conversion Tools to export only the documentation. Then, print out the documentation. Cross out any information that is specific to the resource, and think about how you can modify the rest so that it will apply to other resources as well. Then, make your edits to the template XML document; its information should be the same for all the resources that you have to document. You may need several templates to use with resources created by different departments or for different projects.

Similarly, you could create one document representing a series of resources. Suppose the data for a region is broken down into several tiles. You could create metadata describing one of those tiles, then use the Export Metadata Multiple exporter to copy its documentation to a standalone XML document. There are shortcuts to these tools on a toolbar in ArcCatalog unless the interface has been customized. Select the XML document in the Catalog tree, and use a metadata editor to modify its content so the properties and documentation reflect the series of tiles rather than a specific tile. When you're done, you can publish the standalone XML document to the appropriate GIS catalog portal.
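The template idea can be sketched outside ArcCatalog with plain XML handling. This is not how the ArcCatalog importer works; it is a simplified illustration, and the function name apply_template and the sample values are assumptions. It copies template elements into a record only where the record has nothing, so documentation already written for the individual resource is never overwritten. It matches elements by tag name only, so repeated elements (several keywords, for example) would need more care.

```python
import copy
import xml.etree.ElementTree as ET


def apply_template(target, template):
    """Copy elements from template into target where target has none."""
    for t_child in template:
        existing = target.find(t_child.tag)
        if existing is None:
            # The record lacks this element entirely: take the template's copy.
            target.append(copy.deepcopy(t_child))
        elif len(t_child) > 0:
            # Both have it and it has sub-elements: merge one level down.
            apply_template(existing, t_child)


# Hypothetical template: information shared by every resource in a group.
TEMPLATE = ET.fromstring("""<metadata>
  <idinfo>
    <citation><citeinfo><origin>Example Water Authority</origin></citeinfo></citation>
    <useconst>Sample text: not for survey use.</useconst>
  </idinfo>
</metadata>""")

# Hypothetical record for one tile, documented only with its own title.
tile = ET.fromstring("""<metadata>
  <idinfo>
    <citation><citeinfo><title>Tile 12</title></citeinfo></citation>
  </idinfo>
</metadata>""")

apply_template(tile, TEMPLATE)
print(ET.tostring(tile, encoding="unicode"))
```

After the merge the tile keeps its own title and gains the originator and use constraints from the template.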

C. Use the Metadata Editor

Before you start writing documentation, you need to decide which metadata standard you're going to follow. If you don't have any metadata yet and don't need to create metadata according to a specific standard, the Esri format might suffice; otherwise, the ISO format might be right for you. If you have a requirement to create FGDC metadata, if you already have FGDC metadata, or if you want to create detailed metadata, the FGDC format would be a good choice. Once you've decided which metadata standard you're going to follow, set the editor that corresponds to that standard as the default editor. This is set in ArcCatalog Options / Metadata Tab / Metadata Style. See Figure 2.1, below.

 

A metadata editor is provided with ArcCatalog. It lets you create complete documentation in Esri XML format, which is machine readable. That XML is then used to produce metadata in the FGDC's Content Standard for Digital Geospatial Metadata or in ISO standard 19115, Geographic Information - Metadata. Other formats might be available in specific localized versions of the XML converter. To get human-readable metadata in one of these standards, use the Export Metadata tool, or view it on the Description tab in ArcCatalog.

Use the Standard Editor in FGDC Mode

The metadata editor lets you create a complete metadata document for the selected item in the Catalog tree that is compliant with the FGDC's Content Standard for Digital Geospatial Metadata (CSDGM). This is the default metadata editor as set in the ArcCatalog options above. This editor also lets you enter values for some Esri-defined elements, which are specified by the Esri Profile of the Content Standard for Digital Geospatial Metadata; this document is available as a white paper from Esri's Online Support Center.

Your organization may be required to create FGDC metadata; for example, U.S. federal government departments and state and local agencies that receive federal funds to create any data must also have FGDC metadata created with it. You may also need to create FGDC metadata to publish information about your data to the GeoData.gov portal or the NSDI Geospatial Data Clearinghouse. If you already have FGDC metadata, continue to maintain your metadata in this format.

In lesson 1 you viewed some Centre County data. All four Centre County layers have spatial metadata, but only three of the four layers have other metadata fields. The following information is available for the parcels_cl layer.

  1. Unzip the files for lesson 2.
  2. Browse the metadata for the streets_cl, building_cl, bldgpts_cl, and parcels_cl shapefiles using the Description tab in ArcCatalog.
  3. Open the metadata_parcels_cl.txt file. It will open in a text editor and you can see the extra information.
  4. Make sure that the ArcCatalog option for metadata is set to FGDC CSDGM Metadata in the Customize drop-down menu.
  5. The FGDC Metadata editor is now the default. Scroll down and you can see the metadata under the description.
  6. Click the Edit Metadata button on the Metadata toolbar.
  7. Populate the few fields in the FGDC Editor with the values from the txt file.
  8. These are very incomplete metadata. Make a note of which empty fields would be important to have. This will help you compile a list of questions for anyone who provides you with data.
  9. That is it; you are done with the FGDC metadata.
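If you end up doing step 8 for many datasets, the note-taking can be partly automated. The sketch below is one possible approach, not part of the exercise: the list of fields is my own hypothetical shortlist (it is not the full set the CSDGM requires), and the sample record is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical shortlist of fields worth asking a data provider about.
# Keys are labels for the report; values are CSDGM element paths.
FIELDS_TO_CHECK = {
    "Title": "idinfo/citation/citeinfo/title",
    "Originator": "idinfo/citation/citeinfo/origin",
    "Abstract": "idinfo/descript/abstract",
    "Purpose": "idinfo/descript/purpose",
    "Time period of content": "idinfo/timeperd/timeinfo/sngdate/caldate",
    "Positional accuracy": "dataqual/posacc/horizpa/horizpar",
    "Use constraints": "idinfo/useconst",
}


def missing_fields(root):
    """Return the labels of fields that are absent or blank in the record."""
    missing = []
    for label, path in FIELDS_TO_CHECK.items():
        text = root.findtext(path)
        if text is None or not text.strip():
            missing.append(label)
    return missing


# Made-up, deliberately incomplete record: a title and an empty abstract.
record = ET.fromstring("""<metadata>
  <idinfo>
    <citation><citeinfo><title>Example parcels layer</title></citeinfo></citation>
    <descript><abstract>   </abstract></descript>
  </idinfo>
</metadata>""")

gaps = missing_fields(record)
for label in gaps:
    print("Ask the provider about:", label)
```

The output is effectively the list of questions to send to whoever gave you the data.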

Use the Standard Editor in ISO Mode

The ISO 19115 metadata standard defines a core set of elements that includes a few mandatory elements and an additional set of highly recommended ones. Beyond the core set of elements, the ISO standard defines a large number of elements that can be used to thoroughly document GIS resources. This editor supports the core 19115 metadata elements that let you add documentation. Some of the core elements are supported by ArcCatalog but don't appear in the editor. When metadata is automatically created, the ISO synchronizer adds the item's properties to the appropriate ISO metadata elements. This editor generally doesn't let you edit properties of an item, only its documentation. For example, if ArcCatalog can automatically calculate the item's extent in decimal degrees, the extent page won't be editable. For more information about support for ISO metadata, see the 'Metadata standards' section of Writing documentation.

You can use different editors, one at a time, to document your data. The editor selected in the Options dialog box will appear when you click the Edit Metadata button on the Metadata toolbar. A metadata document in ArcCatalog can contain both FGDC and ISO content. These two standards can exist in parallel in the same metadata document because they each use a completely different set of XML tags to store their information. Therefore, if you provide a title using the FGDC editor and you later switch to the ISO editor, the information you previously added won't appear. Because metadata for coverages, shapefiles, and other file-based data sources is stored as XML files on disk, you can also use XML editors or other applications to edit its contents outside ArcCatalog (but it is easy to destroy it this way, too). Similarly, you can also use ArcCatalog metadata editors to edit standalone XML documents. For example, you might do this to create a metadata template; it might include standard information, such as how to purchase the data or whom to contact for more information. When editing metadata in a geodatabase, the original record in the GDB_UserMetadata table is deleted, and a new record is added with the updated metadata. To purge the deleted records from the personal geodatabase and decrease its size, right-click the database in ArcCatalog and click Compact Database.
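The point about parallel tag sets is easier to see in a small example. In the sketch below the FGDC path uses CSDGM short names; the ISO-side path (dataIdInfo/idCitation/resTitle) is my assumption about how Esri's ISO content is laid out, so check it against a real document before relying on it. The helper name ensure_path is hypothetical.

```python
import xml.etree.ElementTree as ET

FGDC_TITLE = "idinfo/citation/citeinfo/title"
# Assumed location of the title in Esri's ISO content; verify before use.
ISO_TITLE = "dataIdInfo/idCitation/resTitle"


def ensure_path(root, path):
    """Return the element at path, creating any missing elements on the way."""
    node = root
    for tag in path.split("/"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    return node


# A record whose title was entered with the FGDC editor only.
doc = ET.fromstring(
    "<metadata><idinfo><citation><citeinfo>"
    "<title>Water mains</title>"
    "</citeinfo></citation></idinfo></metadata>"
)

fgdc_title = doc.findtext(FGDC_TITLE)
iso_before = doc.findtext(ISO_TITLE)  # nothing there: the ISO tags are separate
print("FGDC title:", fgdc_title)
print("ISO title before copy:", iso_before)

# Copy the title across so both sets of tags carry it.
ensure_path(doc, ISO_TITLE).text = fgdc_title
print("ISO title after copy:", doc.findtext(ISO_TITLE))
```

Editing the XML directly like this is exactly the kind of thing that can damage a metadata document, so work on a copy.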

  1. To see the ISO Metadata Wizard, change the default metadata editor as above and choose the ISO Metadata in ArcCatalog Options.
  2. Browse to your roads layer and click the Description tab.
  3. Scroll down and you will see that there is the ISO Metadata now visible.
  4. Open the metadata_roads.txt file as before.
  5. Click the Edit Metadata button on the Metadata toolbar.
  6. This enables you to document your data to meet the ISO Metadata standard.
  7. Remember, the metadata you have refer to the CAD data from Gwin, Dobson & Foreman. Figure out how to change the metadata to reflect the fact that you created a shapefile from the data.

D. Metadata Utilities

In ArcGIS 10.x there are a number of very useful metadata utilities. Looking at these is optional.

The first is a validation tool. It checks that each heading contains the information it should, so you can tell whether the metadata are complete and accurate. If you pass the validation you can then export the metadata. The next useful tool is the import tool, which enables you to replicate metadata between similar datasets (do not forget to update the metadata after import). These tools are available on the toolbar under the Description tab. In ArcToolbox itself there are many other tools that use scripts to synchronize metadata, import it, export it, publish it, and translate it from one form to another. You might validate, export, and publish metadata in different formats (FGDC and ISO) for use by different communities or user bases. Finally, you might want to create metadata for data as they are manipulated so you have a record of the changes, manipulations, or modifications the data went through; for example, suppose you generalized a line set to display better at a smaller scale.
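Recording a manipulation such as the generalization example can be sketched as adding a process step to the lineage section of a CSDGM-style record. This is a plain-XML illustration rather than one of the ArcToolbox tools; the function names, the description text, and the date are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date


def _ensure_path(root, path):
    """Return the element at path, creating any missing elements on the way."""
    node = root
    for tag in path.split("/"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    return node


def add_process_step(root, description, when):
    """Append a lineage process step (description plus date) to the record."""
    lineage = _ensure_path(root, "dataqual/lineage")
    step = ET.SubElement(lineage, "procstep")
    ET.SubElement(step, "procdesc").text = description
    # CSDGM calendar dates are written as YYYYMMDD.
    ET.SubElement(step, "procdate").text = when.strftime("%Y%m%d")
    return step


record = ET.fromstring("<metadata/>")
add_process_step(
    record,
    "Generalized the line set for display at a smaller scale (sample text).",
    date(2012, 3, 5),  # hypothetical processing date
)
print(ET.tostring(record, encoding="unicode"))
```

Each later manipulation adds another process step, so the record accumulates a history of what was done to the data and when.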

Optional Exercise

There is a very useful standalone editor from the EPA, the EPA Metadata Tool, that is fully compatible with ArcGIS 10.x; it is worth taking a look.

E. Metadata and Timeliness

Look at the date fields in the metadata. Here you will find how new the records are, when they were last updated, and the interval between revisions. Some data are always current or nearly so, e.g., real-time traffic flow meters. Other data may be very old and out of date. This has enormous potential to spoil GIS analysis and, at worst, to create legal liabilities. If you are looking at a phenomenon that does not change, like mountains, that is one thing, but the water board had better know where houses are and when they were built. Should the water board be the party responsible for updating building plot lineage, or should they get this from the county? How often should they get the data? What data were available at the time of a decision? Were they the most up to date? If not, why not? The answers to these questions could turn an error into an act of negligence, e.g., an error in the answer to Penn One Call. If this resulted in the death of a worker for a contractor, it could have dire legal implications.

Metadata, and current metadata in particular, can be the source of the information that protects the GIS person from such liability.
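A check on the date fields can be as simple as the sketch below. It assumes a CSDGM-style record with a single calendar date for the time period of content, written as a full YYYYMMDD value (the standard also allows year-only and year-month dates, which this does not handle). The function names, the sample dates, and the one-year threshold are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

CONTENT_DATE = "idinfo/timeperd/timeinfo/sngdate/caldate"


def content_age_days(root, today):
    """Return the age of the content in days, or None if no date is recorded."""
    text = root.findtext(CONTENT_DATE)
    if text is None or not text.strip():
        return None
    content_date = datetime.strptime(text.strip(), "%Y%m%d").date()
    return (today - content_date).days


def is_stale(root, today, max_age_days):
    """Treat undated content as stale; otherwise compare age to the limit."""
    age = content_age_days(root, today)
    return age is None or age > max_age_days


# Made-up record: content dated 1 June 2008, checked on 1 June 2010.
record = ET.fromstring(
    "<metadata><idinfo><timeperd><timeinfo><sngdate>"
    "<caldate>20080601</caldate>"
    "</sngdate></timeinfo></timeperd></idinfo></metadata>"
)
check_day = date(2010, 6, 1)

age = content_age_days(record, check_day)
print("Age in days:", age)
print("Stale at a one-year limit:", is_stale(record, check_day, 365))
```

What counts as too old depends on the phenomenon: a limit that is sensible for building footprints would be far too strict for mountains.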

That's it for Part I!

You have just completed Part I of this module, which involved editing some metadata for some data acquired from my local area. In Part II, you will begin to assess the metadata situation for your final project area. You will also look on the web for metadata in clearinghouses.