GEOG 583
Geospatial System Analysis and Design

Open Data Standards

Why Choose Open Data Standards?

Most open source geospatial projects are heavily invested in implementing open standards. In general, this means Open Geospatial Consortium (OGC) standards. The OGC was founded in 1994 in response to pent-up demand in government and industry for a solution to the problem of spatial data sharing and interoperability. In the 1980s and early 1990s, spatial data tended to be stored in proprietary formats, which often gave specific GIS vendors a competitive advantage. Reformatting or translating that data required time-consuming, expensive custom add-ons, typically from the original vendor of the system.

Some of the key OGC standards are briefly outlined here:

  • Simple Features - this is one of the OGC's earliest standards. It defines what a geographic feature is (at a minimum a point, line, or polygon) and then sets out common text and binary representations for geographic features. The "simple" refers to the lack of topology in the data structure, often called spaghetti data. This standard promotes interoperability: if one program exports its data as either Well-Known Text (WKT) or Well-Known Binary (WKB), another program can easily read the same data and know what it means.
  • Geography Markup Language (GML) - an XML grammar for expressing geographic features. It is used as an interoperability format for features that are too complex to express using the Simple Features standard, and it is used particularly by the Web Feature Service (WFS).
  • KML - this was developed as a competitor to the OGC's GML, but it is now one of the better-known OGC standards. It was originally developed by Keyhole (that's the K) and then popularized by Google's Google Earth application. It was donated to the OGC in 2007 to be developed as an open standard for 2D and 3D map annotation.
  • UML (Unified Modeling Language) - this is an open and standardized way of representing programming and modeling entities, their properties, and their relationships, and of formulating their parameters and actions. Strictly speaking, it is maintained by the Object Management Group (OMG) rather than the OGC, but OGC and ISO geospatial specifications use it to describe their data models. It can be used to diagram a programming task, and some UML tools can generate part of the code. It is related to Esri ModelBuilder and to Esri's model diagrams, e.g. Esri Biodiversity Conservation.
  • Web Map Service (WMS) - WMS was one of the first OGC standards and set the basis for web mapping for many years. A WMS returns rendered map images for a requested area rather than the underlying data.
  • Web Feature Service (WFS) - this is the standard that a service has to conform to in order to serve geographic features (vector data) over the web.
  • Web Coverage Service (WCS) - the WCS standard defines an interface and operations to access geographic coverages (rasters) over the web.
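
To make the Simple Features idea concrete, here is a minimal sketch of how a single 2D point could be written as WKT and as WKB using only the Python standard library. The function names are hypothetical and chosen for illustration; the byte layout (a 1-byte byte-order flag, a 4-byte geometry type code, then two 8-byte doubles) follows the published WKB encoding for a point. Real projects would normally rely on an existing library rather than hand-rolled code like this.

```python
import struct

WKB_LITTLE_ENDIAN = 1  # byte-order flag: 1 = little endian, 0 = big endian
WKB_POINT = 1          # geometry type code for a 2D Point


def point_to_wkt(x: float, y: float) -> str:
    """Return the Well-Known Text form of a 2D point."""
    return f"POINT ({float(x)!r} {float(y)!r})"


def point_to_wkb(x: float, y: float) -> bytes:
    """Return the Well-Known Binary form of a 2D point (little endian)."""
    # Layout: 1-byte order flag, 4-byte geometry type, two 8-byte doubles.
    return struct.pack("<BIdd", WKB_LITTLE_ENDIAN, WKB_POINT, x, y)


def wkb_to_point(data: bytes) -> tuple[float, float]:
    """Decode a 2D point from WKB, honouring the byte-order flag."""
    prefix = "<" if data[0] == WKB_LITTLE_ENDIAN else ">"
    geom_type, x, y = struct.unpack(prefix + "Idd", data[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"expected a Point (type 1), got type {geom_type}")
    return x, y


if __name__ == "__main__":
    print(point_to_wkt(30, 10))
    print(point_to_wkb(30, 10).hex())
    print(wkb_to_point(point_to_wkb(30, 10)))
```

Because both encodings are fully specified, the program that decodes the bytes does not need to know anything about the program that produced them, which is the interoperability benefit described in the list above.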
[Figure: screenshot of the OGC's interactive overview, showing the types of Open Geospatial Consortium standards and their relationships to one another.]
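
The web service standards listed above are defined as sets of request parameters that any conforming server understands. As a rough sketch, a WMS 1.3.0 GetMap request could be assembled as follows. The endpoint `https://example.com/wms` and the layer name `roads` are placeholders invented for this example, and `build_getmap_url` is a hypothetical helper, not part of any library. The example uses `CRS:84` (longitude, latitude order) to sidestep the axis-order differences between coordinate reference systems.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height,
                     image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL for one layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in CRS:84.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value asks for the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    # Keep commas, colons, and slashes readable instead of percent-encoding.
    return f"{base_url}?{urlencode(params, safe=',:/')}"


if __name__ == "__main__":
    # Hypothetical endpoint and layer name, for illustration only.
    print(build_getmap_url("https://example.com/wms", "roads",
                           (-78.0, 40.5, -77.5, 41.0), 800, 600))
```

The same URL pattern should work against any server that implements the standard, which is why a client written for one WMS can usually talk to another with no changes beyond the base URL and layer name.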

Another very important open(-ish) format is the shapefile! One of the reasons for the wild popularity of shapefiles is that Esri released the specification as an open document, the ESRI Shapefile Technical Description. Esri places no restrictions on other organizations implementing shapefile readers or writers. It's likely that if they had made it a closed format, it would never have become the de facto standard for vector geospatial data.
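
Because the shapefile byte layout is published, anyone can write a reader from the documentation alone. The sketch below parses the 100-byte header of a `.shp` main file using only the Python standard library. The function name `parse_shp_header` is hypothetical; the offsets it uses (a big-endian file code of 9994 and file length in 16-bit words, followed by a little-endian version, shape type, and bounding box) are taken from the published specification. It covers only the header and a handful of shape types, so treat it as an illustration rather than a usable reader.

```python
import struct

# A subset of the shape type codes defined in the specification.
SHAPE_TYPES = {0: "Null Shape", 1: "Point", 3: "PolyLine",
               5: "Polygon", 8: "MultiPoint"}


def parse_shp_header(header: bytes) -> dict:
    """Decode the fixed 100-byte header of a .shp main file."""
    if len(header) < 100:
        raise ValueError("a .shp main file header is 100 bytes long")
    # Big-endian part: file code, 20 unused bytes, length in 16-bit words.
    file_code, length_words = struct.unpack(">i20xi", header[:28])
    if file_code != 9994:
        raise ValueError(f"not a shapefile: file code is {file_code}")
    # Little-endian part: version, shape type, then the XY bounding box.
    version, shape_type, xmin, ymin, xmax, ymax = struct.unpack(
        "<ii4d", header[28:68])
    return {
        "version": version,
        "file_length_bytes": length_words * 2,
        "shape_type": SHAPE_TYPES.get(shape_type, f"code {shape_type}"),
        "bbox": (xmin, ymin, xmax, ymax),
    }
```

To try it on a real file, pass in the first 100 bytes, for example `parse_shp_header(open(path, "rb").read(100))`, where `path` points at any `.shp` file you have available.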

Standards & Avoiding Vendor Lock-In

Vendor lock-in occurs when your data or workflows depend on a proprietary data structure, so that moving to another system becomes difficult or costly. The problem is most acute when that data structure is no longer supported. This can happen in both open source and licensed software. However, the use of open standards may help minimize this problem (if not eliminate it completely). Migrating data to different applications is usually easier when the data structures are open and fully understood. Translation tools can be built, or may already be available. Platforms like FME by Safe Software are designed explicitly to facilitate this sort of data migration and interoperability. However, when a data structure is not documented, data can end up locked in, and may not be retrievable without the considerable time and expense of custom programming a workaround. A best practice, then, is to design a GISystem to take advantage of open data standards from the start, and to know which translation tools exist before you commit to your system design assumptions. Open source can suffer from a type of lock-in, too, especially if the code isn't fully documented.
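
To illustrate how little work a translation tool can be when both formats are open, here is a minimal sketch that converts KML point placemarks into WKT using only the Python standard library. The sample document, the placemark names, and the function `kml_points_to_wkt` are all invented for this example. It handles only `Point` placemarks and ignores altitude, so it is far from a general converter.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Invented sample data; KML coordinates are "lon,lat[,alt]".
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Sample A</name>
      <Point><coordinates>-77.86,40.79,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Sample B</name>
      <Point><coordinates>12.5,41.9</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def kml_points_to_wkt(kml_text: str) -> list[tuple[str, str]]:
    """Return (name, WKT) pairs for every Point placemark in a KML string."""
    root = ET.fromstring(kml_text)
    results = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates",
                                    namespaces=KML_NS)
        if coords is None:
            continue  # not a Point placemark; skipped in this sketch
        lon, lat = coords.strip().split(",")[:2]  # drop altitude if present
        results.append((name, f"POINT ({float(lon)!r} {float(lat)!r})"))
    return results


if __name__ == "__main__":
    for name, wkt in kml_points_to_wkt(SAMPLE_KML):
        print(name, wkt)
```

Nothing here required vendor cooperation: the element names and coordinate order come straight from the published KML specification. With an undocumented format, the equivalent tool would have to be reverse-engineered.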