Esri, a leading provider of geographic information systems (GIS) software and services, developed the Shapefile format in the 1990s as a native digital vector format for spatial data in the then-popular ArcView software (ArcInfo and ArcGIS are Esri’s present-day counterparts, which also support Shapefiles). Unlike many other common file formats, such as .jpeg, .png, .mp3 or .html, a Shapefile is not a single file. It is a collection of several files that share a base name and are referred to together by the name of the main .shp file. Every valid Shapefile must contain at least three files: the main .shp file containing the coordinate data, a .shx file containing index information, and a .dbf file, a dBASE database table that stores the attribute data for each shape.
For example, if we were interested in mapping county-level statistics for the entire US, we might download a Shapefile called Counties.shp. Within that Shapefile we would expect to find:
- Counties.shp – this file contains coordinate information for drawing the shapes of the counties.
- Counties.shx – this file contains a positional index of the records in Counties.shp, which speeds up many operations on the file.
- Counties.dbf – the database of all the counties, including their names, unique IDs, and other statistical information of interest. DBF files are also a popular format for database software that does not include mapping capabilities. They can be read and saved by many spreadsheet applications, such as the open source LibreOffice and OpenOffice suites, and by older versions of Microsoft Excel.
Many Shapefiles also include an optional .prj file that describes the coordinate system and projection of the data, so that mapping software can draw the Shapefile correctly.
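The file set described above can be checked with a few lines of Python. This is only a minimal sketch: the function name `check_shapefile` is hypothetical, and it assumes the component files sit in the same directory as the .shp file and use lowercase extensions.

```python
from pathlib import Path

# Component files every valid Shapefile needs, and the optional one discussed here.
REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".prj",)


def check_shapefile(shp_path):
    """Return (missing required extensions, optional extensions that are present).

    Assumption: all components share the base name of shp_path and live in
    the same directory, with lowercase extensions.
    """
    base = Path(shp_path)
    missing = [ext for ext in REQUIRED if not base.with_suffix(ext).exists()]
    present_optional = [ext for ext in OPTIONAL if base.with_suffix(ext).exists()]
    return missing, present_optional


if __name__ == "__main__":
    missing, optional = check_shapefile("Counties.shp")
    if missing:
        print("Incomplete Shapefile, missing:", ", ".join(missing))
    else:
        print("All required files found; optional files present:", optional)
```

On a case-sensitive file system, a file named Counties.SHX would be reported as missing by this sketch, so a more careful check would compare extensions case-insensitively.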
Although the Shapefile format was developed by a corporate entity and remains under Esri's control, its specification is openly published, and the format is supported by many open source GIS tools and mapping software. For this reason, the Shapefile format is now considered a de facto standard for spatial data. MAF/TIGER data has been distributed in the Shapefile format since 2007.
A single Shapefile data set can contain only one of three types of spatial data primitives, or features: points, lines, or polygons (areas). These features and their counterparts in the MAF/TIGER database will be covered in the next section.
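Because each Shapefile holds a single feature type, that type is recorded once in the header of the .shp file. The sketch below reads it using only the Python standard library. It relies on the header layout in Esri's published Shapefile Technical Description (a 100-byte header, a big-endian file code of 9994 at byte 0, and a little-endian shape type at byte 32). The function name `read_shape_type` is hypothetical. The specification also defines MultiPoint, Z, M, and MultiPatch variants in addition to the three basic types, so those are included in the lookup table.

```python
import struct

# Shape type codes from the Esri Shapefile Technical Description.
SHAPE_TYPES = {
    0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint",
    11: "PointZ", 13: "PolyLineZ", 15: "PolygonZ", 18: "MultiPointZ",
    21: "PointM", 23: "PolyLineM", 25: "PolygonM", 28: "MultiPointM",
    31: "MultiPatch",
}


def read_shape_type(shp_path):
    """Return the name of the feature type stored in a .shp main file."""
    with open(shp_path, "rb") as f:
        header = f.read(100)  # the main file header is always 100 bytes
    if len(header) < 100:
        raise ValueError("file is too short to be a .shp main file")

    # Bytes 0-3: file code, stored big-endian; always 9994.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("unexpected file code; not a .shp main file")

    # Bytes 32-35: shape type, stored little-endian.
    (shape_type,) = struct.unpack("<i", header[32:36])
    return SHAPE_TYPES.get(shape_type, f"Unknown ({shape_type})")


if __name__ == "__main__":
    print(read_shape_type("Counties.shp"))
```

For day-to-day work a GIS library would normally handle this, but reading the header directly shows that the feature type is a property of the whole file, not of individual records.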