GEOG 863
GIS Mashups for Geospatial Professionals

Reading Shapefiles Using JavaScript Libraries

Way back in 1998, Esri published a technical description of their shapefile data format. That description makes it possible for third parties to develop shapefile readers and conversion algorithms. One such reader that I recently came across was developed by Mano Marks, a member of the Google Geo development team. Mano's JavaScript-based shapefileloader project relies heavily on two object classes, SHPParser (which parses the geometry information) and DBFParser (which parses the tabular information).

You could probably dissect Mano's shp_load.html sample and figure out how to use his Parser classes to build your own maps. However, I thought I would try to make that job a bit easier with a couple of my own examples (adapted from his) along with explanations of how they work.

Loading a single point shapefile

The first shapefile loader example reads data from the all-too-familiar Jen and Barry's candidate_cities shapefile.  Let's have a look at its source code to see how it works.

One of the first things you should note in the source code is the referencing of two JavaScript libraries -- shp.js and dbf.js. These libraries house the SHPParser and DBFParser classes mentioned above.

Important

You don't necessarily have to create your own shapefile-based maps since this lesson's assignment allows you to choose a different data format. However, if you plan on using the Parser examples discussed here, I strongly recommend you download my copies of the libraries here. (I found a bug in the copies hosted on the Google site and also added some code that is needed for the second example below to work, which is why I'm suggesting you use mine.)

The createMarker() function is basically the same as the one we saw in the previous lesson, which brings us to the initMap() function. After the Map object is created, the shp file and dbf file are opened by their respective parsers using the load() method. Each parser's load() method accepts the name of a file, a callback function that will process the data after it's been read in, and a function that will be executed if an error is encountered. The data processing callback for the shp file is shpLoad(); for the dbf file it is dbfLoad().
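As a rough sketch of the loading step just described, the two load() calls might look something like the following. The argument order (file name, data callback, error callback) comes from the description above. The startLoading() wrapper, its baseName parameter and the handlers object are hypothetical names of my own, introduced so the snippet can be tried out with stand-in parsers, and it assumes load() is called directly on each parser class.

```javascript
// Sketch only: startLoading() and its parameters are illustrative names,
// not part of shp.js or dbf.js. The parsers are passed in as arguments so
// the function can be exercised without the real libraries.
function startLoading(shpParser, dbfParser, baseName, handlers) {
  // Each load() call takes a file name, a callback that receives the parsed
  // data, and a callback that runs if the file cannot be read.
  shpParser.load(baseName + '.shp', handlers.shpLoad, handlers.shpLoadError);
  dbfParser.load(baseName + '.dbf', handlers.dbfLoad, handlers.dbfLoadError);
}
```

In a page that has included shp.js and dbf.js, the call would be along the lines of startLoading(SHPParser, DBFParser, 'candidate_cities', handlers), where handlers holds the four callback functions.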

These two Load() functions are basically mirror images of one another. In shpLoad(), the geometry data read in from the shp file can be accessed through the sh variable. In dbfLoad(), the tabular data read in from the dbf file can be accessed through the db variable. Those two pieces of data are assigned to the variables shp and dbf, which were declared with global scope near the top of the document so that they can be accessed easily by multiple functions. After that assignment, each function performs an identical check to see whether both files have been parsed. Once they have, it is time to execute the render() function.
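Here is a minimal sketch of that "wait for both files" pattern. The names shp, dbf, shpLoad(), dbfLoad() and render() come from the description above; the renderCalls counter and the empty-bodied render() are stand-ins of my own so that the snippet can run by itself.

```javascript
// Globals shared by the two callbacks.
var shp = null;
var dbf = null;
var renderCalls = 0; // illustration only: counts how often render() runs

// Stand-in for the page's render() function.
function render() {
  renderCalls += 1;
}

// Called by the shp parser with the parsed geometry data.
function shpLoad(sh) {
  shp = sh;
  if (shp && dbf) {
    render();
  }
}

// Called by the dbf parser with the parsed tabular data.
function dbfLoad(db) {
  dbf = db;
  if (shp && dbf) {
    render();
  }
}
```

Because the two files are read independently, either callback may run first. Putting the same check in both means render() runs only after the second of the two has finished.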

The first four lines of the render() function are concerned with zooming in on the area covered by the shapefile. This is done by constructing a LatLngBounds object using the maxX, maxY, minX and minY properties of the object returned by the SHPParser. That LatLngBounds object is then passed to the Map's fitBounds() method.
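The page itself builds a LatLngBounds object, but the same idea can be sketched with a plain object. The snippet below assumes only that the parsed shapefile has the minX, minY, maxX and maxY properties mentioned above; shapefileBounds() is a hypothetical helper name, and the south/west/north/east form it returns is the bounds literal that fitBounds() also accepts.

```javascript
// Sketch only: shapefileBounds() is an illustrative helper, not part of
// shp.js. X values are longitudes and Y values are latitudes.
function shapefileBounds(shp) {
  return {
    south: shp.minY,
    west: shp.minX,
    north: shp.maxY,
    east: shp.maxX
  };
}
```

With a Map object stored in a variable named map, the zoom step could then be written as map.fitBounds(shapefileBounds(shp)).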

Now we're ready to loop through the shapefile's geometries, which can be accessed as an array by reading the records property. A loop is set up that runs from 0 up to, but not including, the length of the records array. Within the loop, the expression shp.records[i].shape retrieves the geometry of the record at position i in the array and stores it in the variable shape. The latitude and longitude of the point geometry are then obtained using the expressions shape.content.y and shape.content.x. These values are passed to the LatLng constructor to create an object of that class, stored in the variable pt.

The DBFParser is set up similarly, in that the object it returns has a records property for getting the array of tabular data. So, the expression dbf.records[i] returns the tabular record at position i, which is stored in the variable dbfRecord. That variable itself holds an array of values associated with row i of the table. The array is associative, which means a value in it can be accessed by specifying the desired column name. For this map, I was interested in only the city name and population for the marker's info window.
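To tie the geometry and attribute lookups together, here is a small sketch of the loop. It relies only on the records, shape and content.x/content.y properties described above. The collectCities() function and the column names passed to it are hypothetical; substitute whatever columns your own dbf file contains.

```javascript
// Sketch only: collectCities() is an illustrative helper. It pairs each
// point geometry with the matching row of the dbf table.
function collectCities(shp, dbf, nameColumn, popColumn) {
  var cities = [];
  for (var i = 0; i < shp.records.length; i++) {
    var shape = shp.records[i].shape; // geometry of record i
    var dbfRecord = dbf.records[i];   // attributes of record i
    cities.push({
      // y is the latitude and x is the longitude
      position: { lat: shape.content.y, lng: shape.content.x },
      name: dbfRecord[nameColumn],
      population: dbfRecord[popColumn]
    });
  }
  return cities;
}
```

Each object in the returned array holds what createMarker() needs: a position for the marker and the two values to show in its info window.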

Finally, the end of the document contains two Error() functions, which, as you'll recall, were listed when the two Parsers were invoked. These functions output a very basic error message to the JavaScript console in the event there is a problem reading one of the files.

Note

Only the .shp and .dbf files are needed by the Parsers. Other components of a shapefile that are used by Esri software need not be uploaded to the web server for a mashup like this.

Closing other info windows before opening a new one

In testing the map above, you may have noticed that other info windows remain open when you click on markers. While this behavior may be desirable in certain applications, it is probably not how a typical map should operate. To ensure that only one info window is open at any one time, follow these steps:

  1. Save the page to your own machine so that you can edit it.
  2. Remove the var infoWindowOpts and var infoWindow statements from the createMarker() function.
  3. Move to the very beginning of the script element containing your JS code (just above the createMarker() function) and declare a new global variable:

    var infoWindow;

    Note

    Recall from the w3schools tutorial that declaring a variable outside of any particular function gives it a global scope, making it available to any function in the applicable script element.

  4. Now move to the top of the initMap() function and create a new InfoWindow object without supplying an InfoWindowOptions object to the class constructor:

    infoWindow = new google.maps.InfoWindow();
    

    Our approach here will be to create a single InfoWindow object and re-use it throughout the life of the map. Because we're not creating a new InfoWindow on each call to createMarker(), we won't be defining the contents of the InfoWindow using an InfoWindowOptions object as before. Instead, we'll call the setContent() method on the existing infoWindow global variable. Before resetting the info window contents, we'll call its close() method to ensure the previously opened window is closed.

  5. Inside the addListener() method's callback function and just before the infoWindow.open statement, add the following code:

    infoWindow.close();
    infoWindow.setContent(info);

  6. Save and test your document. Consult this working example if you run into problems.
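Putting the steps above together, the click handling might end up looking something like this. The infoWindow global and the close(), setContent() and open() calls come from the steps; makeClickHandler() is a hypothetical helper name I'm using so the handler can be shown apart from the rest of createMarker().

```javascript
var infoWindow; // global: one info window shared by every marker

// Sketch only: returns the function to run when a marker is clicked.
function makeClickHandler(map, marker, info) {
  return function () {
    infoWindow.close();           // close whichever window was open before
    infoWindow.setContent(info);  // swap in this marker's content
    infoWindow.open(map, marker); // reopen the same window on this marker
  };
}
```

Inside createMarker(), the handler would be registered with something like google.maps.event.addListener(marker, 'click', makeClickHandler(map, marker, info)), with infoWindow having been assigned in initMap() as in step 4.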

Loading multiple shapefiles of different geometry types

This first example was relatively simple in that it dealt with a single point shapefile. The second shapefile loader example ups the ante by handling multiple shapefiles of different geometry types.

In testing the behavior of this map, you may notice that it is difficult to click on an interstate feature because of the presence of the counties polygon layer. If you zoom in to the interstate feature in the NW part of the map that ends near Butler, PA, you should see that it extends a bit beyond the counties layer. Clicking on that part of the feature should yield the interstate data.

Challenge

If you have any ideas on how to make the interstates clickable, feel free to share them in the Lesson 5 Discussion Forum.

Diving into the source code, start by having a look at the initMap() function. Instead of a shpfile variable holding the name of a single shapefile, this version has a shpfiles variable that holds an array of names. (The page is built to handle any number of shapefiles, performance issues notwithstanding.) A loop is then used to load each of the shapefiles using the same Parsers discussed above.

As before, the load() statements pass control to callback functions named shpLoad() and dbfLoad(). However, these versions of the callback functions have to deal with any number of shapefiles, not just one. This is handled by adding each parsed shp or dbf file to an array (arrShp and arrDbf). A check is then done to see if the lengths of the two arrays match the length of the names array. If they do, then all of the files have been parsed and the render() function can be called.
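A sketch of the multi-file version of the pattern follows. The names shpfiles, arrShp, arrDbf, shpLoad(), dbfLoad() and render() come from the description above. The loadAll() and allLoaded() functions, the renderCalls counter, and the assumption that shpfiles holds names without file extensions are mine.

```javascript
var shpfiles = ['candidate_cities', 'interstates', 'counties'];
var arrShp = [];
var arrDbf = [];
var renderCalls = 0; // illustration only

// Stand-in for the page's render() function.
function render() {
  renderCalls += 1;
}

// True once every shp file and every dbf file has been parsed.
function allLoaded() {
  return arrShp.length === shpfiles.length &&
         arrDbf.length === shpfiles.length;
}

function shpLoad(sh) {
  arrShp.push(sh);
  if (allLoaded()) {
    render();
  }
}

function dbfLoad(db) {
  arrDbf.push(db);
  if (allLoaded()) {
    render();
  }
}

// Sketch only: the parsers are passed in so stand-ins can be used.
function loadAll(shpParser, dbfParser, onShpError, onDbfError) {
  for (var i = 0; i < shpfiles.length; i++) {
    shpParser.load(shpfiles[i] + '.shp', shpLoad, onShpError);
    dbfParser.load(shpfiles[i] + '.dbf', dbfLoad, onDbfError);
  }
}
```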

Looking at the render() function, it now needs to determine the lat/long bounds of all of the specified shapefiles combined, not just of one. To do this, a LatLngBounds() object is created and stored in the bounds variable. A loop is then used to process each of the shapefiles. On each pass through the loop, a LatLngBounds object representing the bounding box for just the current shapefile is constructed and then unioned with the all-shapefile bounding box (held in bounds). So, because the shapefiles were specified in the order candidate_cities - interstates - counties, bounds starts out relatively small on the first pass through the loop, expands quite a bit on the second pass, and finally expands slightly more on the last pass.
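The growing bounding box can be illustrated with plain numbers. The page itself does this with LatLngBounds objects and their union() method; the sketch below is a simplified stand-in that assumes each parsed shapefile exposes minX, minY, maxX and maxY, and that the data does not cross the 180° meridian. combinedBounds() is a hypothetical name.

```javascript
// Sketch only: a plain-number version of unioning bounding boxes.
function combinedBounds(shapefiles) {
  var box = { south: Infinity, west: Infinity, north: -Infinity, east: -Infinity };
  for (var i = 0; i < shapefiles.length; i++) {
    var s = shapefiles[i];
    // Stretch the running box just enough to take in the current shapefile.
    box.south = Math.min(box.south, s.minY);
    box.west = Math.min(box.west, s.minX);
    box.north = Math.max(box.north, s.maxY);
    box.east = Math.max(box.east, s.maxX);
  }
  return box;
}
```

The order in which the shapefiles are processed affects only how the box grows along the way; the final result is the same.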

As in the previous example, a loop is used to process all of the shapefile records. In this case, that loop (the j loop) is embedded within the i loop, which iterates through all of the specified shapefiles.

The previous map loaded only a point shapefile, so it was hard-coded to create only Marker objects. This map needs to handle points, lines and polygons, so a check has been added to determine what kind of geometry the current shapefile holds. The switch construct used here may seem foreign to you unless you're an experienced programmer. It is basically an alternative to using if-else if and in this situation causes one block of code to be executed when shape.type is 1 (point data), another block when shape.type is 3 (line data), and another block when shape.type is 5 (polygon data).
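If switch is new to you, this stripped-down sketch shows its shape. The type codes 1, 3 and 5 are the ones mentioned above; describeGeometry() and the strings it returns are placeholders of my own for the blocks of map-drawing code in the real page.

```javascript
// Sketch only: each case stands in for a block of rendering code.
function describeGeometry(shapeType) {
  var kind;
  switch (shapeType) {
    case 1: // point data
      kind = 'point';
      break;
    case 3: // line data
      kind = 'line';
      break;
    case 5: // polygon data
      kind = 'polygon';
      break;
    default: // any other shape type is not handled here
      kind = 'unsupported';
  }
  return kind;
}
```

Each break statement ends its case. Without it, execution would "fall through" into the next case's code.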

As we saw above, retrieving the lat and long values from a parsed point shapefile can be done using the expressions shape.content.y and shape.content.x.

When dealing with a line shapefile, each geometry is composed of a sequence of vertices, not just one point. These vertices are accessed using the expression shape.content.points. The vertices returned by the points property are immediately passed to a function called pathToArray(), which turns them into an array of LatLng objects. That array is then used to set the path property of a new Polyline object.
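To give a feel for what a conversion function of this kind has to do, here is a guess at the general idea rather than a copy of pathToArray() itself. It assumes the points property is a flat array of alternating x and y values (x0, y0, x1, y1, ...) and returns plain lat/lng objects, which a Polyline's path will also accept. The name flatPointsToPath() is hypothetical.

```javascript
// Sketch only: assumes points is laid out as [x0, y0, x1, y1, ...].
function flatPointsToPath(points) {
  var path = [];
  for (var i = 0; i < points.length; i += 2) {
    // points[i] is the x (longitude); points[i + 1] is the y (latitude)
    path.push({ lat: points[i + 1], lng: points[i] });
  }
  return path;
}
```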

Polygon shapefiles are complicated by the fact that polygon features can be composed of multiple parts. Thus, the polygon rendering code begins by obtaining the parts of the current feature. If the length of the parts array is 1 (i.e., the feature has only 1 part), then all of the vertices can be passed to the pathToArray() function to convert them to LatLng objects. Otherwise, a loop is used (the k loop) to put together the right set of vertices for each part. The shape.content.points.subarray(2 * parts[k], 2 * parts[k + 1]) expression deserves a closer look. In the shapefile format, each entry in the parts array is the index of the first vertex of that part. The subarray() method grabs a subset of the points array, and the most likely reason the part indexes are multiplied by 2 is that points is a flat array of alternating x and y values, so the vertex at position n starts at array index 2n. In any case, you don't need to understand all of the code to put it to work for yourself.
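One plausible reading of the subarray() expression can be tried out on a small made-up example. The sketch assumes that points is a typed array of alternating x and y values and that each entry of parts is the index of a part's first vertex, as in the shapefile format. splitParts() is a hypothetical name, and the explicit handling of the last part is my own addition, since parts[k + 1] does not exist there.

```javascript
// Sketch only: splits a flat [x0, y0, x1, y1, ...] typed array into one
// sub-array per part.
function splitParts(points, parts) {
  var rings = [];
  for (var k = 0; k < parts.length; k++) {
    // Vertex n occupies array slots 2n and 2n + 1, hence the doubling.
    var start = 2 * parts[k];
    var end = (k + 1 < parts.length) ? 2 * parts[k + 1] : points.length;
    rings.push(points.subarray(start, end));
  }
  return rings;
}

// Two made-up parts of three vertices each; the second starts at vertex 3.
var demoPoints = new Float64Array([0, 0, 1, 0, 1, 1, 5, 5, 6, 5, 6, 6]);
var demoRings = splitParts(demoPoints, [0, 3]);
```

Here demoRings holds two sub-arrays of six numbers each (three x/y pairs per part), each of which could then be converted to a path.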

Two more parts of this example could use some further explanation. Note that after the switch block, the data associated with record j in the dbf file is passed to a function called recordHtmlContent(). That function iterates through each of the keys (which correspond to columns) in the data, putting each key and its associated value on a separate line.
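A function along these lines could be sketched as follows. It is an illustration of the idea rather than the actual recordHtmlContent() code: the name recordToHtml(), the "key: value" layout and the use of a br tag as the line separator are my own choices, and the values are not HTML-escaped.

```javascript
// Sketch only: builds one "column: value" line per key in a dbf record.
function recordToHtml(record) {
  var lines = [];
  for (var key in record) {
    // Skip anything inherited from the object's prototype.
    if (Object.prototype.hasOwnProperty.call(record, key)) {
      lines.push(key + ': ' + record[key]);
    }
  }
  return lines.join('<br>');
}
```

Because the function loops over whatever keys it finds, it works for any shapefile's attribute table without needing to know the column names ahead of time.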

That bit of HTML returned by recordHtmlContent() is then passed to a function called handle_clicks(), which displays the HTML in an InfoWindow.

With that, you've seen how overlay data can be read in from a shapefile.  Seeing the distribution of features on the map is great, but the user may also like to have the ability to search for a feature by its name or some other attribute and find out where it is located on the map. Providing that ability is the subject of the next section.