GEOG 863
GIS Mashups for Geospatial Professionals

Building a MySQL Database

PrintPrint

In this part of the lesson, you'll learn methods for creating and populating tables in a MySQL database. Creating tables can be done using a web-based application called phpMyAdmin or a desktop application called MySQL Workbench. I will provide instructions for both applications.

After defining your tables, you'll download a special driver that will enable you to connect to the MySQL database using MS-Access. Using Access as a "front end," you'll have a user-friendly GUI for populating your MySQL tables and querying them.

Every step in this process could be completed using command-line SQL. (In fact, the GUI-based tools we'll be using simply generate and execute SQL code.) If you're an SQL expert, you're welcome to use SQL interfaces to complete these steps instead. My assumption is that most of you, like me, are not SQL experts and will find this methodology to be less frustrating.

MySQL Data Types

The data types used in MySQL for storing text strings are:

  • CHAR: used for columns with values of an exact length
    Ex: CHAR(2) for state abbreviations like "PA" and "NY"
  • VARCHAR: used for columns with values of variable lengths up to 255
    Ex: VARCHAR(25) for names
  • TEXT: used for free-form text such as comments of up to ~65000 characters
  • MEDIUMTEXT: free-form text of up to ~1.6 million characters
  • LONGTEXT: free-form text of up to ~4 trillion characters

The data types for storing integers are shown in the table below. As the table shows, columns of these types can be signed or unsigned. Unsigned columns can only hold positive values. Signed columns can hold both positive and negative values.

Data Types for Storing Integers
Type Signed Range Unsigned Range
TINYINT -128 to 128 0 to 255
SMALLINT -32768 to 32767 0 to 65535
MEDIUMINT -8.3M to 8.3M 0 to 16M
INT -2.1B to 2.1B 0 to 4.2B
BIGINT > ±2.1B > 4.2B

For numbers that have digits after the decimal point, there are two main data types:

  • FLOAT: used for numbers with up to 7 significant digits
  • DOUBLE: used for numbers with up to 16 significant digits

The data type limits above are approximate in many cases. To find more specific limits and to see other supported data types such as those dealing with dates and times, refer to the MySQL documentation.

Defining Tables

For this Jen and Barry's demonstration project, we're going to create three tables:


candidate_cities
Name Type
name VARCHAR(25)
population MEDIUMINT
x DOUBLE
y DOUBLE

counties
Name Type
name VARCHAR(25)
workforce MEDIUMINT
farms

SMALLINT

pop_density SMALLINT

county_geom
Name Type
vid MEDIUMINT
name VARCHAR(25)
part TINYINT
x

DOUBLE

y DOUBLE

You can define these tables using either phpMyAdmin or the MySQL Administrator:

  • Using phpMyAdmin
    To connect to your database:
    1. In a web browser, go to phpMyAdmin.
    2. Enter the Username and Password found in the file named dbinfo at the root of your personal web space (e.g., I can view my file at http://www.personal.psu.edu/jed124/dbinfo). Your database username is the second item in the file; your password is the third.
    3. After you've successfully logged in, you need to select which database you want to work with. To do this, click on the Databases button at the top of the screen, then on the database that's named the same as your PSU access account ID.

    To create a table:
    1. When entering a database page, you’ll see a section of the page labeled Create new table on database <your access account ID>. In this case, enter candidate_cities in the Name box and 4 in the Number of fields box. Then click the Go button on the right side of the window.
    2. You should see 4 rows of textboxes, one for each field you said you wanted to create. Enter name in the first Field box.
    3. From the Type drop-down list, select VARCHAR.
    4. In the Length/Values box, enter 25.
    5. Repeat the previous 3 steps to add the population, x and y columns to the table (keeping in mind that numeric fields don’t require a “Length/Values” setting).
    6. After you've entered all of the field settings, click the Save button in the lower right of the window.
    7. Click on the Database link at the top of the page to create more tables.
    8. Repeat the steps above to add the other two tables (counties and county_geom) to your database.
  • Using MySQL Workbench
    To install MySQL Workbench:
    1. Browse to MySQL Workbench.
    2. Near the bottom of the page, click the Download button next to Windows (x86, 32-bit) MSI Installer.
    3. Create a MySQL.com account if you like or click No thanks, just take me to the downloads!
    4. Click the HTTP link of the server nearest you to begin download.
    5. Run the downloaded .msi file and follow the on-screen directions.

    To connect to your database:
    1. Open MySQL Administrator (Start > MySQL > MySQL Workbench).
    2. Under the SQL Development heading, click the New Connection option.
    3. In the Setup New Connection dialog, enter a Connection Name of your choosing (such as PSU PHPDB).
    4. Enter the Hostname: instructdb3.ait.psu.edu.
    5. Enter the Username found in the file named dbinfo at the root of your personal web space (e.g., I can view my file at http://www.personal.psu.edu/jed124/dbinfo) and click OK. Your database username is the second item in the file; your password is the third. The Setup New Connection dialog will disappear and you should see your connection listed in the box under the SQL Development heading.
    6. Click the Open Connection to Start Querying option just above the box. A Connect to Database dialog should appear.
    7. Select the connection you just created from the Stored Connection list. The Hostname and Username boxes should be populated with your connection parameters automatically.
    8. Click OK and then enter the password found in your dbinfo file when prompted. You should see a schema (i.e., a database) with your user ID and beneath it a list of Tables, Views and Routines (each list being empty at this point).

    To create a table:
    1. Click the Create a New Table button just above the large schema heading near the middle of the window. (It is the 3rd of 5 buttons.) A new_table dialog will appear with a number of tabs running along the bottom.
    2. Under the Table tab, give the table the name candidate_cities.
    3. Switch to the Columns tab where you should see a cursor flashing in a text box beneath the Column Name heading.
    4. Enter name in the text box (this column will hold the names of the cities), then hit Tab or Enter.
    5. Set the column's Datatype to VARCHAR(25).
    6. Double-click the empty space under the column you just defined to begin defining a new column.
    7. Repeat steps 3-5 to add the other three candidate_cities columns.
    8. Repeat the steps above to add the other two tables (counties and county_geom) to your database.
    9. Right-click on the idnew_table column that was defined by default and select Delete Selected Columns.
    10. Click Apply, then Apply SQL and Finish to create the table.
    11. When done creating tables, click Close to dismiss the new_table dialog.

Although we didn't need them in this scenario, you may have noticed the checkboxes to the right of the Datatype heading. I won't go through all of them, but the ones that I've used most often are PK (to declare a column as a primary key), NN (to disallow Null values in a column), AI (to automatically increment the value in that column by 1 when a new record is added), and Default (to specify the default value assigned to the column when a new record is added).

Later in the lesson, we'll populate these tables. As you might have guessed, name is the key field in the two county tables that will allow us to join the attribute information with the geometries. If you were to build a real mashup in this sort of scenario, I would recommend using a column of shorter length (preferably a numeric column, like the U.S. Census Bureau's FIPS) as the key field. When we populate the county_geom table with the vertices of the features in the counties shapefile, it will have over 5600 records (107 vertices for Thompson County, 44 for Viceroy County, etc.) Each vertex record will contain the name of its parent county. This is a needless waste of space, and probably more importantly, table joins and sorts based on long character strings such as these perform more slowly than those based on numeric IDs. The shapefile didn't include a shorter ID and I don't think it's worth the effort to add one, so just keep in mind that we're not following a best practice here.

Installing the MySQL ODBC driver

This part of the process involves downloading and installing a driver that will enable you to view MySQL data within a Microsoft product, such as Access.

  1. Download the appropriate driver at: http://dev.mysql.com/downloads/connector/odbc/. Getting the right driver/combination of drivers has been tricky for some folks in the past. If you're on a 64-bit machine, I suggest starting with the 64-bit driver. If the connection steps below don't work, try installing the 32-bit driver too (leaving the 64-bit driver installed). If you're on a 32-bit machine, I think the 32-bit driver by itself should work.
  2. Under Windows downloads, click Pick a mirror.
  3. Create a MySQL.com account if you like or click No thanks, just take me to the downloads!
  4. Click the HTTP link of the server nearest you to begin download.
  5. Run the downloaded .msi file and follow the on-screen directions.

Setting up an ODBC connection

In this part of the tutorial, you'll specify parameters that can be used by Access to connect to your MySQL database.

  1. Open the ODBC Data Source Administrator by going to:
    • Start > Control Panel > Administrative Tools > Data Sources (XP).
    • Start > Control Panel > System and Maintenance > Administrative Tools > Data Sources (Vista).
    • Start > Control Panel > System and Security > Administrative Tools > ODBC Data Sources (Windows 7/8).
  2. Under the User DSN tab, click Add.
  3. In the Create New Data Source dialog, select the MySQL ODBC 5.x DRIVER (should be near the bottom of the list) and click Finish.
  4. In the resulting dialog, make the following settings:
    • Data Source Name: mysql_<your PSU access account ID> (or any name of your choice)
    • TCP/IP Server: instructdb3.ait.psu.edu
    • User: found in your dbinfo file
    • Password: found in your dbinfo file
  5. Select your database from the Database drop-down list. If you receive an error message, double-check your settings.
  6. Click OK to save the connection settings. You should see your data source listed under the User DSN tab and can dismiss the ODBC Data Source Administrator window and close the Administrative Tools window.

Linking to your MySQL tables in MS-Access

  • In Access 2003:
    1. Open Microsoft Access and select File > New.
    2. In the New File dialog box, select Blank Database.
    3. Specify a name (like mysql_jenbarry.mdb) and location for your database and click Create.
    4. With the new Access database opened, select Tables from the Objects list.
    5. Click New, then Link Table.
    6. In the Files of type field, select ODBC Databases.
    7. Select your ODBC connection under the File Data Source tab and click OK.
    8. Enter your MySQL User ID and Password and click OK.
    9. Select the three tables created in MySQL above and click OK.
      You'll be prompted to select a primary key for each table by picking a column (or combination of columns) that uniquely identifies each record. Note that you must select a primary key if you plan to add or modify data in the table. If you will not be altering the table's records, you can just click OK.
    10. Select NAME as the unique identifier for the candidate_cities and counties tables, and VID for the county_geom table.
  • In Access 2007:
    1. Open Microsoft Access, and from the Getting Started window select Blank Database.
    2. In the Blank Database pane, specify a name (like mysql_jenbarry.accdb) and location for your database and click Create.
    3. The new database will open with a new blank table called Table1. Close this table.
    4. Click the External Data tab, then the More drop-down list in the grouping of buttons labeled Import.
    5. From the drop-down list, select ODBC Database.
    6. In the Get External Data dialog box, select Link to the data source... and click OK. This option will enable you to modify the data in the MySQL tables utilizing Access, whereas the Import the source data... option would create an unlinked snapshot of the MySQL data.
    7. Select your ODBC connection under the File Data Source tab and click OK.
    8. Select the three tables created above and click OK.
      You'll be prompted to select a primary key for each table by picking a column (or combination of columns) that uniquely identifies each record. Note that you must select a primary key if you plan to add or modify data in the table. If you will not be altering the table's records, you can just click OK.
    9. Select NAME as the unique identifier for the candidate_cities and counties tables, and VID for the county_geom table.
  • In Access 2010:
    1. Open Microsoft Access, and from the Getting Started window select Blank Database.
    2. In the Blank Database pane, specify a name (like mysql_jenbarry.accdb) and location for your database and click Create.
    3. The new database will open with a new blank table called Table1. Close this table.
    4. Click the External Data tab, then ODBC Database.
    5. In the Get External Data dialog box, select Link to the data source... and click OK. This option will enable you to modify the data in the MySQL tables utilizing Access, whereas the Import the source data... option would create an unlinked snapshot of the MySQL data.
    6. Select your ODBC connection under the File Data Source tab and click OK.
    7. Select the three tables created above and click OK.
      You'll be prompted to select a primary key for each table by picking a column (or combination of columns) that uniquely identifies each record. Note that you must select a primary key if you plan to add or modify data in the table. If you will not be altering the table's records, you can just click OK.
    8. Select NAME as the unique identifier for the candidate_cities and counties tables, and VID for the county_geom table.

Populating a table through manual entry

The Jen and Barry's candidate_cities shapefile (the same dataset used in Lesson 3) contains only four records and just a few attributes, so its data can be entered manually.

  1. Load the shapefile into ArcMap and open its attribute table.
  2. Add two new Double fields to the attribute table named X and Y.
  3. Right-click on the X field header and select Calculate Geometry.
  4. Choose X Coordinate of Point from the Property list and confirm that the values will be in the GCS/NAD27 coordinate system and in Decimal Degree units. Click OK.
  5. Repeat this step to populate the Y field.
  6. Leaving ArcMap and the attribute table open, return to your Access database and open the candidate_cities table.
  7. Manually enter the values for the four candidate cities. You may notice a slight delay after entering a value and tabbing into the next cell. This is because the values must be transferred across the Internet to your MySQL database hosted at PSU, instead of being stored on a local disk.

Adding the X and Y coordinate values to the attribute table as outlined above is my recommended strategy when you are looking to transfer the coordinates of point features to a database. In the next section, we'll see how to insert records into a MySQL table in bulk, rather than manually. Later, we'll see how to transfer the coordinates of line or polygon features to a database.

Populating a table using an SQL query

The query writing tutorial you completed earlier in the lesson had you dealing exclusively with SELECT queries. We're now going to add the county feature attributes to the database by importing a copy of the shapefile's DBF table and writing what Access calls an Append query to add values from selected columns in the local counties table to the MySQL counties table.

  • Access 2003:
    1. In Access, click New in the database window, then select Import Table and click OK.
    2. In the Files of type drop-down list, select any of the dBASE (dbf) options.
    3. Browse to the location of the Jen and Barry's counties shapefile, select the counties.dbf table and click Import. After the table has been successfully imported, click Close.
    4. Double-click on the counties table to open it and confirm that the values imported correctly. Note that the icon beside the counties table in the database window indicates a table whose values are stored locally as part of the .mdb file, whereas the icon beside the tables added earlier indicates a linked table whose values are stored in some other database.
    5. Now, select Queries from the Objects list.
    6. Double-click on Create query in Design view.
    7. In the Show Tables dialog, double-click on counties to add that table to the query, then click Close.
    8. Just to the left of the Run button (exclamation point icon) is a pull-down menu. Select Append Query from this menu. Upon making this selection, you're asked to specify which table you want to append values to.
    9. Select the linked MySQL counties table from the Table Name drop down list and click OK. Note that an Append To row is added to the design grid between the Sort and Criteria rows.
    10. Double-click on the NAME field from the local counties table to add that field to the design grid. Note that NAME is also automatically entered as the Append To value. What's happened here is that Access scanned through the MySQL counties table for a field by the same name. It found one, so it made the assumption that you wanted to append the values to that field.
    11. Double-click on the AGE_18_64 field from the local counties table to add that field to the design grid. Note that in this case, nothing appears in the Append To area of the design grid because there is no field by that name in the MySQL table.
    12. In the design grid, click in the Append To cell under the AGE_18_64 column and then select WORKFORCE from the drop-down list.
    13. Repeat the step above to specify that you want to append values from the NO_FARMS87 and POP_SQMILE fields in the local counties table to the FARMS and POP_DENSITY fields in the MySQL counties table.
    14. Click the Run button to execute the query. You should receive a confirmation message asking if you really want to append 43 records. Click Yes.
    15. Open the MySQL counties table and confirm that the values were appended correctly. Once satisfied that your MySQL counties table has been populated properly, the local counties table is no longer needed.
    16. Go back to Tables in the database window, right-click on the local counties table and select Delete.
  • Access 2007/2010:
    1. In Access, click the External Data tab, then click on the More drop-down menu in the grouping of controls labeled Import. Select dBASE File from that menu.
    2. In the Get External Data dialog, click the Browse button to locate the counties.dbf file, then click OK to import the data. After the table has been successfully imported, click Close.
    3. Double-click on the counties table in the Navigation Pane to open it and confirm that the values imported correctly. Note that the icon beside the counties table in the Navigation Pane indicates a table whose values are stored locally as part of the .mdb file, whereas the icon beside the tables added earlier indicates a linked table whose values are stored in some other database.
    4. Now, create a new query by clicking the Create tab and the Query Design button.
    5. In the Show Tables dialog, double-click on counties to add that table to the query, then click Close.
    6. In the group of controls labeled Query Type, click the Append button. Upon making this selection, you're asked to specify which table you want to append values to.
    7. Select the linked MySQL counties table from the Table Name drop down list and click OK. Note that an Append To row is added to the design grid between the Sort and Criteria rows.
    8. Double-click on the NAME field from the local counties table to add that field to the design grid. Note that name is also automatically entered as the Append To value. What's happened here is that Access scanned through the MySQL counties table for a field by the same name. It found one, so it made the assumption that you wanted to append the values to that field.
    9. Double-click on the AGE_18_64 field from the local counties table to add that field to the design grid. Note that in this case, nothing appears in the Append To area of the design grid because there is no field by that name in the MySQL table.
    10. In the design grid, click in the Append To cell under the AGE_18_64 column and then select WORKFORCE from the drop-down list.
    11. Repeat the step above to specify that you want to append values from the NO_FARMS87 and POP_SQMILE fields in the local counties table to the FARMS and POP_DENSITY fields in the MySQL counties table.
    12. Click the Run button to execute the query. You should receive a confirmation message asking if you really want to append 43 records. Click Yes.
    13. Open the MySQL counties table and confirm that the values were appended correctly. Once satisfied that your MySQL counties table has been populated properly, the local counties table is no longer needed.
    14. Right-click on the local copy of the counties table in the Navigation Pane and select Delete.

Note

Look at the SQL code generated by the GUI and note that what Access refers to as an Append query is actually an INSERT query in standard SQL syntax. Though we didn't do it in this tutorial, other attributes of polygon features that you might want to consider storing in a mashup database are the centroid coordinates of the polygons. These values can be added to the attribute table using the same ArcMap Calculate Geometry command utilized above. Those polygon centroid coordinates could then be used to label the polygons with custom markers in lieu of using a third-party labeling extension.

Transferring line/polygon coordinates to a database

Unlike point features that have a 1:1 relationship between their geometries and their attribute table records, line and polygon features have multiple x/y coordinate pairs (vertices) associated with each attribute table record. Thus, the Calculate Geometry method outlined above cannot be used to transfer line/polygon geometries to a database.

In an earlier lesson, you used a custom tool written in Python to export the coordinates of features to an XML file. We're now going to use the other tool in that toolbox to export those coordinates to a comma-separated-values (.csv) text file. This csv file can then be imported into Access and its values appended to the appropriate MySQL table, just as we did in the previous section with the county attributes.

  1. In ArcToolbox, open the Mashup Tools toolbox and run the Export Shapefile to CSV tool.
  2. For the Layer parameter, browse to the Jen and Barry's counties shapefile.
  3. For the ID Field parameter, select the NAME field.

    Note

    Unlike the XML tool, the CSV tool only allows for the selection of one field instead of many. This is based on the assumption that other attributes can be added to the database by importing the attribute table itself as described in the previous section.

  4. Click OK to perform the export, then click Close to dismiss the results dialog.
  5. Import the CSV file into your Access database as outlined above. (Keep in mind that you're importing a Text File rather than a dBASE File.)
  6. Again following the steps laid out in the previous section, append the coordinate records from the local table to the linked MySQL county_geom table.

    Note

    The coordinate fields in the CSV file will be labeled Lat and Lon. Be sure you match these fields up correctly with the fields in the MySQL table: Lat = Y, Lon = X. The file also includes a Part value. This is required for multi-part polygon features (such as the state of Hawaii). None of the counties in the Jen and Barry's scenario have multiple parts, so this value will be 1 for every record.

With that, you've completed the first step in constructing a database-driven mashup. The next step is to write a server-side script to extract the necessary data from the database and feed it to a JavaScript program much like what you've already written. The PHP language is often used for reading data from a database, so that is what we'll be using. The next part of the lesson talks about PHP generically and gives you some practice writing basic PHP scripts. Later in the lesson, we'll come back to this database and see how PHP can be used to retrieve data from it.