GEOG 489
Advanced Python Programming for GIS

3.5.2 The lesson in more detail


In this lesson, we will work with some software that is probably new to you, and we will make extensive use of Python packages that we have not used before. The goal is to demonstrate how a complex GIS project can be solved in Python by combining different languages and packages within a Jupyter Notebook. It is therefore a good idea to start with an overview of what will happen in the remainder of the lesson.

  • We already discussed the idea of using Jupyter Notebooks for data analysis projects. We will start this part of the lesson by introducing you to Jupyter Notebook and explaining its basic functionality (Section 3.6) so that you will be able to use it for the remainder of the lesson and in future Python projects.
  • The R programming language has its roots in statistical computing but also comes with a large library of packages providing data analysis methods for many specialized areas. One such package is the ‘dismo’ package for species distribution modeling. The data analysis task for this lesson’s walkthrough will be to generate a species distribution model for the plant species Solanum acaule. The goal is to show you how Python and R functions can be combined within a Jupyter Notebook to solve a fairly complex analysis problem. Species distribution modeling will be discussed further, together with a brief overview of R and the ‘dismo’ package, in Section 3.7.
  • Using pandas to manipulate tabular data will be a significant part of this lesson’s walkthrough. We will use it to clean up the somewhat messy observation data available for Solanum acaule. As preparation, we will teach you the basics of manipulating table data with pandas in Section 3.8.
  • GDAL/OGR will be the main geospatial extension of Python that we will use in this lesson (a) to perform additional data cleaning based on spatial querying and (b) to prepare additional input data (raster data sets for different climatic variables). We therefore provide an overview of its functionality and of typical patterns for using GDAL/OGR in Section 3.9.
  • We will use the Esri ArcGIS API for Python mainly to create an interactive map visualization within a Jupyter Notebook. However, the API has much more to offer and provides an interesting bridge between the FOSS Python data science ecosystem and the proprietary Esri world. We therefore provide an overview of the API in Section 3.10.
  • The lesson’s walkthrough in Section 3.11 will show you a solution to the task of creating a species distribution model for Solanum acaule that combines Python and R and makes use of the different Python packages introduced in the lesson. The walkthrough will be provided as a Jupyter Notebook that you can download and run on your own computer.
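
As a small preview of the kind of table cleanup mentioned for pandas above, the following is a minimal sketch and not the lesson's actual code. The column names (`species`, `lon`, `lat`), the sample values, and the helper `clean_observations` are all hypothetical and chosen only for illustration. It assumes that "messy" here means records with missing coordinates and duplicated records. The real observation data and cleaning steps are covered in Sections 3.8 and 3.11.

```python
import pandas as pd

# Hypothetical observation records; the column names and values are
# made up for illustration and are not the actual lesson data.
observations = pd.DataFrame(
    {
        "species": ["Solanum acaule"] * 4,
        "lon": [-66.1, None, -66.1, -69.5],
        "lat": [-21.9, -16.5, -21.9, -15.4],
    }
)


def clean_observations(df):
    """Drop rows lacking coordinates, then drop exact duplicate records."""
    with_coords = df.dropna(subset=["lon", "lat"])
    return with_coords.drop_duplicates().reset_index(drop=True)


cleaned = clean_observations(observations)
print(cleaned)
```

Both `dropna` and `drop_duplicates` return new data frames, so the original `observations` table stays unchanged and can be compared against the cleaned result.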