Automated feature extraction has long been considered the Holy Grail of remote sensing. One of your esteemed instructors has even written an article on this subject, Automated Feature Extraction: The Quest for the Holy Grail [1], which you should, of course, read. The idea of simply pushing a button and having all the features of interest in an image identified is understandably appealing. That being said, automated feature extraction requires expensive technology and highly trained people, and the task should only be undertaken if economies of scale are present. For example, developing an algorithm to count planes from a single satellite image of an airport would be a waste of time; it would be quicker, more accurate, and more cost-effective to do it manually. An algorithm that could count planes from a satellite image of any airport in the world, however, would be very valuable, as doing that job manually would be extremely costly and require a large number of human analysts.
Automated feature extraction typically requires high-quality data. Changes in an image, such as shifts in tone or in the direction of shadows, may have little to no effect on a human analyst but can wreak havoc on an automated approach. The success of an automated workflow often hinges on the quality of the remotely sensed data along with the methods used to preprocess those data. Recent work points to the advantage gained from integrating multiple types of remotely sensed data into the feature extraction process, particularly if the modalities complement each other (e.g., imagery and lidar). One of the points made in Automated Feature Extraction: The Quest for the Holy Grail [1] is that a great advantage of the human analyst is the ability to perceive depth in 2D imagery from the presence of shadows. Lidar, when integrated with imagery into an automated workflow, more closely approximates this unique recognition ability of the human analyst.
It would be impossible to cover all the approaches to automated feature extraction, as the field is changing rapidly. One of the most recent advances has been the move away from pixel-based approaches toward object-based approaches, particularly when it comes to extracting features from high-resolution data. In keeping with the times, this lesson will focus on applying object-based methods to automated feature extraction tasks, making full use of the elements of image interpretation you studied in the previous lesson.
In this lesson, you will learn the techniques, tools, and procedures used to automatically extract information from remotely sensed data. Although "nothing beats a human," the reality is that, due to the vast amounts of remotely sensed data being acquired and the relatively slow (and thus costly) rate at which human interpreters work, automation is necessary if we are to turn these data into actionable information. In the civilian remote sensing community, the common term for such automated approaches is "classification." "Image classification" has an entirely different meaning in the defense community, which is why the term "terrain categorization" (TERCAT) was adopted there. Given the confusion surrounding the term "classification" and the fact that "terrain categorization" does not adequately describe all that this segment of remote sensing comprises, we will use the term "feature extraction" instead.
At the end of this lesson, you will be able to:
If you have any questions now or at any point during this week, please feel free to post them to the Lesson 4 Questions and Comments Discussion Forum in Canvas.
NOTE: There is a critical typographic error in the textbook formula for normalized difference on page 339. There should be absolute value brackets around the numerator, and the denominator should contain a plus sign rather than a minus sign (i.e., the sum of the two terms rather than their difference). The result should always be a positive number; a negative normalized difference would not make physical sense.
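In its general form, a normalized difference is computed as |a - b| / (a + b), so the result is bounded and never negative. Below is a minimal Python sketch of that general form; the variable names and sample values are illustrative only and are not taken from the textbook:

import numpy as np

def normalized_difference(a, b):
    """General form of a normalized difference: |a - b| / (a + b).
    The absolute value in the numerator keeps the result non-negative."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.abs(a - b) / (a + b)

# Illustrative values only: the result is the same regardless of argument order.
print(normalized_difference(0.45, 0.30))  # ~0.2
print(normalized_difference(0.30, 0.45))  # ~0.2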
Digital image classification uses the spectral information and spatial information contained in an image, which is related to the composition or condition of the target surface. Image analysis can be performed on multispectral as well as hyperspectral imagery. It requires an understanding of the way materials and objects of interest on the earth's surface absorb, reflect, and emit radiation in the visible, near-infrared, and thermal portions of the electromagnetic spectrum.
In order to make use of image analysis results in a GIS environment, the source image should be orthorectified so that the final image analysis product, whatever its format, can be overlaid with other imagery, terrain data, and other geographic data layers. Classification results are initially in raster format, but they may be generalized to polygons with further processing.
There are several core principles of image analysis that pertain specifically to the extraction of information and features from remotely sensed data.
The extraction of information from remotely sensed data is frequently accomplished using statistical pattern recognition; land-use/land-cover classification is one of the most frequently used analysis methods (Jensen, 2005). Land cover refers to the physical material present on the earth’s surface; land use refers to the type of development and activities people undertake in a particular location. The designation of “woodland” for a tree-covered area is a land cover classification; the same woodland might be designated as “recreation area” in a land use classification.
While certain aspects of digital image classification are completely automated, a human image analyst must provide significant input. There are two basic approaches to classification, supervised and unsupervised, and the type and amount of human interaction differs depending on the approach chosen.
Classification schemes may consist of hard, discrete categories; in other words, each pixel is assigned to one, and only one, class. Fuzzy classification schemes allow a proportional assignment of multiple classes to each pixel. The entire image scene may be processed pixel by pixel, or the image may be decomposed into homogeneous image patches for object-oriented classification. As stated by Jensen (2005), "no pattern classification method is inherently superior to any other." It is up to the analyst, using his/her knowledge of the problem set, the study area, the data sources, and the intended use of the results, to determine the most appropriate, efficient, and cost-effective approach.
Measuring the accuracy of a classification requires either comparison with ground truth or comparison with an independent result. Errors of omission are committed when an object is left out of its true class (a tree stand that is not classified as forest, for example); errors of commission are committed when an object that does not belong in a class is incorrectly included (continuing the example, the same tree stand being incorrectly classified as wetland is an error of commission for the wetland class).
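Both error types can be read directly from a confusion matrix that compares reference (ground truth) labels against classified labels. The short sketch below uses a small, entirely hypothetical three-class matrix to show the calculation:

import numpy as np

# Rows = reference (ground truth) classes, columns = classified (mapped) classes.
# A hypothetical 3-class confusion matrix: forest, wetland, urban.
classes = ["forest", "wetland", "urban"]
cm = np.array([
    [50,  8,  2],   # reference forest
    [ 4, 30,  1],   # reference wetland
    [ 1,  2, 40],   # reference urban
])

for i, name in enumerate(classes):
    omission = 1 - cm[i, i] / cm[i, :].sum()    # reference objects left out of their true class
    commission = 1 - cm[i, i] / cm[:, i].sum()  # mapped objects wrongly included in this class
    print(f"{name}: omission={omission:.2f}, commission={commission:.2f}")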
Approaches to automated feature extraction can be defined by both the unit of analysis and by the class assignment method.
For the unit of analysis, there are two main approaches: pixel-based and object-based. The unit of analysis in the pixel-based approach is, of course, the individual pixel, or grid cell. The pixel-based approach was the first approach developed; it is conceptually simple and computationally faster than the object-based approach. The pixel-based approach grew out of the Landsat program, in which the data can be largely characterized as spectrally rich, due to the number of bands, and spatially poor, due to the relatively low spatial resolution (~60 m for the first Landsat MSS scenes). In general, pixel-based approaches work by assigning each pixel to a class (category of interest) based on the digital values of each band. The limitation of pixel-based approaches is obvious: they make use of only one of the elements of image interpretation, tone. Pixel-based approaches generally work best for moderate- or coarse-resolution multispectral data or for hyperspectral data.
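To make the idea concrete, here is a minimal sketch of a pixel-based rule that labels each pixel from its spectral values alone (in this case a simple NDVI threshold). The arrays, band values, and the 0.3 threshold are all hypothetical:

import numpy as np

# red and nir are 2D arrays of digital numbers from the corresponding bands.
red = np.array([[ 60,  55, 120], [ 70, 110, 130]], dtype=float)
nir = np.array([[140, 150,  90], [145,  95,  85]], dtype=float)

ndvi = (nir - red) / (nir + red)

# Each pixel is assigned a class from its spectral values alone (tone only).
classified = np.where(ndvi > 0.3, 1, 2)   # 1 = vegetation, 2 = non-vegetation
print(classified)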
Object-based approaches to feature extraction started gaining ground in the remote sensing community in the early part of this century. In an object-based approach, pixels are grouped together into objects, based on their spectral and spatial properties, through the use of a segmentation algorithm. Technically, an object could be a single pixel, but most often it is a collection of pixels. Because an object contains more than one pixel, additional attributes such as the range of values, the deviation of the values, and texture can be computed. Objects are polygons, meaning that they have geometric properties, such as shape and size. Furthermore, unlike pixels, objects have topology and are thus aware of their surrounding objects. Because of this, object-based approaches to automated feature extraction enable one to incorporate all of the elements of image interpretation.
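As an illustration, the sketch below segments a synthetic three-band array into objects and computes a few per-object attributes. It assumes a reasonably recent scikit-image installation, and the array, segment count, and compactness value are placeholders:

import numpy as np
from skimage.segmentation import slic
from skimage.measure import regionprops

# A synthetic 3-band array standing in for a small multispectral image.
rng = np.random.default_rng(0)
image = rng.random((100, 100, 3))

# Group spectrally and spatially similar pixels into objects (segments).
segments = slic(image, n_segments=50, compactness=10, start_label=1)

# With more than one pixel per object, attributes beyond a single value can be computed.
band1 = image[:, :, 0]
for region in regionprops(segments):
    mask = segments == region.label
    print(region.label,
          region.area,                           # object size in pixels
          round(float(band1[mask].mean()), 3),   # mean of band 1 within the object
          round(float(band1[mask].std()), 3))    # within-object variability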
Pixels/objects are assigned to a particular class or category through either an unsupervised or a supervised approach. In an unsupervised approach, a clustering algorithm (e.g., ISODATA) is used to group pixels/objects with similar properties into a set number of classes; it is then up to the analyst to assign meaningful category names to those classes. In a supervised approach (typically referred to as "machine learning" in other fields), training data are used to define the parameters of each class, and pixels/objects are then assigned to a class by a classification algorithm using those parameters. There are dozens, if not hundreds, of supervised classification algorithms, although most remote sensing packages offer only a few. In this lesson, we will focus on the application of rule-based expert systems. Expert systems are a form of knowledge engineering in which the user develops rules to assign categories to pixels/objects. The advantage of rule-based expert systems is that they provide a clear and concise means by which to translate the elements of image interpretation into an automated workflow.
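ISODATA itself is not available in common Python libraries, but k-means is a closely related clustering algorithm and serves to illustrate the unsupervised step. Everything in the sketch below (the random image, the number of clusters, and the class names) is hypothetical:

import numpy as np
from sklearn.cluster import KMeans

# A random 4-band image standing in for real multispectral data,
# flattened to a (n_pixels, n_bands) table for clustering.
rng = np.random.default_rng(1)
bands = rng.random((200, 200, 4))
pixels = bands.reshape(-1, 4)

# Unsupervised step: cluster the pixels into a fixed number of spectral classes.
kmeans = KMeans(n_clusters=6, n_init=10, random_state=0).fit(pixels)
clusters = kmeans.labels_.reshape(200, 200)

# The analyst must still attach a meaningful land cover name to each cluster,
# typically by comparing the clusters with the source imagery or reference data.
cluster_names = {0: "water", 1: "forest", 2: "urban", 3: "bare soil", 4: "grass", 5: "shadow"}
print(np.unique(clusters))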
Object-oriented image classification involves identification of image objects, or segments, that are spatially contiguous pixels of similar texture, color, and tone (Green and Congalton, 2012). This approach allows for consideration of shape, size, and context as well as spectral content. Relationships between objects can play an important role in their identification and classification. Object-oriented methods are often more effective than pixel-based methods when classifying high-resolution imagery, because as spatial resolution increases, so does the variability in the spectral content of individual pixels belonging to the same class. Considering groups of pixels as objects helps to overcome the increased complexity of a high-resolution scene due to shadows, changes in vegetation density, or the similar spectral signatures of dissimilar features (e.g., asphalt roof shingles and asphalt road surfaces).
Objects are a more powerful classification unit than pixels because they can be delineated to correspond with physical features in the landscape; pixels, on the other hand, are arbitrary "boxes" of spectral content. Object-oriented classifiers more closely mimic the elements of manual interpretation that were studied earlier in the course. Objects created by segmentation have sizes and shapes that can be quantified, as can the distance of one type of object from another (for example, an object with a particular spectral signature may be very close to a road centerline or within a building footprint, as indicated by ancillary data).
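Such contextual measurements are straightforward to compute once objects exist as geometries. The sketch below, with entirely made-up coordinates, uses shapely to measure an object centroid's distance to a road centerline and to test whether it falls within a building footprint:

from shapely.geometry import Point, Polygon, LineString

# Hypothetical object centroid, road centerline, and building footprint
# (coordinates in projected map units, e.g., meters).
object_centroid = Point(105.0, 62.0)
road_centerline = LineString([(0, 60), (200, 60)])
building_footprint = Polygon([(100, 55), (120, 55), (120, 75), (100, 75)])

# Contextual attributes that can feed into a rule- or sample-based classifier.
dist_to_road = object_centroid.distance(road_centerline)        # 2.0 map units
inside_building = building_footprint.contains(object_centroid)  # True
print(dist_to_road, inside_building)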
In summary, any type of digital image classification, be it pixel-based or object-based, will be, at best, semi-automated. The success of automation is in having the computer classify the "easily predictable portions of the landscape while retaining human efforts for final editing and classifying those areas that cannot be reliably predicted." (Green and Congalton, 2012). The larger the area being studied, the greater the potential benefits and savings of semi-automated techniques. As you have seen, both in the reading and through hands-on experience, there is much up-front work and refinement that must be done by the human analyst regardless of the type of digital image processing or mathematical algorithms applied.
Unsupervised classifications are easy to carry out but they are not robust and fail to incorporate the elements of image interpretation. Pixel-based unsupervised classification approaches can perform marginally well on moderate resolution data but are wholly unsuitable for high-resolution feature extraction. Some type of post-processing is always necessary.
PRESENTER: This video will demonstrate a workflow in which a pixel-based unsupervised classification is used to map land cover from Landsat imagery. Unsupervised pixel-based classification approaches work by partitioning the input imagery into a set of output classes, based only on the spectral values. Because they're only making use of the spectral information, they are limited in their utility. They cannot make use of spatial information such as size, shape, or even texture.
One can use the Iso Cluster Unsupervised Classification geoprocessing tool to perform an unsupervised classification. But a more streamlined approach is to use the Image Classification Wizard. To do this, first select the image dataset in the table of contents that you want to classify. And then from the Imagery tab, choose Classification Wizard.
The first step is to configure the classifier. Under Classification Method I'm choosing Unsupervised, and for the classification type I'm going to use pixel-based. Now, I already have an existing classification schema, so I'm going to load that here. If you don't have a classification schema, you can always select the default and modify it later.
Once you've finished entering the configuration parameters, you can click Next. This will move to the next step in the Image Classification Wizard, which is called Train. For an unsupervised classification, the train phase is where you'll enter some of the key parameters. The most common parameter you'll want to modify is the maximum number of classes. This is the total maximum number of output classes that you can have in the resulting classification. You want this to be higher than the number of classes you need. But note that this is the maximum, and it may not be achieved.
I'm going to adjust the maximum number of classes to 10 and leave all the other defaults in place. Clicking Run will produce the initial classification. You'll want to review your output here, and you may want to go back to the previous stage and make some adjustments to the input parameters, such as adjusting the number of classes.
If you're happy with the output, simply click Run in the Classify window to move to the next phase. This will produce another raster dataset with the same number of classes. In the Classify window, you can now click Next to transition to the next phase.
In the assign class phase, I'm going to assign a land cover class to each one of the categories in my unsupervised classification. It's important to note that although categories may be spectrally dissimilar, and thus have unique values in the unsupervised classification, they could belong to the same land cover class.
Prior to assigning classes to the unsupervised classification, you may want to make some modifications to your classification schema. For example, you can right-click, and edit the properties of the schema, giving it a new name, and entering descriptive information.
You can make modifications to individual classes, adjusting their name, color, and default numerical value. No two classes should have the same numerical value. You can also add or remove classes from the existing schema. And then finally, if you're happy with your classification schema and are considering reusing it in the future, be sure to save it using one of the Save options.
Assigning the appropriate land cover class to each one of your unsupervised categories may take some time. You'll want to compare your unsupervised classification to at least your input imagery, and perhaps even to some other reference data that you may have access to. To associate each land cover class with its corresponding unsupervised category, you'll want to select the class in the assign class dialog, click on the Assign Class button, and then click on a pixel belonging to that class in the unsupervised classification.
The unsupervised class assignment information is updated in both the assign class dialog and in the table of contents. Clicking on a class in the assign class table will highlight that particular class in the unsupervised classification raster, helping you understand which pixels belong to that particular category. As was mentioned earlier, one or more unsupervised categories could have the same land cover class.
You'll want to continue this process of associating land cover classes with unsupervised categories, until all of your unsupervised categories have an associated land cover class. As I'm working with Landsat data which is at an angle, I even have a background category that I need to consider for my classification. Once you've completed the class association process, you can click Next to move to the reclassifier stage.
For an unsupervised classification, it's unlikely that you'll need to apply any reclassification routines, so you can click Run to finalize your classification. This produces the final classified raster, which is stored inside your project geodatabase. You have the option of removing any intermediate products from your table of contents to clean up your ArcGIS project.
Unsupervised classification is an easy and straightforward way to get started with feature extraction. However, it should generally only be applied to multispectral imagery, and only in those cases where the land cover classes can be clearly separated by spectral information alone.
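The same workflow can also be scripted rather than run through the Wizard, using the Iso Cluster Unsupervised Classification geoprocessing tool mentioned in the video. A minimal arcpy sketch is shown below; it assumes ArcGIS Pro with a Spatial Analyst license, and the file paths and the maximum of 10 classes are placeholders:

# Requires ArcGIS Pro with a Spatial Analyst license; paths are placeholders.
import arcpy
from arcpy.sa import IsoClusterUnsupervisedClassification

arcpy.CheckOutExtension("Spatial")

in_raster = r"C:\data\landsat_scene.tif"                           # multiband Landsat raster
out_raster = IsoClusterUnsupervisedClassification(in_raster, 10)   # 10 = maximum number of classes
out_raster.save(r"C:\data\landsat_unsupervised.tif")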
Rule-based expert systems have the advantage that they provide a mechanism to replicate the human cognitive process by progressively building the amount of information available for classification, enabling the use of contextual information in the feature extraction process. The rule-based approach also makes it straightforward to determine the root cause of a classification error, and a well-designed expert system can be reused for projects that have similar data inputs. The downside of expert systems is that it can be easy to overfit a solution. They can be time-consuming to construct, and as development of the system progresses, there are diminishing returns when it comes to improvements in accuracy; in some cases, additional rules may even reduce accuracy.
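A minimal sketch of the idea is shown below: each image object carries a few attributes (here hypothetical NDVI, lidar-derived height, and area values), and simple rules assign a class in much the same way an analyst would reason through the evidence. The thresholds are illustrative, not prescriptive:

import numpy as np

# Hypothetical per-object attribute table produced by segmentation:
# each row is one image object with mean NDVI, mean height above ground (m), and area (m2).
objects = np.array([
    # ndvi, height_m, area_m2
    [0.65,  0.3,  500.0],
    [0.70, 12.0,  800.0],
    [0.05,  9.0,  250.0],
    [0.02,  0.1, 1500.0],
])

labels = []
for ndvi, height, area in objects:
    # Rules progressively narrow the possibilities, mimicking how an analyst reasons.
    if ndvi > 0.4 and height > 3.0:
        labels.append("tree canopy")
    elif ndvi > 0.4:
        labels.append("grass/shrub")
    elif height > 3.0:
        labels.append("building")
    else:
        labels.append("pavement/bare ground")

print(labels)  # ['grass/shrub', 'tree canopy', 'building', 'pavement/bare ground']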
The example rule set posted on Canvas demonstrates a workflow for extracting baseball fields from aerial imagery. Baseball fields do not have a single signature; rather, they are a compilation of features.
Support Vector Machines (SVM) and Classification and Regression Trees (CART) are supervised, sample-based machine learning approaches to classification. They are easy to implement but are somewhat of a brute-force approach to feature extraction: they can incorporate many variables but fall short when it comes to incorporating contextual information. SVM and CART classifications can be coupled with expert systems to resolve classification issues.
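The sketch below shows what training and applying the two classifiers looks like with scikit-learn, using a tiny, entirely hypothetical per-object attribute table (NDVI, height, texture) as the training data:

import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical per-object training samples: [mean NDVI, mean height (m), texture].
X_train = np.array([
    [0.70,  0.2, 0.05], [0.65,  0.4, 0.08],  # grass
    [0.60, 15.0, 0.30], [0.55, 12.0, 0.25],  # trees
    [0.05,  8.0, 0.10], [0.10,  6.0, 0.12],  # buildings
])
y_train = ["grass", "grass", "tree", "tree", "building", "building"]

# SVM and CART are both trained from the same samples.
svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
cart = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Classify two new, unlabeled objects.
X_new = np.array([[0.62, 14.0, 0.28], [0.08, 7.0, 0.11]])
print(svm.predict(X_new))
print(cart.predict(X_new))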
The SVM/CART example on Canvas illustrates how to apply SVM/CART classifiers using point training data. The point classes are first transferred to objects, which in turn are used as the training dataset.
Deep learning approaches in remote sensing feature extraction most often take the form of supervised machine learning using convolutional neural networks. Deep learning can be very robust for extracting features that are well defined and numerous (e.g., buildings and cars). In order for a deep learning approach to be successful, numerous (sometimes tens of thousands of) training samples are required. Convolutional neural networks require considerable GPU processing and often have to be run in a high-performance computing environment.
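For a sense of what such a network looks like in code, here is a minimal PyTorch sketch of a small convolutional classifier for image chips. The architecture, chip size, and class count are arbitrary placeholders; a real workflow would also need data loading, a training loop, and far more labeled samples:

import torch
import torch.nn as nn

# A minimal convolutional network for classifying small image chips
# (e.g., building vs. non-building).
class ChipClassifier(nn.Module):
    def __init__(self, n_bands=3, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_bands, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, n_classes),  # assumes 64x64 input chips
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = ChipClassifier()
chips = torch.rand(8, 3, 64, 64)   # a batch of 8 hypothetical 64x64 RGB chips
logits = model(chips)              # shape: (8, 2)
print(logits.shape)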
The sample convolutional neural network posted to Canvas demonstrates a workflow for extracting data from a scanned topographic map. Note that your computer must have a robust GPU in order to execute the workflow.
For any feature extraction project, it is rare that you will have only one dataset as your input. In many cases, there may be existing, previously mapped thematic datasets that can help to inform your classification. The data fusion example posted to Canvas illustrates a number of these concepts.
The example is of an area in which the building dataset is in need of updating. The goal is not to map all buildings, but only those buildings that have been removed, modified, or added.
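One simple way to flag candidate changes is to compare the existing building layer against building objects extracted from the new imagery. The sketch below assumes geopandas and two hypothetical polygon files; it only flags candidates for removal and addition, leaving modified buildings to further analysis:

import geopandas as gpd

# Hypothetical inputs: the existing building layer and the building objects
# extracted from the new imagery, both polygon layers in the same CRS.
existing = gpd.read_file("existing_buildings.shp")
extracted = gpd.read_file("extracted_buildings.shp")

# Existing buildings with no overlapping extracted object are candidates for "removed";
# extracted objects with no overlap in the existing layer are candidates for "added".
removed = existing[~existing.intersects(extracted.unary_union)]
added = extracted[~extracted.intersects(existing.unary_union)]

print(len(removed), "candidate removed buildings;", len(added), "candidate added buildings")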
Find a project website or article that describes a project that employed an object-based approach to feature extraction. In your post, discuss if the feature extraction workflow was able to incorporate spectral, geometric, textural, and contextual elements into the workflow. Comment on at least one post aside from your own.
If you have anything you'd like to comment on, or add to the lesson materials, feel free to post your thoughts in the Lesson 4 General Questions and Comments Forum. For example, what did you have the most trouble with in this lesson? Was there anything useful here that you'd like to try in your own workplace?
Links
[1] http://www.lidarmag.com/PDF/LiDAR_Magazine_Vol1No1_Oneil-Dunne.pdf