International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B7, 2012
XXII ISPRS Congress, 25 August – 01 September 2012, Melbourne, Australia
ASSESSING THE SIGNIFICANCE OF HYPERION SPECTRAL BANDS IN FOREST
CLASSIFICATION
G. J. Newnham a, D. Lazaridis b, N. C. Sims a, A. P. Robinson b, D. S. Culvenor a
a CSIRO Division of Land and Water and Sustainable Agriculture Flagship, Clayton South, Victoria, Australia
b Department of Mathematics and Statistics, University of Melbourne, Parkville, Victoria, Australia
KEYWORDS: Forest, Classification, Hyperspectral, Ensemble, Decision Tree, Random Forests
ABSTRACT:
The classification of vegetation in hyperspectral image scenes presents some challenges due to high band autocorrelations and
problems dealing with many predictor variables. The Random Forests classification method is based on an ensemble of decision
trees and attempts to address these issues by dealing with only a subset of image bands in each node of each decision tree. Random
Forests has previously been used for classification of vegetation using hyperspectral data. However, the variable importance measure
that is a by-product of the technique has largely been ignored. In this study we investigate the spectral qualities of variable
importance in the classification of forest and non-forest in a single Hyperion scene. The spectral importance curve showed broad
bands of importance over wavelength regions known to be significant in biochemical absorption.
1. INTRODUCTION
Certain biological and statistical challenges can inhibit the
successful use of hyperspectral data for mapping forest extent.
Absorption by plant materials in vivo generally occurs in broad
wavelength bands, leading to auto-correlation in vegetation
reflectance spectra. In addition, many statistical modelling
methods have a tendency to over-fit to noise in cases with many
predictor variables (Bajcsy and Groves, 2004). Consequently,
classification accuracy may be highest when only a small
subset of predictor variables is used (Hughes, 1968).
The ensemble decision tree approach described as Random
Forests (Breiman, 2001) is suited to addressing these challenges
and has been shown to be superior to linear, quadratic and
penalised discriminant analysis when using hyperspectral
satellite data (Everingham et al., 2007; Sluiter and Pebesma,
2010). Random Forests models also generate a measure of
variable importance. High variable importance has been used
for selecting narrow bands (Chan and Paelinckx, 2008) and
spectral indices (Ismail and Mutanga, 2010) for inclusion in
refined classification models. However, the spectral
characteristics of variable importance have not been fully
explored.
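One common way to obtain an importance measure of this kind is permutation importance: the values of one band are shuffled across pixels and the resulting drop in classification accuracy is recorded. The sketch below illustrates the idea in plain Python. The `predict` function and the two-band toy pixels are made up for illustration, and the text does not state which importance measure was used in this study. Random Forests evaluates the permutation on the out-of-bag samples of each tree, which this simplified version does not reproduce.

```python
import random

def accuracy(predict, pixels, labels):
    """Fraction of pixels whose predicted label matches the reference label."""
    hits = sum(1 for p, y in zip(pixels, labels) if predict(p) == y)
    return hits / len(labels)

def permutation_importance(predict, pixels, labels, n_bands, seed=0):
    """Accuracy drop after shuffling each band in turn (larger = more important)."""
    rng = random.Random(seed)
    baseline = accuracy(predict, pixels, labels)
    importance = []
    for b in range(n_bands):
        # Shuffle the values of band b across pixels, leaving other bands intact.
        column = [p[b] for p in pixels]
        rng.shuffle(column)
        shuffled = [p[:b] + [v] + p[b + 1:] for p, v in zip(pixels, column)]
        importance.append(baseline - accuracy(predict, shuffled, labels))
    return importance

# Toy data (made up): band 0 separates the two classes, band 1 is pure noise.
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
pixels = [[0.8 * y + 0.1 * rng.random(), rng.random()] for y in labels]

def predict(p):
    # Hypothetical classifier that thresholds band 0 and ignores band 1.
    return 1 if p[0] > 0.45 else 0

imp = permutation_importance(predict, pixels, labels, n_bands=2)
```

In this toy case the informative band receives a large importance and the noise band receives none, because the classifier never looks at it.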
We consider variable importance for a classification of forests
and non-forests based on a Hyperion image over a high-value
forest site in Tasmania. Spectral characteristics of the
importance curve are compared to known absorption and
reflectance characteristics of leaf biochemicals.
2. METHODS
The Hyperion scene used in this study was captured in
March 2010 over the Warra Long Term Ecological Research
(LTER) site in southern Tasmania (Brown et al., 2001). The
image was 88 km in the along-track direction and included
mainly forested land in the south, while grassland and pasture
dominated in the north. Pre-processing was performed using the
methods described by Datt et al. (2003), and the image was then
registered to an orthocorrected mosaic of Landsat Thematic Mapper images
produced as part of the Australian National Carbon Accounting
System (Furby, 2002).
A Tasmanian Government state-wide vegetation map was used
for training and validation of the classification models. The map
is based on aerial photo interpretation and field validation, and
includes 154 classes as described by Harris and Kitchener
(2005). These classes were aggregated into generic forest and
non-forest classes and a raster map created on the same grid as
the Hyperion image.
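The aggregation step can be sketched as a simple lookup from vegetation class code to a binary label. The class codes and the small grid below are hypothetical placeholders, not the codes used in the Tasmanian map, and the grid is assumed to be already resampled to the Hyperion pixel grid.

```python
# Hypothetical class codes; the real map has 154 classes (Harris and Kitchener, 2005).
FOREST_CODES = {"F1", "F2", "F3"}

def aggregate_to_binary(class_grid, forest_codes=FOREST_CODES):
    """Return a grid of 1 (forest) / 0 (non-forest) with the same shape as class_grid."""
    return [[1 if code in forest_codes else 0 for code in row] for row in class_grid]

# Made-up 2 x 3 vegetation raster, assumed to share the Hyperion grid.
vegetation = [["F1", "G1", "F2"],
              ["A1", "F3", "G2"]]
binary = aggregate_to_binary(vegetation)
```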
First, we applied the implementation of Random Forests by
Liaw and Wiener (2002) to discriminate forest from non-forest
classes in the Hyperion image. For each class, 10000 pixels
were selected at random as the training set. In each model run,
1000 decision trees were generated. Classification accuracy was
assessed across the entire Hyperion scene. The wavelength
regions that best discriminate forest from non-forest classes
were inferred from the variable importance spectrum. These
wavelengths were then compared to published biochemical
absorption features to examine which parameters of forest
biochemistry may be contributing to the spectral separation of
forested from non-forested areas.
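The class-balanced selection of training pixels described above might look like the following sketch. It assumes sampling without replacement within each class, which the text does not specify, and uses a made-up label grid. The function name and arguments are illustrative and are not part of the Liaw and Wiener (2002) implementation.

```python
import random

def sample_training_pixels(label_grid, n_per_class, seed=0):
    """Draw n_per_class pixel coordinates at random, without replacement, per class."""
    rng = random.Random(seed)
    by_class = {}
    for r, row in enumerate(label_grid):
        for c, label in enumerate(row):
            by_class.setdefault(label, []).append((r, c))
    return {label: rng.sample(coords, n_per_class)
            for label, coords in by_class.items()}

# Toy 4 x 5 label grid (made up); the study drew 10000 pixels per class.
labels = [[1, 1, 1, 0, 0],
          [1, 1, 0, 0, 0],
          [1, 1, 1, 1, 0],
          [1, 0, 0, 0, 0]]
training = sample_training_pixels(labels, n_per_class=3)
```

Drawing the same number of pixels from each class keeps the training set balanced even though forest and non-forest cover unequal areas of the scene.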
3. RESULTS
The classifications of the Hyperion image were assessed in
terms of overall accuracy and the Kappa statistic (Cohen, 1960).
These are summarised in Table 1. Training accuracy was
comparable to other published results. Interestingly, when the
model was applied to all pixels in the Hyperion scene, the
overall accuracy was maintained and the kappa statistic
increased slightly. The increase is small, but it indicates that
the model remains stable when applied beyond the data
on which it was built.
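For reference, both statistics can be computed from a confusion matrix as in the sketch below. The counts are made up for illustration and are not the values reported in Table 1.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of pixels of reference class i assigned to class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: proportion of pixels on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_tot = [sum(confusion[i]) for i in range(k)]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the row and column totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Made-up counts: rows are reference classes, columns are mapped classes.
confusion = [[45, 5],
             [10, 40]]
oa, kappa = overall_accuracy_and_kappa(confusion)
```

Because the chance agreement term depends on the class totals, kappa can differ between a class-balanced training sample and the full scene even when overall accuracy is unchanged.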
The significance of Hyperion spectral bands in discriminating
the forest and non-forest classes was assessed using the
measure of variable importance produced by the Random
Forests method. The plot of variable importance as a function of
wavelength showed strong auto-correlation, with dominant
peaks in significant biochemical absorption regions.
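One way to quantify auto-correlation in an importance curve is the lag-1 autocorrelation between adjacent bands, sketched below. The importance values are made up, and the text does not state how auto-correlation was assessed in this study, so this is an illustrative assumption rather than the method used.

```python
def autocorrelation(values, lag=1):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return cov / var

# Made-up importance curve with a single broad peak across adjacent bands.
importance = [0.1, 0.2, 0.5, 0.9, 1.0, 0.8, 0.4, 0.2, 0.1, 0.1]
r1 = autocorrelation(importance)
```

A value near 1 indicates that neighbouring bands have similar importance, as expected for broad peaks, whereas a curve that alternates from band to band gives a negative value.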