In: Wagner W., Szekely, B. (eds.): ISPRS TC VII Symposium - 100 Years ISPRS, Vienna, Austria, July 5-7, 2010, IAPRS, Vol. XXXVIII, Part 7B
with similar shapes and spectral properties. Finally, the means
and standard deviations of the geometric and spectral variables
were calculated for each segment.
3.3 Tree cover
The extraction of the area covered by trees is required for the
area-wide mapping of the classified tree species. Tree cover and
non-tree area masks were generated as described in detail in
Waser et al. (2008). In brief: First, digital canopy
height models (CHM) were produced by subtracting the LiDAR
DTM from the three DSMs. In a second step, pixels with CHM
values > 3 m were used to extract potential tree areas according
to the definition in the Swiss NFI (Brassel and Lischke, 2001).
In a third step, non-tree objects (e.g. buildings, rocks, and
artifacts) were removed using spectral information from the
ADS40-SH40 and ADS40-SH52 RGB images (low IHS pixel
values) as well as curvature information about the image
segments (e.g. segments on buildings have lower curvature
values and ranges than trees or large shrubs). These three steps
resulted in three canopy covers providing the sunlit tree area for
each study area.
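The first two steps above (CHM computation and height thresholding) can be sketched as follows. This is only a minimal illustration on small nested lists; the function names and the elevation values are hypothetical, and the third step (removal of buildings and other non-tree objects using spectral and curvature information) is not covered.

```python
def canopy_height_model(dsm, dtm):
    """Subtract the terrain model (DTM) from the surface model (DSM), cell by cell."""
    return [[s - t for s, t in zip(dsm_row, dtm_row)]
            for dsm_row, dtm_row in zip(dsm, dtm)]


def potential_tree_mask(chm, min_height=3.0):
    """Flag cells whose canopy height exceeds the threshold (3 m in the text)."""
    return [[h > min_height for h in row] for row in chm]


# Hypothetical 2 x 3 elevation grids in metres (illustrative values only).
dsm = [[512.0, 530.5, 515.2],
       [510.1, 528.0, 511.0]]
dtm = [[510.0, 511.0, 511.5],
       [509.8, 510.5, 510.9]]

chm = canopy_height_model(dsm, dtm)
mask = potential_tree_mask(chm)
```

With real data the same arithmetic would be applied per DSM, giving one mask for each of the three surface models.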
3.4 Classification of tree species
3.4.1 Evaluation of modelling procedures: Image segments
representing single trees were to be assigned to classes (species)
by predictive modelling. The classes were given by a field
sample from the 7 dominant tree species of the study area as
described in section 2.2. As the response variable has more than
two possible states, a multinomial model had to be applied. The
logistic regression model is a special case of the generalized
linear model (GLM) and is described in, e.g., McCullagh and
Nelder (1983). The multinomial model was implemented as a
combination of logistic models: a binomial logistic regression
model was fitted to each class (species) separately, and each
segment was assigned to the species with the highest probability. For details
on the logistic regression function with quadratic terms see e.g.
Hosmer and Lemeshow (2000). The explanatory variables as
given in section 3.1 were used.
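The combination of binomial models described above (one model per class, assignment to the class with the highest probability) can be sketched as below. This is an illustrative sketch under stated assumptions, not the procedure actually used in the study: the models are fitted by plain batch gradient descent, only linear terms are used (quadratic terms could be appended as extra feature columns), and the data, class labels, and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probability(w, x):
    """Probability from one binomial model; w = [intercept, w1, ..., wp]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_binary_logistic(X, y, lr=0.5, epochs=2000):
    """Fit one binomial logistic model by batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = probability(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def fit_one_vs_rest(X, labels):
    """Fit one binomial model per class: class k versus all other classes."""
    return {k: fit_binary_logistic(X, [1.0 if l == k else 0.0 for l in labels])
            for k in sorted(set(labels))}


def predict(models, x):
    """Assign x to the class whose model yields the highest probability."""
    probs = {k: probability(w, x) for k, w in models.items()}
    return max(probs, key=probs.get)


# Hypothetical segments with two explanatory variables and three classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
     [2.0, 0.1], [1.8, 0.0], [2.1, 0.2],
     [1.0, 2.0], [0.9, 1.8], [1.1, 2.1]]
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

models = fit_one_vs_rest(X, labels)
predicted = [predict(models, x) for x in X]
```

Because each binomial model is fitted independently, the per-class probabilities need not sum to one; only their ranking is used for the assignment.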
In a first run, a single classification was performed using each
set of variables separately. Then the explanatory variables from
both the 2007 May and July images were tested together within
a logistic regression model since the same flight path was used
and the shadows were quite similar. Due to large differences in
the flight paths and shadows between 2007 and 2008, the 2008
data had to be used in a separate logistic regression
model (see Fig. 2). In total, tree species were classified four
times using different logistic regression models and input
imagery (see also Table 3).
Figure 2. Example of the same group of trees as imaged from
different flight paths in August 2008 (left) and July 2007
(right).
3.4.2 Validation: In order to validate the predictions of tree
species, the digitized reference tree data (see section 2.2) had to
be assigned to the corresponding image segments. Since the
delineations of the field samples were not always congruent
with the automatically generated image segments, each of the
230 digitized reference trees was assigned to an image segment
using the following rules: If one segment contained more than
one digitized field sample, the segment was assigned to the field
sample covering the greater part of the segment. If less than
10% of the image segment was covered by the sample polygon,
the segment was not assigned at all. The predictive power of the
models was verified by a 5-fold cross-validation. The statistical
measures used to validate the results were producer’s accuracy (PA),
user’s accuracy (UA), correct classification rate (CCR), and
kappa coefficient (K).
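The four measures can be derived from a confusion matrix as sketched below, assuming the standard definitions and the layout used in this paper's confusion matrices (rows are field reference classes, columns are classified classes). The function name and the 3 x 3 example matrix are hypothetical and are not results from this study.

```python
def accuracy_measures(matrix):
    """Return (PA, UA, CCR, kappa) for a square confusion matrix.

    matrix[i][j] = number of reference samples of class i classified as class j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diag = [matrix[i][i] for i in range(k)]

    # Producer's accuracy: correct samples / reference samples per class (row).
    pa = [d / r if r else float("nan") for d, r in zip(diag, row_sums)]
    # User's accuracy: correct samples / classified samples per class (column).
    ua = [d / c if c else float("nan") for d, c in zip(diag, col_sums)]
    # Correct classification rate: share of all samples on the diagonal.
    ccr = sum(diag) / total
    # Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (ccr - expected) / (1 - expected)
    return pa, ua, ccr, kappa


# Hypothetical confusion matrix for three classes (illustrative values only).
example = [[8, 2, 0],
           [1, 7, 2],
           [1, 1, 8]]
pa, ua, ccr, kappa = accuracy_measures(example)
```

In a 5-fold cross-validation, the matrix would typically be accumulated from the predictions on the held-out folds before these measures are computed.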
4. RESULTS
4.1 Confusion matrices
The classification of the seven tree species was achieved
semi-automatically. Depending on the image data used, CCR
ranged from 0.668 to 0.798 and K from 0.598 to 0.691. The
overall accuracies for tree species classification obtained with
the different input imagery are summarized in Table 3. The
confusion matrices of the May, July and August classifications
with best CCR and K are
summarized in Tables 4-6. The classified main tree species
are: Abies alba (Aa), Picea abies (Pa), Pinus sylvestris (Ps),
Larix decidua (La), Acer sp. (Ac), Fagus sylvatica (Fs), and
Fraxinus excelsior (Fe).
Input data sets            CCR     K
05-2007                    0.798   0.691
07-2007                    0.668   0.598
05 and 07-2007 combined    0.691   0.632
08-2008                    0.757   0.667
Table 3. Overall accuracies for four different tree species
classifications.
Table 4 shows that five of seven tree species are classified with
producer’s accuracies > 73% when using the May 2007 images. Best
agreements are obtained for Picea abies (92%) and Fagus
sylvatica (86%). The most frequent failures occur in
classifying the non-dominant tree species Acer sp. (56%) and
Larix decidua (43%), which are often misclassified as either
Fagus sylvatica or Picea abies.
May 2007                 Classified as
Field    Aa    Pa    Ps    La    Ac    Fs    Fe    PA
Aa       29     2     -     -     -     -     4    0.83
Pa        -    77     -     2     2     2     1    0.92
Ps        -     ~    14     -     -     -     -    0.76
La        -    12     1    10     -     -     -    0.43
Ac        1     2     -     -    19     9     3    0.56
Fs        1     3     -     -     3    55     2    0.86
Fe        3     4     -     -     2     2    54    0.83
UA     0.85  0.76  0.93  0.67  0.73  0.81  0.84
Table 4. Confusion matrix for tree species classification using
the explanatory variables from May 2007 ADS40-SH40