The algorithm used for feature weighting was ReliefF (Liu and Motoda, 2008); the features with the highest weights are Ah, Ap and NDVI, whereas GBR and NIR have low weights. The quality of the selection is also demonstrated by the results obtained when varying the set of features in the classification phase. The weights obtained by the ReliefF algorithm are shown in Fig. 4.
Figure 4: Results of the ReliefF algorithm applied to the set of seven features (weighting of features for selection); the n parameter represents the number of nearest instances from each class.
Analyzing the weight of each feature, it is evident that the LiDAR features Ap and Ah and the NDVI have the highest values; purely radiometric features do not allow the data to be classified correctly because of their lack of spectral separability.
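As an illustration of this weighting step, the following numpy sketch implements a simplified ReliefF: each sampled instance pulls the weights down by its feature differences to the nearest hits (same class) and up by the prior-weighted differences to the nearest misses (other classes). The feature matrix X (one column per feature, e.g. Ah, Ap, NDVI, GBR, NIR), the label vector y and the parameter k (playing the role of the n nearest instances of Fig. 4) are assumptions; this is a sketch, not the exact implementation referenced in (Liu and Motoda, 2008).

import numpy as np

def relieff_weights(X, y, k=10, seed=0):
    """Simplified ReliefF: one weight per feature column of X."""
    n, d = X.shape
    span = np.ptp(X, axis=0).astype(float)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span            # scale features to [0, 1]
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))
    w = np.zeros(d)
    for i in np.random.default_rng(seed).permutation(n):
        xi, ci = Xs[i], y[i]
        for c in classes:
            idx = np.where(y == c)[0]
            idx = idx[idx != i]
            dist = np.abs(Xs[idx] - xi).sum(axis=1)      # Manhattan distance
            near = idx[np.argsort(dist)[:k]]             # k nearest instances of class c
            diff = np.abs(Xs[near] - xi).mean(axis=0)
            if c == ci:
                w -= diff / n                            # nearest hits lower the weight
            else:
                w += prior[c] / (1.0 - prior[ci]) * diff / n   # nearest misses raise it
    return w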
3.2 Thresholding Normalized DSM
Thresholding the normalized DSM (nDSM) is a simple technique that allows LiDAR data to be classified; only a few object classes can be extracted, mainly buildings. The problems that afflict this approach are the ambiguity of high-density canopies and the impossibility of distinguishing between bare land and grass. The nDSM is defined as the subtraction of the DTM from the DSM of the same scene; a normalized DSM therefore contains the objects on a plane of height zero. Assuming that buildings in the scene have a known range of heights, and that the heights of all other objects fall outside this range, buildings can be detected by applying appropriate height thresholds to the nDSM.
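The detection rule can be summarised in a few lines. The sketch below assumes that the DSM and DTM are co-registered height rasters loaded as numpy arrays; the building height bounds (2.5 m and 30 m) are illustrative values only, not taken from this work.

import numpy as np

def building_mask(dsm, dtm, h_min=2.5, h_max=30.0):
    """Boolean raster that is True where the nDSM height falls
    inside the assumed building height range."""
    ndsm = dsm - dtm                       # objects now sit on a zero-height plane
    return (ndsm >= h_min) & (ndsm <= h_max)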
3.3 AdaBoost
AdaBoost (short for "adaptive boosting") is presently the most popular boosting algorithm. The key idea of boosting is to create an accurate strong classifier by combining a set of weak classifiers. A weak classifier is only required to be better than chance, and thus can be very simple and computationally inexpensive. Different variants of boosting, e.g. Discrete AdaBoost, Real AdaBoost (used in this paper), and Gentle AdaBoost (Schapire and Singer, 1999), are identical in terms of computational complexity, but differ in their learning algorithm. The Real AdaBoost algorithm works as follows: each labelled training pattern x receives a weight that determines its probability of being selected for the training set of an individual component classifier. Starting from an initial (usually uniform) distribution $D_t$ of these weights, the algorithm repeatedly selects the weak classifier $h_t(x)$ that returns the minimum error according to a given error function. If a training pattern is accurately classified, then its chance of being used again in a subsequent component classifier is reduced; conversely, if the pattern is not accurately classified, then its chance of being used again is raised. In this way, the idea of the algorithm is to modify the distribution $D_t$ by increasing the weights of the most difficult training examples in each iteration. The selected weak classifier is expected to have a small classification error on the training data. The final strong classifier H is a weighted majority vote of the best T (number of iterations) weak classifiers $h_t(x)$:

$$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$$
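The boosting loop can be sketched as follows. For brevity the sketch uses the Discrete AdaBoost weight update, with weak classifiers returning labels in {-1, +1}, rather than the Real AdaBoost variant actually employed in this paper, and a decision stump stands in for the CART weak classifier described below; training data X and y (with y coded as -1/+1 numpy arrays) are assumed.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=35):
    n = len(y)
    D = np.full(n, 1.0 / n)                  # initial uniform weight distribution
    ensemble = []                            # list of (alpha_t, h_t)
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1)      # weak classifier (decision stump)
        h.fit(X, y, sample_weight=D)
        pred = h.predict(X)
        err = D[pred != y].sum()                     # weighted training error
        if err >= 0.5:                               # no better than chance: stop
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        D *= np.exp(-alpha * y * pred)               # misclassified patterns gain weight
        D /= D.sum()
        ensemble.append((alpha, h))
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted majority vote H(x) = sign(sum_t alpha_t h_t(x))."""
    votes = sum(alpha * h.predict(X) for alpha, h in ensemble)
    return np.sign(votes)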
It is important to notice that the complexity of the strong classifier depends only on the weak classifiers. The AdaBoost algorithm has been designed for binary classification problems. To deal with non-binary results we used a sequence of binary classifiers, where each element of the sequence determines whether an example belongs to one specific class. If the binary classifier returns a positive result, the example is assumed to be correctly classified; otherwise, it is recursively passed to the next element in the sequence; this technique is known as "one against all". As weak classifier, a Classification And Regression Tree (CART) with three splits and T = 35 was used in this paper.
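A possible realisation of this one-against-all cascade is sketched below, using scikit-learn's AdaBoostClassifier as a stand-in for the Real AdaBoost implementation of the paper (scikit-learn uses the SAMME family of updates, so this is only an approximation). The class order of the cascade, the fallback label for samples no classifier claims, and the feature matrix X with reference labels y are assumptions.

import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

CLASSES = ["building", "tree", "land", "grass"]    # assumed order of the cascade

def train_cascade(X, y, T=35):
    """One binary booster per class (class vs. all the others)."""
    cascade = []
    for cls in CLASSES:
        weak = DecisionTreeClassifier(max_leaf_nodes=4)    # CART with three splits
        clf = AdaBoostClassifier(weak, n_estimators=T)
        clf.fit(X, (y == cls).astype(int))
        cascade.append((cls, clf))
    return cascade

def classify(cascade, X):
    """Walk the cascade: the first booster that answers 'yes' wins;
    undecided samples fall back to the last class (an assumption)."""
    labels = np.full(X.shape[0], CLASSES[-1], dtype=object)
    undecided = np.ones(X.shape[0], dtype=bool)
    for cls, clf in cascade:
        hit = undecided & (clf.predict(X) == 1)
        labels[hit] = cls
        undecided &= ~hit
    return labels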
The CART method was proposed by (Breiman et al., 1984). CART produces binary decision trees, distinguished by two branches for each decision node, and recursively partitions the training data set into subsets with similar values for the target features. The CART algorithm grows the tree by conducting, for each decision node, an exhaustive search of all available features and all possible splitting values; the optimal split is determined by applying a well-defined criterion such as the Gini index or other criteria (Duda et al., 2000).
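The Gini criterion mentioned above can be illustrated in a few lines: for a candidate binary split, CART keeps the split whose weighted Gini impurity is lowest. The label arrays in the usage example are hypothetical.

import numpy as np

def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a set of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_impurity(left, right):
    """Weighted Gini impurity of a candidate binary split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# e.g. a split that isolates mostly "building" samples scores lower (better)
left = np.array(["building"] * 8 + ["tree"])
right = np.array(["tree"] * 6 + ["grass"] * 5)
print(split_impurity(left, right))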
3.4 Classification Results
In order to extract objects of interest from the previous described
dataset, all the data were classified. In Fig. 5, the best result
(in terms of detection rate) of classification using AdaBoost is
shown. Moreover to evaluate correctly the quality of classifica
tion, a ground truth for buildings was manually created (see Fig.
6); the ground truth for the remaining classes actually is not avail
able but it is planned to cover all the area to analyse exactly the
classifier performance.
Figure 5: Results of classification using AdaBoost and the training set of 32 polygons; red stands for building, yellow for tree, blue for land and green for grass.
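As a hedged illustration of the evaluation, a per-class detection rate against the manually digitized building ground truth can be computed as below; both label rasters are assumed to share the same grid, and the exact metric definition used for Table 1 may differ.

import numpy as np

def detection_rate(predicted, truth, cls="building"):
    """Fraction of ground-truth pixels of class `cls` that the
    classifier labelled correctly (per-class completeness)."""
    relevant = (truth == cls)
    detected = relevant & (predicted == cls)
    return detected.sum() / relevant.sum()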
In Table 1 the results for building extraction with different sets
of features are highlighted; according to the weighting algorithm,