In: Wagner W., Szekely, B. (eds.): ISPRS TC VII Symposium - 100 Years ISPRS, Vienna, Austria, July 5-7, 2010, IAPRS, Vol. XXXVIII, Part 7B
no longer vivid) covers the ground so densely that its spectral
signature is close to bare soil. The period when the crop looks like
grass (shortly after germination) has to be avoided by excluding
images acquired during this time.
object was classified as ‘grassland’. Otherwise it will be
rejected and classified as an error in the data base. The
classification and the verification of the test objects are carried
out independently from each other.
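
The acceptance rule of the verification step can be sketched as follows. This is a minimal illustration, assuming the GIS data base only distinguishes 'cropland' and 'grassland' while the classifier returns one of three class labels; the function name, the constant and the label strings are hypothetical and not taken from the original implementation.

```python
# Classifier outputs that both count as cropland for verification purposes.
# The label strings are illustrative assumptions.
CROPLAND_CLASSES = {"tilled cropland", "untilled cropland"}


def verify_gis_object(gis_class, predicted_class):
    """Return True if the GIS object is accepted, False if it is rejected."""
    if gis_class == "cropland":
        # The tilled/untilled distinction is irrelevant for verification.
        return predicted_class in CROPLAND_CLASSES
    if gis_class == "grassland":
        return predicted_class == "grassland"
    raise ValueError(f"unexpected GIS class: {gis_class!r}")
```

A rejected object (return value False) corresponds to a suspected error in the data base.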
3.3 SVM Classification and Verification of GIS Objects
4. EVALUATION
The SVM classifier is a supervised learning method used for
classification and regression. Given a set of training examples,
each marked as belonging to one of two classes, SVM training
builds a model that predicts whether a new example falls into
one class or the other. The two classes are separated by a
hyperplane in feature space so that the distance of the nearest
training sample from the hyperplane is maximised; hence, SVMs
belong to the class of max-margin classifiers (Vapnik, 1998).
Since most classes are not linearly separable in feature space, a
feature space mapping is applied: the original feature space is
mapped into another space of higher dimension so that in the
transformed feature space, the classes become linearly
separable. Both training and classification basically require the
computation of inner products of the form $\Phi(f_i)^T \cdot \Phi(f_j)$, where
$f_i$ and $f_j$ are feature vectors of two samples in the original feature
space and $\Phi(f_i)$ and $\Phi(f_j)$ are the transformed features. These
inner products can be replaced by a Kernel function $K(f_i, f_j)$,
which means that the actual feature space mapping $\Phi$ is never
explicitly applied (Kernel Trick). In our application we use the
Gaussian Kernel $K(f_i, f_j) = \exp(-\gamma \|f_i - f_j\|^2)$, which implies that
the transformed feature space has an infinite dimension. The
concept of SVM has been expanded to allow for outliers in the
training data to avoid overfitting. This requires a parameter $\nu$
that corresponds to the fraction of training points considered to
be outliers. Furthermore, a classical SVM can only separate two
classes, and SVMs do not scale well to multi-class problems.
The most common way to tackle this problem is the
one-versus-the-rest strategy, where for each class a two-class SVM
separating the training samples of this class from all other
training samples is trained, and a test sample is assigned to the
class achieving the highest vote from all these two-class
classifiers (Vapnik, 1998).
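
To make the kernel trick concrete, the following minimal sketch evaluates the Gaussian kernel directly on the original feature vectors, so that the mapping into the transformed feature space is never computed. The function names and the plain-list representation of feature vectors are assumptions made for illustration only.

```python
import math


def gaussian_kernel(f_i, f_j, gamma):
    """Gaussian kernel K(f_i, f_j) = exp(-gamma * ||f_i - f_j||^2)."""
    if len(f_i) != len(f_j):
        raise ValueError("feature vectors must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, gamma):
    """Matrix of pairwise kernel values for a list of feature vectors."""
    return [[gaussian_kernel(a, b, gamma) for b in samples] for a in samples]
```

The kernel value is 1 for identical vectors and decays towards 0 with increasing distance; a larger gamma makes this decay faster.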
For the classification process in our approach, the SVM
algorithm needs to learn the properties of the classes to be
classified, namely the classes ‘grassland’, ‘tilled cropland’ and
‘untilled cropland’. The training is done using a set of objects
with known class labels. The class labels are assigned to the
training objects interactively by a human operator. In a first step
a feature vector consisting of the spectral (6), textural (4) and
structural (5) features defined in Section 3.2 is determined from
the image data for all the training objects. Hence, the overall
dimension of the feature vectors is 12. Each feature is
normalised so that its value is between 0 and 1 for all training
objects. Then, the feature vectors of all segments are used to
train the three SVM classifiers required for the one-versus-the-rest
strategy.
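
The feature normalisation and the preparation of the training targets for the one-versus-the-rest strategy can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: min-max scaling is assumed as the normalisation, feature vectors are plain lists, and all function names are hypothetical.

```python
def fit_min_max(train_vectors):
    """Determine per-feature minima and maxima from the training objects."""
    dims = range(len(train_vectors[0]))
    mins = [min(v[d] for v in train_vectors) for d in dims]
    maxs = [max(v[d] for v in train_vectors) for d in dims]
    return mins, maxs


def apply_min_max(vector, mins, maxs):
    """Scale one feature vector with parameters determined in training."""
    scaled = []
    for x, lo, hi in zip(vector, mins, maxs):
        span = hi - lo
        # A constant feature carries no information; map it to 0.
        scaled.append((x - lo) / span if span > 0 else 0.0)
    return scaled


def one_vs_rest_targets(labels, classes):
    """For each class, build +1/-1 targets: this class versus all others."""
    return {c: [1 if label == c else -1 for label in labels] for c in classes}
```

Because the scaling parameters come from the training objects only, features of other objects scaled with the same parameters may fall slightly outside the interval from 0 to 1.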
In the classification itself, the feature vector is determined for
each test object, and it is normalised using the normalisation
parameters determined in training. The object is classified using
the previously trained SVM classifiers into one of the classes
‘tilled cropland’, ‘untilled cropland’ or ‘grassland’. However,
for the process of GIS verification, the separation between tilled
and untilled cropland is meaningless. Hence, for the verification
process, a cropland GIS object will be accepted (and classified
as ‘correct’) if the object is classified as ‘tilled cropland’ or
‘untilled cropland’. Otherwise it is classified as an error and
thus rejected. A grassland object is verified as correct if the
In this section, we present the evaluation of our approach using
a pan-sharpened IKONOS scene in the area of Halberstadt,
Germany, acquired on June 18, 2005 and having a ground
resolution of 1 m. The reference dataset is based on ATKIS.
However, according to the ATKIS specifications, any cropland
or grassland object may actually contain areas corresponding to
another class as long as certain area limitations are met (AdV,
2010). In this work, we assume each GIS object to correspond
to exactly one of the classes. Furthermore, both for training and
for the evaluation we have to distinguish untilled cropland from
tilled cropland, information that is not contained in ATKIS. The
original ATKIS database was thus modified for our tests: each
ATKIS cropland or grassland object consisting of units
corresponding to different classes was split manually into
individual objects corresponding to a single class. All the
cropland objects in the resulting GIS data set were classified
manually into tilled vs. untilled cropland according to a visual
inspection of the images. Finally, GIS objects smaller than
5000 m² were discarded because we cannot assume the
structural approach to work with such small objects. Of the
remaining GIS objects, less than 50% were used for training,
whereas the other objects were used for the evaluation of our
method. As the original data base did not contain any errors,
we changed the class label of about 10% of the test objects that
were chosen randomly. Figure 4 shows the test scene with
super-imposed GIS objects. The numbers of objects used for
training and evaluation as well as the number of errors added
for testing the verification approach are summarised in
Table 1.
| class | training | test / errors |
|---|---|---|
| ‘grassland’ | 32 | 89 / 8 |
| ‘tilled cropland’ | 165 | 223 / 23 |
| ‘untilled cropland’ | 11 | 21 / 2 |
| Σ | 208 | 333 / 33 |
Table 1. Objects used in the training and test datasets.
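
The random introduction of errors into the test data can be sketched as follows. The text does not state how the replacement label was chosen; this sketch assumes that the data base label is simply swapped between 'cropland' and 'grassland', and the function name, the seed and the label strings are hypothetical.

```python
import random


def inject_label_errors(labels, fraction=0.1, seed=0):
    """Swap the class label of a randomly chosen fraction of the objects.

    Returns the corrupted label list and the sorted indices of the
    objects whose label was changed.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible selection
    n_errors = round(fraction * len(labels))
    error_indices = set(rng.sample(range(len(labels)), n_errors))
    # Assumption: an erroneous label is the respective other GIS class.
    swapped = {"cropland": "grassland", "grassland": "cropland"}
    corrupted = [
        swapped[label] if i in error_indices else label
        for i, label in enumerate(labels)
    ]
    return corrupted, sorted(error_indices)
```

The returned indices serve as the reference against which the decisions of the verification step can be compared.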
In the training phase we fixed the maximum training error $\nu$ to
$\nu$ = 0.1%. The parameter $\gamma$ of the Gaussian Kernel was fixed at
$\gamma$ = 0.01. The training results were used to classify the test
objects. In order to evaluate the classification process, the
results of classification were compared to the reference. Table 2
shows the confusion matrix of the classification results, whereas
the completeness and the correctness of these classes are
presented in
Table 3.
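| reference \ algorithm | ‘tilled cropland’ | ‘untilled cropland’ | ‘grassland’ | Σ |
|---|---|---|---|---|
| ‘tilled cropland’ | 176 | 0 | 47 | 223 |
| ‘untilled cropland’ | 1 | 12 | 8 | 21 |
| ‘grassland’ | 3 | 0 | 86 | 89 |
| Σ | 180 | 12 | 141 | 333 |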
Table 2. Confusion matrix of the test objects.
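
Per-class completeness and correctness can be derived from such a confusion matrix as sketched below. The sketch assumes the usual definitions, which are not spelled out in this section: completeness is the diagonal entry divided by the row sum (reference), and correctness is the diagonal entry divided by the column sum (algorithm). The variable and function names are hypothetical.

```python
# Rows: reference class; columns: class assigned by the algorithm.
# Order in both directions: tilled cropland, untilled cropland, grassland.
CLASSES = ["tilled cropland", "untilled cropland", "grassland"]
CONFUSION = [
    [176, 0, 47],
    [1, 12, 8],
    [3, 0, 86],
]


def completeness_and_correctness(matrix):
    """Return a list of (completeness, correctness) tuples, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        row_sum = sum(matrix[k])                        # reference objects
        col_sum = sum(matrix[r][k] for r in range(n))   # assigned objects
        completeness = matrix[k][k] / row_sum if row_sum else float("nan")
        correctness = matrix[k][k] / col_sum if col_sum else float("nan")
        results.append((completeness, correctness))
    return results


if __name__ == "__main__":
    for name, (comp, corr) in zip(CLASSES, completeness_and_correctness(CONFUSION)):
        print(f"{name}: completeness {comp:.3f}, correctness {corr:.3f}")
```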
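The confusion matrix in Table 2 shows that our approach separates
tilled cropland from untilled cropland well (only one object is
confused between these two classes), but the separation of both
cropland classes from grassland is very uncertain: 47 of the 223
tilled and 8 of the 21 untilled cropland objects were classified as
‘grassland’. Since tilled and untilled cropland can be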