In: Wagner W., Szekely, B. (eds.): ISPRS TC VII Symposium - 100 Years ISPRS, Vienna, Austria, July 5-7, 2010, IAPRS, Vol. XXXVIII, Part 7B
3. If the validation area only contains missing or no
data, discard the area and jump back to step 2 above.
4. If the new validation area partially overlaps an
existing validation area, then replace the overlap with
missing data in the new validation area.
5. Compute the fraction of the area within the Oppegård
map coverage (f_Oppegård), within the Lørenskog map
coverage (f_Lørenskog), outside map coverage (f_Oslo), and
with no or missing data (f_NoData). These four fractions
should sum to 1.
6. Add the map fractions to the counters, for example,
N_Oppegård(i+1) = N_Oppegård(i) + f_Oppegård(i+1), where i
and i+1 denote iterations i and i+1, respectively.
7. Continue, by jumping back to step 2 above, until all
three counters are above the predefined thresholds
M_Oppegård, M_Lørenskog and M_Oslo.
The Quickbird image size is (x_max, y_max) = (28090, 36602), and
the validation area size is (x_size, y_size) = (1000, 1000). The
validation thresholds are M_Oslo = M_Oppegård = M_Lørenskog = 2.
Initially, we intended to have M_Oslo much higher, but the
manual editing was so time-consuming that we ended up with
M_Oslo = 2.
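The sampling loop of steps 1-7 can be sketched as follows. This is a minimal illustration, not the authors' implementation: `map_fraction` is a hypothetical helper standing in for the actual per-municipality coverage computation, the overlap handling of step 4 is omitted, and the diacritic-free counter names are our own.

```python
import random

# Image and validation-area sizes as given in the text.
X_MAX, Y_MAX = 28090, 36602     # Quickbird image size
X_SIZE, Y_SIZE = 1000, 1000     # validation area size

def sample_validation_areas(map_fraction, threshold=2, seed=None):
    """Draw random validation areas until all three counters
    exceed the threshold (steps 2, 3, 5, 6 and 7)."""
    rng = random.Random(seed)
    counters = {"Oppegard": 0.0, "Lorenskog": 0.0, "Oslo": 0.0}
    areas = []
    while any(c < threshold for c in counters.values()):   # step 7
        # Step 2: draw the upper-left corner of a new area at random.
        x = rng.randrange(0, X_MAX - X_SIZE + 1)
        y = rng.randrange(0, Y_MAX - Y_SIZE + 1)
        # Step 5: coverage fractions for this area; they sum to 1.
        f = map_fraction(x, y, X_SIZE, Y_SIZE)
        if f.get("nodata", 0.0) >= 1.0:    # step 3: only missing data
            continue
        for name in counters:              # step 6: accumulate fractions
            counters[name] += f.get(name, 0.0)
        areas.append((x, y))
    return areas
```

With the thresholds set to 2, the loop stops as soon as the accumulated coverage fraction for each of the three regions reaches two full validation areas' worth.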
4.1.2 Validation of automatic classification
For each validation area, a copy is made and then edited, as
described below. The difference between the validation area
and the edited version is then used to compute a confusion
matrix, counting the number and type of misclassifications.
Although the editing is object-based (see below), the counts in
the confusion matrix are pixel-based.
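A pixel-based confusion matrix of this kind can be computed as below. This is a generic sketch, assuming the classified and manually edited validation areas are integer label images of equal shape with class codes 0..n_classes-1; the function name and encoding are our own, not from the paper.

```python
import numpy as np

def confusion_matrix(classified, edited, n_classes):
    """Count (classified, edited) label pairs over all pixels.
    Rows index the automatic classification, columns the edited
    (reference) version."""
    classified = np.asarray(classified).ravel()
    edited = np.asarray(edited).ravel()
    # Encode each pair as a single index, then count occurrences.
    idx = classified * n_classes + edited
    counts = np.bincount(idx, minlength=n_classes * n_classes)
    return counts.reshape(n_classes, n_classes)
```

The diagonal then holds the correctly classified pixels, and off-diagonal entries count each type of misclassification.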
For each validation area, the classified image is compared with
the original image and an aerial orthophoto with 0.5 m
resolution or 0.1 m resolution (Oppegård, Figure 4). All
obvious misclassifications are corrected. The editing is mainly
object-based, that is, individual pixels are not edited. The
classified image has quite rugged object boundaries, many of
which could have been cleaned up by using road and building
outlines as a guide in the segmentation process. Noting this, we
have, to some extent, avoided editing these rugged boundaries.
On some occasions, however, what should have been two or
more objects has by mistake been segmented into a single
object. In such cases, the object has been split and parts of it
reclassified in the editing process.
On some occasions, parts of water bodies have been mistaken
for grey areas, probably due to wind patterns. Since water bodies
can easily be removed by using GIS data, we have not counted
these as misclassifications, but regarded them as missing/no
data.
Although originally intended, a validation of the three
subclasses of green areas is not performed. Only a few
occasional substitutions of one green subclass for another
are made.
During the manual verification, the need for a gravel subclass
emerged. This class has been used in some instances to denote
grey areas that are not sealed, and thus may be recovered as
green areas. This is typically the case for construction sites:
when a new house is being built, the entire garden looks like a
grey area in the Quickbird image, but is planted shortly after.
In practice it is difficult to distinguish gravel from asphalt
and concrete, so the gravel class is only used in very obvious
cases. It is in practice a subclass of grey areas.
5. VALIDATION RESULTS
The manual validation procedure, as described in section 4, was
applied, resulting in six validation areas: two from Oppegård,
two from Lørenskog, and two from Oslo. The overall
classification performance is about 89% correct classification
rate (Table 2). This figure hides the fact that the object
boundaries from the segmentation step are far from ideal.
Further, in the manual validation procedure, almost no objects
from one of the three green structure classes were reclassified
as another green structure class. In this respect, it is more
meaningful to look at the two-class problem: green versus grey
areas. In this case, the recognition performance was slightly
better, about 91% (Table 3).
Table 2. Classification performance when using six classes.

    correct classification     89.13%
    misclassification          10.87%
    total                     100.00%
Table 3. Classification performance when using two classes.

    correct classification     91.38%
    misclassification           8.62%
    total                     100.00%
Table 4. Combined confusion matrix for all six verification areas, in
number of pixels. Rows are the automatic classification, columns the
edited version.

                                    Edited
    Classified     Grass   Forest  Little vegt. Grey area  Gravel  No data      Sum
    Grass         535353        0        1         4479       110       1    539944
    Forest           931  2737263     4568       110921      2013    8650   2855696
    Little vegt.      59     3164   499868       135870      3387     432    642348
    Grey area       3029    65620   178704      1575587    126162    3256   1949102
    Gravel             0        0        0            0         0       0         0
    No data            0      558        0           13         0       1       571
    Sum edited    539372  2806605   683141      1826870    131672   12340   6000000
Table 5. Combined confusion matrix, in percentages of each edited class.

                                    Edited
    Classified      Grass   Forest  Little vegt. Grey area   Gravel  No data
    Grass          99.25%    0.00%    0.00%        0.25%      0.08%    0.01%
    Forest          0.17%   97.53%    0.67%        6.07%      1.53%   70.10%
    Little vegt.    0.01%    0.11%   73.17%        7.44%      2.57%    3.50%
    Grey area       0.56%    2.34%   26.16%       86.25%     95.82%   26.39%
    Gravel          0.00%    0.00%    0.00%        0.00%      0.00%    0.00%
    No data         0.00%    0.02%    0.00%        0.00%      0.00%    0.01%
    Sum edited    100.00%  100.00%  100.00%      100.00%    100.00%  100.00%
The most common misclassification is the confusion of little
vegetation and grey areas. This resulted in about 300,000 pixels
being reclassified (Table 4), which is about 5% of the 6,000,000
image pixels. Of the 683,141 pixels that were regarded as little
vegetation after the manual validation step, 178,704, or 26%,
were originally classified as grey area (Table 4 and Table 5).
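As a cross-check, the Table 5 percentages and the Table 2 overall rate can be recomputed directly from the Table 4 counts. The numpy sketch below assumes the counts exactly as printed, with rows and columns ordered Grass, Forest, Little vegt., Grey area, Gravel, No data.

```python
import numpy as np

# Table 4 pixel counts: rows = classified, columns = edited.
counts = np.array([
    [535353,       0,      1,    4479,    110,    1],   # Grass
    [   931, 2737263,   4568,  110921,   2013, 8650],   # Forest
    [    59,    3164, 499868,  135870,   3387,  432],   # Little vegt.
    [  3029,   65620, 178704, 1575587, 126162, 3256],   # Grey area
    [     0,       0,      0,       0,      0,    0],   # Gravel
    [     0,     558,      0,      13,      0,    1],   # No data
])

# Table 5: each column (edited class) divided by its column sum.
col_pct = 100.0 * counts / counts.sum(axis=0)

# Table 2 overall rate: diagonal (correct) pixels over all pixels.
overall = 100.0 * np.trace(counts) / counts.sum()
```

Here `col_pct[3, 2]` recovers the 26.16% of edited little-vegetation pixels originally classified as grey area, and `overall` recovers the 89.13% correct classification rate.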
6. DISCUSSION
The classification results show that the classification part of the
automatic algorithm is able to discriminate between green and grey
areas, with approximately 10% misclassification. This is clearly
a good starting point for improvements. However, the