International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B3. Istanbul 2004
                     | Accuracy estimates     | Accuracy for              | Number
                     | after training         | out-of-sample testing     | misclassified
Test   No. of kernel | Error Recall Precision | Accuracy Recall Precision | Buildings Non-buildings Total
number   evaluations | (<=%)  (<=%)     (<=%) |      (%)    (%)       (%) |
2-3a        13621468 |  14.7   87.2      86.7 |     82.9   81.5      90.8 |       579           258   837
2-7a         1326497 |  24.4   76.7      79.1 |     81.7   84.8      86.2 |       476           424   900
2-7b         8279892 |  25.3   81.5      75.4 |     87.4   88.6      91.5 |       358           258   616
3b           9431103 |  24.1   78.3      74.6 |     84.7   90.7      86.1 |       290           459   749
3-2a       275218851 |  33.9   66.1      66.1 |     84.8   89.1      87.3 |       341           406   747
3-4b        29534759 |  26.0   76.1      77.2 |     84.7   82.4      92.9 |       551           198   749
3-7a         1132296 |  24.1   77.6      79.0 |     82.0   86.9      85.2 |       410           474   884
3-7b         8575675 |  29.9   77.2      71.7 |     85.7   92.5      86.1 |       234           466   700
4-3a        14108050 |  14.7   87.1      86.7 |     83.0   81.5      90.8 |       578           258   836
4-7a         1326937 |  24.4   76.7      79.1 |     81.7   84.8      86.2 |       476           424   900
4-7b         7690877 |  25.3   81.5      75.4 |     87.4   88.6      91.5 |       358           258   616
6-1b        11131518 |  23.9   77.8      79.3 |     83.5   85.3      88.4 |       460           350   810
6-3b        18612448 |  21.5   80.0      81.3 |     87.4   89.0      91.0 |       345           274   619
6-4b      3002719641 |  28.4   74.1      75.0 |     87.1   85.9      93.3 |       442           193   635
6-7a        21478593 |  30.9   74.9      71.4 |     84.0   89.7      85.9 |       324           462   786
6-7b        36570891 |  25.5   78.5      70.3 |     84.2   90.3      85.7 |       305           471   776
6-8a       117950052 |  18.9   83.4      82.9 |     85.0   92.7      85.1 |       230           508   738
Table 1. Results of classification and testing using large training sample
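The accuracy, recall and precision columns of Table 1 can be related to the misclassification counts through the usual two-class definitions. The following Python sketch illustrates that arithmetic. It assumes that a misclassified building is a false negative and a misclassified non-building is a false positive, which the table does not state explicitly. The function name and the counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, recall, precision) in percent from confusion counts.

    tp: buildings recognised as buildings
    fp: non-buildings recognised as buildings (assumed 'Non-buildings' column)
    fn: buildings recognised as non-buildings (assumed 'Buildings' column)
    tn: non-buildings recognised as non-buildings
    """
    total = tp + fp + fn + tn
    accuracy = 100.0 * (tp + tn) / total
    recall = 100.0 * tp / (tp + fn)
    precision = 100.0 * tp / (tp + fp)
    return accuracy, recall, precision


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, rec, prec = classification_metrics(tp=90, fp=30, fn=10, tn=70)
print(f"accuracy={acc:.1f}% recall={rec:.1f}% precision={prec:.1f}%")
```

Under these definitions, a classifier that misses few buildings but accepts many non-buildings shows high recall and lower precision.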
A general set of optimal parameters for the pre-processing of
image data and the training of the SVM is difficult to
determine. While some general principles can probably be
established, fine-tuning of the classification approach is
data-dependent and must be reviewed on a case-by-case basis.
Based on the tests in this research, over-sampled wavelet
coefficients at a resolution of 16 x 16 appear to offer the best
trade-off between classification accuracy and computational
efficiency. Combined with normalisation in the image domain
(test 3-7b), this set of parameters produced fewer errors in
classifying the buildings but at the expense of a higher false
positive rate. The classifier produced by this test also achieved
the highest recognition rates with the additional test data.
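As an illustration of the pre-processing discussed above, the sketch below normalises an image patch in the image domain and reduces it to wavelet coefficients at a resolution of 16 x 16. It is a simplified stand-in, not the procedure used in these tests: it applies a plain, critically sampled Haar transform rather than over-sampled wavelets, and it uses zero-mean, unit-variance scaling as one possible form of normalisation. The function names and the synthetic patch are hypothetical.

```python
def normalise(patch):
    """Scale grey values to zero mean and unit variance (assumed normalisation)."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a completely flat patch
    return [[(v - mean) / std for v in row] for row in patch]


def haar_step(img):
    """One level of a 2D orthonormal Haar transform on a square image.

    Returns the half-size approximation and three half-size detail bands.
    """
    n = len(img) // 2
    approx = [[0.0] * n for _ in range(n)]
    d1 = [[0.0] * n for _ in range(n)]
    d2 = [[0.0] * n for _ in range(n)]
    d3 = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            approx[i][j] = (a + b + c + d) / 2  # local average
            d1[i][j] = (a - b + c - d) / 2      # difference between columns
            d2[i][j] = (a + b - c - d) / 2      # difference between rows
            d3[i][j] = (a - b - c + d) / 2      # diagonal difference
    return approx, d1, d2, d3


def haar_features(patch, size=16):
    """Return the three detail bands at size x size as one flat feature vector."""
    n = len(patch)
    while n > size:
        if n % 2:
            raise ValueError("patch side must be size times a power of two")
        n //= 2
    if n != size or len(patch) == size:
        raise ValueError("patch side must be size times a power of two")
    approx, bands = normalise(patch), []
    while len(approx) > size:
        approx, *bands = haar_step(approx)
    return [v for band in bands for row in band for v in row]


# Synthetic 256 x 256 patch, used only to exercise the code.
patch = [[(i * j) % 256 for j in range(256)] for i in range(256)]
features = haar_features(patch)
print(len(features))  # 3 bands of 16 x 16 coefficients = 768 values
```

Each halving of the resolution divides the number of coefficients per band by four, which is the source of the trade-off between the detail retained and the cost of training.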
7. CONCLUSION
Machine learning methods have been used successfully in
several image processing and machine vision domains. The
research presented here extends this to building recognition for
photogrammetric applications.
An important aspect of machine learning in vision applications
is to extract a representative set of characteristics from the
image. The multi-resolution approach of wavelets achieves this
effectively and leads to a solution that is computationally
feasible. One potential limitation of the wavelet approach is that
for large training sets, the coefficient files can become very
large and unwieldy.
With sufficient training data, an effective classification model
can be obtained using a polynomial kernel with the support
vector machine. This classification model performs well in out-
of-sample testing and has a success rate of more than 80% in
correctly recognizing building image patches.
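For readers unfamiliar with the polynomial kernel, the sketch below shows its general form, K(x, y) = (gamma * x.y + coef0)^degree, and how a trained support vector machine uses it to classify a feature vector. The degree, gamma and coef0 values, the support vectors, the weights and the bias are all hypothetical. They illustrate the mechanics only and are not the parameters used in this research.

```python
def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def decision_value(x, support_vectors, alphas, labels, bias, **kernel_args):
    """SVM decision function: sum_i alpha_i * y_i * K(sv_i, x) + bias."""
    return sum(
        alpha * label * polynomial_kernel(sv, x, **kernel_args)
        for sv, alpha, label in zip(support_vectors, alphas, labels)
    ) + bias


# Hypothetical trained model with two support vectors.
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
alphas = [0.5, 0.5]
labels = [+1, -1]  # +1 = building, -1 = non-building
bias = 0.0

value = decision_value([2.0, 0.0], support_vectors, alphas, labels, bias)
print("building" if value > 0 else "non-building", value)
```

The number of kernel evaluations reported in Table 1 corresponds to calls of a function like `polynomial_kernel` during training, which is why that count is a useful proxy for computational cost.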
While these techniques cannot satisfy the metric requirements
of photogrammetry, they can provide useful starting points and
heuristic filters in the area of automated object extraction. With
some refinement, this method could be incorporated into a
building extraction system as a heuristic filter, ensuring that
only image patches with a high probability of containing a
building are passed to the extraction algorithms.
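The use of the classifier as a heuristic filter can be sketched as follows. This is a minimal illustration under stated assumptions: `filter_patches` and `toy_score` are hypothetical names, and `toy_score` is a placeholder standing in for the decision value of a trained classifier.

```python
def filter_patches(patches, score_fn, threshold=0.0):
    """Keep only the patches whose classifier score exceeds the threshold."""
    return [patch for patch in patches if score_fn(patch) > threshold]


def toy_score(patch):
    """Placeholder score (mean value minus 0.5), not a real classifier."""
    return sum(patch) / len(patch) - 0.5


# Three hypothetical patches, each reduced to a short feature vector.
patches = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.3], [0.6, 0.6, 0.6]]
candidates = filter_patches(patches, toy_score)
print(len(candidates))  # 2
```

Raising the threshold would pass fewer non-building patches to the extraction stage at the cost of rejecting more genuine buildings, which mirrors the trade-off between missed buildings and false positives seen in the test results.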