Full text: Papers accepted on the basis of peer-review full manuscripts (Part A)

  
ISPRS Commission II, Vol.34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
  
A MACHINE LEARNING APPROACH TO BUILDING RECOGNITION 
IN AERIAL PHOTOGRAPHS 
C.J. Bellman a, *, M.R. Shortis b
a Department of Geospatial Science, RMIT University, Melbourne 3000, Australia - Chris.Bellman@rmit.edu.au
b Department of Geomatics, University of Melbourne, Parkville 3052, Australia - M.Shortis@unimelb.edu.au
Commission III, Working Group III/4 
KEY WORDS: Building detection, Learning, Classification, Multiresolution 
ABSTRACT: 
Object recognition and extraction have been of considerable research interest in digital photogrammetry for many years. As a result, 
many conventional tasks have been successfully automated but, despite some advances, the automatic extraction of buildings remains 
an open research question. Machine learning techniques have received little attention from the photogrammetric community in their 
search for methods of object extraction. While these techniques cannot provide all the answers, they do offer some potential benefits 
in the early stages of visual processing. This paper presents the results of an investigation into the use of machine learning in the form 
of a support vector machine. The images are characterized using wavelet analysis to provide multi-resolution data for the machine 
learning phase. 
1. INTRODUCTION
The advent of digital imagery has resulted in the automation of 
many traditional photogrammetric tasks. 
However, the automatic extraction of man-made features such
as buildings and roads is far from solved. These objects are
attractive for automatic extraction, as they have distinct 
characteristics such as parallelism and orthogonality that can be 
used in the processing of symbolic image descriptions. Despite 
an extensive research effort, the problem remains poorly 
understood (Schenk, 2000). 
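To make the role of parallelism and orthogonality concrete, the following minimal sketch labels a pair of line segments by the angle between them. It is an illustration only, not the method of this paper: the segment representation (pairs of end points), the function names and the 5 degree tolerance are all assumptions introduced here.

```python
import math


def orientation(segment):
    """Undirected orientation of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = segment
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_between(seg_a, seg_b):
    """Smallest angle between two undirected segments, in [0, 90] degrees."""
    diff = abs(orientation(seg_a) - orientation(seg_b))
    return min(diff, 180.0 - diff)


def relation(seg_a, seg_b, tol_deg=5.0):
    """Label a pair of segments; tol_deg is an assumed tolerance."""
    angle = angle_between(seg_a, seg_b)
    if angle <= tol_deg:
        return "parallel"
    if angle >= 90.0 - tol_deg:
        return "orthogonal"
    return "other"


# Hypothetical roof outline edges, given as ((x1, y1), (x2, y2)).
eave = ((0, 0), (10, 0))
ridge = ((10, 5), (0, 5))   # opposite direction, same orientation
gable = ((3, 0), (3, 8))

print(relation(eave, ridge))  # parallel
print(relation(eave, gable))  # orthogonal
```

Orientations are taken modulo 180 degrees so that the direction in which a segment was digitised does not affect the result.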
Object extraction from digital images consists of two main
tasks:
• identification of a feature, which involves image
interpretation and feature classification, and
• tracking the feature precisely by determining its
outline or centreline
(Agouris et al., 1998).
Although many algorithms have been developed, none can
claim to be fully automated. Most rely on some form of operator
guidance to determine areas of interest or to provide seed points
on features.
This paper addresses the issue of determining areas of interest 
(candidate patches) using machine learning techniques. 
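As a rough illustration of how a candidate patch might be characterised at several resolutions before it is passed to a learning machine, the sketch below computes a simple multi-level Haar decomposition. This is a sketch under stated assumptions: the choice of the Haar wavelet, the averaging normalisation, and the function names haar_level and multiresolution_features are introduced here for illustration and are not taken from the paper. The classifier itself is omitted.

```python
def haar_level(patch):
    """One level of a 2-D Haar-style transform (averaging normalisation).

    patch is a list of rows with even height and width. Returns four
    half-resolution bands: approx, row_diff, col_diff, diag.
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(patch), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(patch[0]), 2):
            p, q = patch[r][c], patch[r][c + 1]
            s, t = patch[r + 1][c], patch[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4.0)  # local average
            h_row.append((p + q - s - t) / 4.0)  # top minus bottom
            v_row.append((p - q + s - t) / 4.0)  # left minus right
            d_row.append((p - q - s + t) / 4.0)  # diagonal difference
        approx.append(a_row)
        row_diff.append(h_row)
        col_diff.append(v_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def multiresolution_features(patch, levels):
    """Flatten detail bands from every level, then the coarsest approximation."""
    features = []
    current = patch
    for _ in range(levels):
        current, row_diff, col_diff, diag = haar_level(current)
        for band in (row_diff, col_diff, diag):
            features.extend(value for row in band for value in row)
    features.extend(value for row in current for value in row)
    return features


# Hypothetical 4 x 4 patch: dark on the left, bright on the right.
patch = [[0, 0, 8, 8] for _ in range(4)]
features = multiresolution_features(patch, levels=2)
print(len(features))   # 16
print(features[12:])   # [0.0, -4.0, 0.0, 4.0]
```

In this example the vertical brightness boundary is invisible at the finest level, because it falls between 2 x 2 blocks, but appears as a left-minus-right coefficient of -4.0 at the second level. This is the kind of scale-dependent information a multi-resolution description is meant to capture.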
Most photogrammetric applications for building recognition
have followed the principle, established by Marr (1982),
that there are three levels of visual information processing. The 
first, low-level processing, involves the extraction of features in 
the image such as edges, points, and blobs that appear as some 
form of discontinuity in the image. 
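The low-level stage can be illustrated with a gradient-based edge detector. The sketch below uses 3 x 3 Sobel kernels on a small grey-level array; the choice of the Sobel operator, the threshold value and the function names are assumptions made for illustration, and the text does not prescribe a particular operator.

```python
def sobel_magnitude(image):
    """Gradient magnitude from 3 x 3 Sobel kernels; border pixels stay 0."""
    rows, cols = len(image), len(image[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    value = image[r + i - 1][c + j - 1]
                    gx += kx[i][j] * value
                    gy += ky[i][j] * value
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_pixels(image, threshold):
    """(row, col) positions whose gradient magnitude exceeds the threshold."""
    magnitude = sobel_magnitude(image)
    return [(r, c)
            for r, row in enumerate(magnitude)
            for c, m in enumerate(row)
            if m > threshold]


# Hypothetical 5 x 6 image with a vertical step between columns 2 and 3.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
print(edge_pixels(image, threshold=200))
# [(1, 2), (1, 3), (2, 2), (2, 3), (3, 2), (3, 3)]
```

Even on this clean step the detector responds on both sides of the boundary. On a real aerial image the same operator also responds to shadows, vegetation and texture, and the later grouping stages have to sort through these responses.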
Intermediate-level processing involves the grouping and 
connection of these image primitives based on some measure of 
similarity or geometry. This forms the primal sketch (Marr,
1982) and is the basis for testing object hypotheses against rules
that describe object characteristics. Many approaches are
possible for establishing these rules such as semantic modelling
(Stilla & Michaelsen, 1997), similarity measures (Henricsson,
1996), perceptual organisation (Sarkar & Boyer, 1993) or
topology (Gruen & Dan, 1997).
* Corresponding author
High-level processing usually involves extracting information 
associated with an object that is not directly apparent in the 
image (Ullman, 1996, p. 4). This could be determining what the
object is (recognition), or establishing its exact shape and size 
(reconstruction). In computer vision, recognition is the most 
common problem pursued. In photogrammetry, reconstruction 
of the geometry of features is more typically required. 
1.1 Candidate regions 
Despite the advances that have occurred in automated object 
extraction, most photogrammetric applications require some 
form of operator assistance to establish candidate image regions 
for potential object extraction. This is usually necessary to 
reduce the search space and make the problem tractable. Low-level
processing strategies such as edge detection create a large
number of artefacts that the mid-level grouping strategies find 
difficult to resolve. 
This problem cannot be solved simply by segmentation, as this 
is difficult for an aerial image (Nevatia et al., 1998). An image
contains many objects, only some of which should be modelled. 
The objects of interest may be partially occluded, poorly 
illuminated or have significant variations in texture. 
In the case of building extraction, Henricsson (1996) solves the 
candidate problem in a pragmatic way. Rather than finding 
candidate regions using a computational process, the operator 
identifies candidate regions of the same building in multiple 
images. The computer system then extracts the edge features, 
groups these based on several measures of similarity and 
computes a 3-dimensional reconstruction of the building. 