Full text: Papers accepted on the basis of peer-review full manuscripts (Part A)

ISPRS Commission III, Vol. 34, Part 3A, "Photogrammetric Computer Vision", Graz, 2002
  
Gulch et al. (1998) describe a Semi-automatic Building
Extraction System that has undergone extensive development
over a number of years. In this system, an operator interprets the
image contents and automated tools assist the operator in the
acquisition of 3-D shape data describing a building. In another
system (Michel et al., 1998), the operator need only provide a
seed point within the building roof-line. The building is then
extracted automatically using a pair of epipolar images.
In some situations, spatial information systems can be used to
provide existing semantic and positional data about objects in
an image (Agouris et al., 1998). A set of fuzzy operators is
used to select the relevant data and control the flow of
information from image to spatial database. The system offers
the potential of fully automatic updating of spatial databases but
relies on the existence of the database in the first place. It
does not use image data to determine regions of interest.
The use of auxiliary data, such as digital surface models
(Zimmermann, 2000) or multi-sensor and multi-spectral data
(Schenk, 2000), provides another means of determining regions
of interest in an image, but issues of data fusion add complexity
to the task.
There is much evidence from cognitive science that human 
processes for shape recognition are both rapid and approximate 
in many cases. Intuitively, this suggests that complicated and 
lengthy visual processing strategies are not complete models of 
our biological vision, particularly in the early stages of visual 
processing. 
2. A MACHINE LEARNING APPROACH 
Machine learning approaches, such as those based on neural
networks and support vector machines, are popular strategies
for image analysis and object recognition in many imaging
applications (Osuna et al., 1997; Li et al., 1998). In
photogrammetry, machine learning techniques have been
applied to road extraction (Sing and Sowmya 1998), knowledge
acquisition for building extraction (Englert 1998) and
land-use classification (Sester 1992). Neural techniques have
been used in feature extraction (Li et al. 1998, Zhang 1996),
stereo matching (Loung and Tan 1992) and image classification
(Israel and Kasabov 1997).
The recognition task is generally treated as a problem of
classification, with the correct classifications being learnt on the
basis of a number of training examples. Where the images are
small (i.e. have few pixels), a direct connection approach is
employed, where each image pixel is directly connected to a
node in the connectionist architecture. For typical aerial digital
imagery, such an approach is not feasible due to the
combinatorial explosion that would result. Some preprocessing
stage is required to extract key characteristics from the image
domain. Many strategies for preprocessing are available,
such as edge detection (Canny, 1986), log-polar forms
(Grossberg, 1988) and texture segmentation (Lee & Schenk,
1998).
Wavelet analysis is often associated with image compression
(Rabbani & Joshi, 2002) but also has useful properties for the
characterization of images. Of particular interest are the
multi-resolution representations that can be generated (Mallat, 1989).
Such an approach has been used successfully in a system to
recognize the presence of a pedestrian in a video image
(Papageorgiou et al., 1998; Poggio & Shelton, 1999) and for
face recognition (Osuna et al., 1997). There are strong
suggestions from psycho-physical experiments that mammalian
vision systems incorporate many of the characteristics of
wavelet transforms (Field, 1994).
2.1 Wavelet Processing 
Wavelet processing allows a signal to be described by its overall
shape plus a range of details from coarse to fine (Stollnitz et
al., 1995). In the case of image data, wavelets provide an
elegant means of describing the image content at varying levels
of resolution.
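The Haar wavelet is the simplest of the wavelet functions. It is a
step function on the interval from 0 to 1, where the wavelet function
\psi(x) is expressed as:
\[
\psi(x) := \begin{cases}
 1 & \text{for } 0 \le x < 1/2 \\
-1 & \text{for } 1/2 \le x < 1 \\
 0 & \text{otherwise}
\end{cases}
\tag{1}
\]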
The wavelet transform is computed by recursively averaging
and differencing the wavelet coefficients at each resolution. An
excellent practical illustration of the use of wavelets is provided
by Stollnitz et al. (1995).
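The averaging and differencing recursion can be illustrated with a minimal one-dimensional sketch. It assumes the unnormalized form of the Haar transform (pairwise averages and half-differences) and a signal whose length is a power of two; the function names `haar_step`, `haar_decompose` and `haar_reconstruct` are hypothetical and are not taken from the cited work.

```python
def haar_step(values):
    """One resolution level: pairwise averages and pairwise detail coefficients."""
    averages = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return averages, details


def haar_decompose(signal):
    """Unnormalized Haar decomposition (assumes a power-of-two length).

    Returns [overall average, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    coeffs = []
    current = list(signal)
    while len(current) > 1:
        current, details = haar_step(current)
        coeffs = details + coeffs  # coarser details go in front of finer ones
    return current + coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose by undoing each average/difference pair."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        finer = []
        for average, detail in zip(current, details):
            finer.extend([average + detail, average - detail])
        pos += len(current)
        current = finer
    return current


if __name__ == "__main__":
    signal = [9, 7, 3, 5]
    coeffs = haar_decompose(signal)
    print(coeffs)
    print(haar_reconstruct(coeffs))
```

The first coefficient carries the overall average of the signal, and the remaining coefficients carry the details from coarse to fine, so the original signal can be recovered exactly from the coefficient list.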
As a discrete wavelet transform (DWT), the Haar basis does not
produce a dense representation of the image and is not
sufficiently sensitive to translations of the image content. An
extension of the Haar wavelet can be applied that introduces a
quadruple density transform (Papageorgiou et al., 1998; Poggio
& Shelton, 1999). In a conventional application of the discrete
wavelet transform, the width of the support for the wavelet at
level n is $2^n$ and adjacent wavelets are separated by this
distance. In the quadruple density transform, this separation is
reduced to $\tfrac{1}{4} 2^n$ (Figure 1(c)). This oversamples the image to
create a rich set of basis functions that can be used to define
object patterns. An efficient method of computing the transform
is given in Oren et al. (1999).
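The effect of the reduced separation can be sketched in one dimension. The example below is an illustrative simplification, not the two-dimensional transform or the efficient method of the cited papers: it evaluates a Haar wavelet of support $2^n$ at shifts of $2^n$ (standard) and $\tfrac{1}{4} 2^n$ (over-sampled). The function names and the `density` parameter are hypothetical.

```python
def haar_response(signal, start, width):
    """Unnormalized Haar coefficient for a wavelet of support `width` at `start`:
    half the difference between the means of the first and second halves."""
    half = width // 2
    left = sum(signal[start:start + half]) / half
    right = sum(signal[start + half:start + width]) / half
    return (left - right) / 2


def haar_responses(signal, level, density=1):
    """Haar responses at one level with support 2**level.

    density=1 shifts the wavelet by its full support (standard sampling);
    density=4 shifts it by a quarter of the support (over-sampled).
    """
    width = 2 ** level
    if width < 2:
        raise ValueError("level must be at least 1")
    step = width // density
    if step < 1:
        raise ValueError("density too high for this level")
    last_start = len(signal) - width
    return [haar_response(signal, s, width) for s in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # A step pattern whose edges coincide with the standard wavelet boundaries.
    signal = [0, 0, 0, 0, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0]
    print(haar_responses(signal, level=2, density=1))
    print(haar_responses(signal, level=2, density=4))
```

In this example the standard sampling returns only zeros, because both edges of the pattern fall on wavelet boundaries, while the over-sampled version responds at the shifted positions. This is one way to see why the denser sampling is less dependent on where the image content happens to lie.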
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
[Figure 1 panels: (a) Haar wavelet from equation 1; (b) 2D wavelet functions for horizontal, vertical and diagonal features; (c) Sampling methods: standard and over-sampled]

Figure 1: The Haar wavelet characteristics
(after Papageorgiou et al., 1998).
  
 
	        