Photogrammetric computer vision: Papers accepted on the basis of peer-review full manuscripts

Kalliany, R.; Leberl, Franz W.
  
ISPRS Commission III, Vol.34, Part 3A „Photogrammetric Computer Vision“, Graz, 2002 
  
  
  
Figure 2. (a) Sample building image patch and (b) corresponding 
wavelet representation of the patch in (a), decomposed to level 4 
(not over-sampled) 
3. SUPPORT VECTOR MACHINES 
The Support Vector Machine (SVM) is based on the principles 
of structural risk minimization (Vapnik, 1995). It has the 
attractive property that it minimizes a bound on the 
generalisation error. Because its training problem is a convex 
quadratic programme, it is also not subject to the problems of 
local minima that may occur with other classifiers such as 
multilayer perceptrons (MLP). 
Another property of the SVM is that its decision surface 
depends only on the inner product of the feature vectors. As a 
result, the inner product can be replaced by any symmetric 
positive-definite kernel (Cristianini & Shawe-Taylor, 2000). 
The use of a kernel function means that the mapping of the data 
into a higher dimensional feature space does not need to be 
determined as part of the solution, enabling the use of high 
dimensional space without addressing the mathematical 
complexity of such spaces. SVMs have been used successfully 
in applications for face detection (Osuna et al., 1997), character 
recognition (Schölkopf, 1997; Boser et al., 1992) and 
pedestrian detection (Papageorgiou et al., 1998). 
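The kernel substitution described above can be illustrated with a minimal sketch. It assumes a homogeneous degree-2 polynomial kernel on two-dimensional inputs, chosen only because its explicit feature map is short enough to write out; the function names are hypothetical and are not taken from the SVM package used in this paper. The point is that the kernel value equals the inner product in the higher-dimensional space without that space ever being constructed.

```python
import math


def dot(x, z):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit feature map for 2-D input whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x, z = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly2_kernel(x, z)                           # computed in input space
k_explicit = dot(poly2_features(x), poly2_features(z))    # computed in feature space
print(k_implicit, k_explicit)
```

The two values agree up to floating-point rounding, which is why a classifier that depends only on inner products can work in the mapped space at the cost of a kernel evaluation.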
4. TEST DATA 
A set of classification test data, in the form of square image 
patches, was extracted from three colour digitized aerial 
photographs. The photographs were originally acquired for a 
project over the city of Ballarat in Victoria. The photographs 
were recorded at a scale of 1:4000 and had been scanned on a 
photogrammetric scanner at a resolution of 15 microns. Each 
image patch was 256 by 256 pixels and contained either a single 
building or a non-building area of the image. Although some 
care was taken to centre the building within the image patch, the 
exact location of the building in the image patch varied. The 
orientation of the building within the image patches also varied. 
This led to a broader representation of the building class than if 
the buildings were carefully aligned in each image patch but 
created a more difficult classification problem. 
The classification test was based on a balanced test set of 100 
building images and 100 non-building images. Image 
coefficients were extracted using the wavelet process described 
in Section 2.1 above. A public domain Support Vector Machine 
(Joachims, 1998) was used to classify the image patches into 
building or non-building categories. 
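As a rough guide to what one patch covers on the ground, the figures above can be combined as follows. This is a sketch under stated assumptions: it treats the photographs as vertical, at the nominal 1:4000 scale over flat terrain, and the variable names are illustrative only.

```python
# Nominal values quoted in the text.
scale_denominator = 4000      # photo scale 1:4000
pixel_size_m = 15e-6          # 15 micron scanning resolution, in metres
patch_pixels = 256            # patch side length in pixels

# Ground sample distance: the ground footprint of one scanned pixel (assumes flat terrain).
gsd_m = pixel_size_m * scale_denominator

# Approximate ground footprint of one square image patch.
patch_side_m = gsd_m * patch_pixels

print(f"GSD ~ {gsd_m:.2f} m, patch side ~ {patch_side_m:.2f} m")
```

Under these assumptions each pixel covers about 6 cm on the ground and a patch spans roughly 15 m, which is consistent with a patch holding a single building.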
5. RESULTS 
The image patches were used to train the SVM classifier with 
several different kernels, including polynomial and sigmoidal 
kernels. However, the best results were obtained with a simple 
linear kernel with no bias. Of the two hundred image patches, 
only one patch was classified incorrectly. This was an image of 
a large swimming pool that was classified as a building. 
Although this result appeared to be very good, the confidence 
measures produced by the SVM training suggested a reliability 
of only 55%. This could be due to overfitting of the decision 
surface to the data. However, the reliability measures produced 
by the SVM are also known to be pessimistic (Joachims, 1998), 
due to the unbounded nature of the problem. To see if that was 
the case here, an extensive leave-one-out test was undertaken. 
This produced a revised reliability measure of 73%. Although 
the reliability estimate improved, this indicates there may still 
be some overfitting of the data. To test this further, 20 building 
image patches were withheld from the training data and the 
SVM was re-trained. The withheld patches were then classified 
by the new SVM. Of the 20 building patches, only 8 (40%) were 
classified as buildings. This result is similar to the original 
reliability estimate. In this case, the decision surface of the 
SVM is unlikely to generalize well to a broader set of data. This 
could be due to the small size of the training set. Further work is 
required to expand the size and scope of the training set to 
determine if a more generalized decision surface can be 
established. 
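The leave-one-out procedure referred to above can be sketched as follows. This is an illustration only: a nearest-centroid classifier stands in for the SVM so that the example stays self-contained, the toy data are invented, and none of the names correspond to the software actually used.

```python
def train_centroids(data):
    """Stand-in trainer (NOT an SVM): returns a nearest-centroid predictor."""
    sums, counts = {}, {}
    for x, y in data:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda y: dist2(centroids[y]))

    return predict


def leave_one_out_accuracy(data, train):
    """Train on all samples but one, test on the one left out, repeat for every sample."""
    correct = 0
    for i, (x, y) in enumerate(data):
        model = train(data[:i] + data[i + 1:])
        if model(x) == y:
            correct += 1
    return correct / len(data)


# Invented toy data: (feature vector, label), +1 = building, -1 = non-building.
toy = [((0.0, 0.1), -1), ((0.2, 0.0), -1), ((0.1, 0.3), -1),
       ((1.0, 0.9), 1), ((0.8, 1.1), 1), ((1.2, 1.0), 1)]
loo = leave_one_out_accuracy(toy, train_centroids)
print(loo)
```

Because every sample is classified by a model that never saw it, the resulting figure is a less optimistic estimate than the accuracy measured on the training patches themselves.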
5.1 Other Considerations 
In applying multi-resolution analysis, an appropriate set of 
resolutions must be chosen for the task at hand. In this case, a 
choice must be made between minimizing the amount of data 
that is fed to the classifier and retaining enough information 
about the original image that a sensible classification is 
possible. In this example, wavelets with supports of 16 and 32 
pixels were used, resulting in image coefficients of 16 x 16 and 
8 x 8 for each of the horizontal, vertical and diagonal wavelet 
functions. 
Other resolutions were tried, but at higher resolutions the 
number of coefficients expanded rapidly and there was no 
significant gain in the accuracy of the classification. At lower 
resolutions, too much information about the image was lost to 
enable definite classification. 
One limitation of this implementation is the size of the 
coefficient data set produced from the wavelet transform. As the 
wavelet transform is over-sampled, each image patch generates 
960 coefficients. The current algorithm makes no attempt to 
optimize the storage of these coefficients. 
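The figure of 960 coefficients follows from the grid sizes quoted earlier in this section, as the short check below shows. The variable names are illustrative. The observation that each grid side equals the patch size divided by the wavelet support (256/16 and 256/32) is an inference from the stated numbers rather than a description of the implementation.

```python
patch_pixels = 256            # patch side length in pixels
supports = (16, 32)           # wavelet supports in pixels
orientations = 3              # horizontal, vertical and diagonal wavelet functions

# Grid side per resolution; matches the 16 x 16 and 8 x 8 grids quoted in the text.
grid_sides = [patch_pixels // s for s in supports]

# Coefficients per orientation, summed over both resolutions: 256 + 64.
per_orientation = sum(g * g for g in grid_sides)

total_coefficients = orientations * per_orientation
print(grid_sides, per_orientation, total_coefficients)
```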