Photogrammetric computer vision: Papers accepted on the basis of peer-review full manuscripts

Kalliany, R.; Leberl, Franz W.
ISPRS Commission III, Vol. 34, Part 3A, "Photogrammetric Computer Vision", Graz, 2002
  
Some authors propose the use of texture for the detection of
regions with trees in urban environments, for example (Zhang,
2001), who reports good results with local directional variance.
Local variance was also used in (Straub et al., 2000) for the
detection of vegetation areas at coarse scale. In (Baumgartner et
al., 1997) the authors propose the use of Laws filters (Haralick
and Shapiro, 1992) for the detection of textured regions. These
features are often used as an additional channel in a
pixel-by-pixel classification; see, for example,
(Kunz and Vögtle, 1999), (Straub et al., 2000), (Zhang, 2001).
In summary, different texture parameters have been
investigated for the automatic detection of vegetation in aerial
or satellite imagery. However, "The texture discrimination techniques
are for the most part ad hoc" (Haralick and Shapiro, 1992,
p. 453), which is perhaps still true today. Standardization may
overcome this problem. We have therefore investigated the
suitability of the MPEG-7 Homogeneous Texture Descriptor
(HTD) (Man Ro et al., 2001) for the detection of
vegetation in high resolution aerial images.
3. THE MPEG-7 HOMOGENEOUS TEXTURE DESCRIPTOR
MPEG-7 is an ISO/IEC standard developed by MPEG (Moving
Picture Experts Group). The formal name of MPEG-7 is
"Multimedia Content Description Interface". The standard
provides a set of standardized tools to describe multimedia
content; Geographic Information Systems and Remote Sensing
are mentioned as possible application domains. Low-level
image features such as texture and color are described in the
part "MPEG-7 Visual". Three texture descriptors are
recommended: the HTD, the edge histogram descriptor (EHD),
and the perceptual browsing descriptor (PBD). The HTD is intended
to allow images to be classified with high precision (Wu et al., 2001).
The detection of objects like "parking lots" or "vegetation
patterns" is also directly mentioned in the standard (ISO/IEC,
2001). The MPEG-7 Homogeneous Texture Descriptor (HTD)
is described in detail in (Man Ro et al., 2001). In this section we
give a short summary of the filter bank used, the extracted
feature vector and the proposed measures of similarity.
3.1 Extraction of Textural Features 
The features are extracted with a Gabor filter bank. In the
radial direction the feature channels are spaced on an octave
scale; center frequencies and octave bandwidths are given in
Table 1. Thirty feature channels C_i (five radial bands times six
orientations) are defined for the features, which are extracted
with Gabor filters in six orientations
θ = 0°, 30°, 60°, 90°, 120°, 150° with an angular bandwidth of
30°. This frequency layout is motivated by the human visual
system: psychophysical experiments have confirmed that
the brain decomposes the incoming signal into sub-bands in
frequency and orientation (Branden Lambrecht, 1996).
  
Radial index r        0     1     2     3     4
Center frequency ω    3/4   3/8   3/16  3/32  3/64
Octave bandwidth      1/2   1/4   1/8   1/16  1/32

Table 1: Parameters of the Gabor filter bank in radial direction
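
The layout of the filter bank can be enumerated in a few lines. The following Python sketch is illustrative only: the function name `gabor_bank_layout`, the dictionary keys and the channel numbering (1 to 30, starting at the highest center frequency and running through the six orientations first) are assumptions made for this example. It also assumes that center frequency and bandwidth start at 3/4 and 1/2 and halve with every radial step.

```python
from fractions import Fraction

ORIENTATIONS_DEG = (0, 30, 60, 90, 120, 150)  # six orientations, 30 degrees apart
N_RADIAL = 5                                  # radial indices 0..4


def gabor_bank_layout():
    """Return one dict per feature channel of the 5 x 6 filter bank."""
    channels = []
    for radial in range(N_RADIAL):
        # Octave spacing: both values halve with every radial step.
        center = Fraction(3, 4) / 2 ** radial
        bandwidth = Fraction(1, 2) / 2 ** radial
        for angular, theta in enumerate(ORIENTATIONS_DEG):
            channels.append({
                "id": 6 * radial + angular + 1,  # assumed numbering, 1..30
                "center_frequency": center,
                "octave_bandwidth": bandwidth,
                "orientation_deg": theta,
            })
    return channels


if __name__ == "__main__":
    for ch in gabor_bank_layout():
        print(ch["id"], ch["center_frequency"],
              ch["octave_bandwidth"], ch["orientation_deg"])
```

Exact fractions are used instead of floats so that the printed values can be compared directly with the entries of Table 1.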
  
Figure 1: Frequency layout of the Gabor filter bank with the IDs of
the feature channels C_i; depicted is the Gabor filter for
feature channel Co.
3.2 The Feature Vector 
The mean value f_DC and standard deviation f_SD of the original
image, as well as the energies e_i and their standard deviations d_i
of the Gabor filtered image, constitute the feature vector TD, as
follows:
TD = [f_{DC}, f_{SD}, e_1, e_2, \ldots, e_{30}, d_1, d_2, \ldots, d_{30}]
All 62 elements together are called the enhancement layer; the
reduced feature vector without the d_i values is called the base layer.
The feature vector can be computed in advance and stored together
with the image. Quantizing each TD value to 1 byte leads to a
total length of the texture descriptor of 62 bytes for the
enhancement layer and 32 bytes for the base layer. If
one uses tiles of 128 × 128 pixels (16384 bytes, assuming one byte
per pixel) and stores only the base layer of the feature vector,
the storage required for the feature layer is 32/16384 = 1/512 of
the uncompressed size of one image channel (62/16384, about 1/264,
for the enhancement layer).
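
The byte counts and storage ratios can be checked with simple arithmetic. The Python sketch below is an illustration, not part of the standard: the names `build_td` and `storage_ratio` are hypothetical, and one byte per pixel is assumed for the uncompressed image channel.

```python
N_CHANNELS = 30  # feature channels of the Gabor filter bank


def build_td(f_dc, f_sd, energies, deviations=None):
    """Assemble the feature vector TD.

    Without `deviations` the result is the base layer (32 values),
    with them it is the enhancement layer (62 values).
    """
    if len(energies) != N_CHANNELS:
        raise ValueError("expected 30 energy values")
    td = [f_dc, f_sd, *energies]
    if deviations is not None:
        if len(deviations) != N_CHANNELS:
            raise ValueError("expected 30 deviation values")
        td.extend(deviations)
    return td


def storage_ratio(tile_size, descriptor_bytes, bytes_per_pixel=1):
    """Descriptor size relative to one uncompressed square tile.

    Assumes each TD value is quantized to one byte, so the number of
    descriptor bytes equals the number of TD elements.
    """
    return descriptor_bytes / (tile_size * tile_size * bytes_per_pixel)


if __name__ == "__main__":
    base = build_td(0, 0, [0] * N_CHANNELS)
    enhancement = build_td(0, 0, [0] * N_CHANNELS, [0] * N_CHANNELS)
    print(len(base), 1 / storage_ratio(128, len(base)))
    print(len(enhancement), 1 / storage_ratio(128, len(enhancement)))
```

For 128 × 128 tiles the inverse ratio is exactly 512 for the base layer and roughly 264 for the enhancement layer.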
3.3 Measurement of Similarity 
The similarity d(R, J) between two images R and J can be
measured with the Euclidean distance in feature space. Once the
feature vector TD is computed, the following similarity
measurements can be performed. In the following, TD_R is
the feature vector of the reference image R (in the domain of
image retrieval the term query image is more usual), the index J of
TD_J denotes the feature vector of another image J, and the index
k marks the k-th element of the feature vector.
d(R, J) = distance(TD_R, TD_J) = \sum_k \left| w(k) \, \frac{TD_R(k) - TD_J(k)}{\alpha(k)} \right|    (1)
The weighting factor w(k) of the k-th TD value and the
normalization values α(k) depend on the images used. In
(Man Ro et al., 2001) it is proposed to use values from a
reference database.
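
A distance of this kind can be sketched in Python as follows. The exact form is an assumption: the sketch sums the weighted, normalized absolute differences of the TD elements, with w(k) and α(k) defaulting to 1 when no reference database is available. The names `htd_distance` and `rank_by_similarity` are hypothetical and chosen for this example only.

```python
def htd_distance(td_r, td_j, weights=None, alpha=None):
    """Distance between the feature vectors of reference R and image J.

    Assumed form: sum over k of |w(k) * (TD_R(k) - TD_J(k)) / alpha(k)|.
    """
    if len(td_r) != len(td_j):
        raise ValueError("feature vectors must have the same length")
    n = len(td_r)
    weights = [1.0] * n if weights is None else weights
    alpha = [1.0] * n if alpha is None else alpha
    return sum(abs(w * (r - j) / a)
               for r, j, w, a in zip(td_r, td_j, weights, alpha))


def rank_by_similarity(td_r, candidates, weights=None, alpha=None):
    """Sort (name, TD) pairs by increasing distance to the reference."""
    return sorted(
        candidates,
        key=lambda item: htd_distance(td_r, item[1], weights, alpha),
    )


if __name__ == "__main__":
    reference = [0, 0]
    tiles = [("far", [10, 10]), ("near", [1, 1])]
    for name, td in rank_by_similarity(reference, tiles):
        print(name, htd_distance(reference, td))
```

A small distance means that the texture of image J is similar to that of the reference image R, so the first entry of the ranked list is the best match.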