ISPRS Commission III, Vol. 34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
Some authors propose the use of texture for the detection of
regions with trees in urban environments, for example Zhang
(2001), who uses local directional variance with good results.
Local variance was also used by Straub et al. (2000) for the
detection of vegetation areas at coarse scale. Baumgartner et
al. (1997) propose the use of Laws filters (Haralick
and Shapiro, 1992) for the detection of textured regions. These
features are often used as an additional channel in the
framework of a pixel-by-pixel classification; see for example
(Kunz and Vögtle, 1999), (Straub et al., 2000), (Zhang, 2001).
In summary, different texture parameters have been
investigated for the automatic detection of vegetation in aerial
or satellite imagery. However, "the texture discrimination techniques
are for the most part ad hoc" (Haralick and Shapiro, 1992,
p. 453), which is perhaps still true today. Standardization may
overcome this problem. We have therefore investigated the
suitability of the MPEG-7 Homogeneous Texture Descriptor
(HTD) (Man Ro et al., 2001) for the detection of
vegetation in high-resolution aerial images.
3. THE MPEG-7 HOMOGENEOUS TEXTURE
DESCRIPTOR
MPEG-7 is an ISO/IEC standard developed by MPEG (Moving
Picture Experts Group). The formal name of MPEG-7 is
"Multimedia Content Description Interface". The standard
provides a set of standardized tools to describe multimedia
content; Geographic Information Systems and Remote Sensing
are mentioned as possible application domains. Low-level
features of images such as texture and color are described in the
part "MPEG-7 Visual". Three texture descriptors are
recommended: the HTD, the edge histogram descriptor (EHD),
and the perceptual browsing descriptor (PBD). The HTD is intended
to allow images to be classified with high precision (Wu et al., 2001).
The detection of objects like "parking lots" or "vegetation
patterns" is also directly mentioned in the standard (ISO/IEC,
2001). The MPEG-7 Homogeneous Texture Descriptor (HTD)
is described in detail in (Man Ro et al., 2001). In this section we
give a short summary of the filter bank used, the extracted
feature vector and the proposed measures of similarity.
3.1 Extraction of Textural Features
The features are extracted with a Gabor filter bank. In the
radial direction the feature channels are spaced in octave
scale; center frequencies and octave bandwidths are given in
Table 1. Thirty feature channels C_i are defined (five radial
bands times six orientations); the features are extracted with
Gabor filters in the six orientations
θ = 0°, 30°, 60°, 90°, 120°, 150° with an angular bandwidth of
30°. This frequency layout is motivated by the human visual
system: psychophysical experiments have confirmed that
the brain decomposes the incoming signal into sub-bands in
frequency and orientation (Branden Lambrecht, 1996).
Radial index r         0     1     2      3      4
Center frequency ω     3/4   3/8   3/16   3/32   3/64
Octave bandwidth B     1/2   1/4   1/8    1/16   1/32
Table 1: Parameters of the Gabor filter bank in radial direction
Figure 1: Frequency layout of the Gabor filter bank with IDs of
the feature channels C_i; the Gabor filter for feature channel Co
is depicted.
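The channel layout described above can be enumerated in a few lines. The following is a minimal sketch, not the reference implementation of the standard: it assumes that the center frequencies halve per octave starting at 3/4 and the bandwidths starting at 1/2 (as in Table 1), and that the channels are numbered ring by ring as i = 6·r + s + 1. The numbering rule and the names `Channel` and `htd_channels` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    index: int               # channel ID i of C_i (numbering rule is an assumption)
    radial: int              # radial index r, 0..4
    center_frequency: float  # normalized center frequency of the ring
    octave_bandwidth: float  # radial bandwidth of the ring
    orientation_deg: float   # filter orientation in degrees


def htd_channels(n_radial=5, n_angular=6):
    """List the 30 feature channels of the filter bank layout."""
    channels = []
    for r in range(n_radial):
        for s in range(n_angular):
            channels.append(Channel(
                index=n_angular * r + s + 1,        # assumed ring-by-ring numbering
                radial=r,
                center_frequency=0.75 * 2.0 ** -r,  # 3/4, 3/8, ..., 3/64
                octave_bandwidth=0.5 * 2.0 ** -r,   # 1/2, 1/4, ..., 1/32
                orientation_deg=180.0 / n_angular * s,  # 0, 30, ..., 150
            ))
    return channels
```

Calling `htd_channels()` yields 30 entries; the actual Gabor filters would then be built from the center frequency, bandwidth and orientation of each entry.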
3.2 The Feature Vector
The mean value f_DC and standard deviation f_SD of the original
image, as well as the energies e_i and their standard deviations d_i
of the Gabor-filtered image, constitute the feature vector TD, as
follows:
TD = [f_DC, f_SD, e_1, e_2, e_3, ..., e_30, d_1, d_2, d_3, ..., d_30]
All 62 elements together are called the enhancement layer; the
reduced feature vector without the d_i values is called the base layer.
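The layout of the feature vector can be illustrated with a short sketch. It assumes that the 30 energies e_i and the 30 deviations d_i have already been computed from the Gabor-filtered image (how they are computed is not shown here), and it only assembles the 62 elements in the order given above. The function names `assemble_td` and `base_layer` are hypothetical.

```python
import numpy as np


def assemble_td(image, energies, deviations):
    """Build TD = [f_DC, f_SD, e_1..e_30, d_1..d_30] (enhancement layer)."""
    image = np.asarray(image, dtype=float)
    e = np.asarray(energies, dtype=float)
    d = np.asarray(deviations, dtype=float)
    if e.shape != (30,) or d.shape != (30,):
        raise ValueError("expected 30 energies and 30 deviations")
    f_dc = image.mean()  # mean value of the original image
    f_sd = image.std()   # standard deviation of the original image
    return np.concatenate(([f_dc, f_sd], e, d))


def base_layer(td):
    """Drop the d_i values: keep f_DC, f_SD and the 30 energies."""
    return np.asarray(td)[:32]
```

The enhancement layer has 62 elements and the base layer 32, which matches the descriptor lengths discussed in the text.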
The feature vector can be computed in advance
and stored together with the
image. Quantizing each TD value to 1 byte gives a
total length of the texture descriptor of 62 bytes for the
enhancement layer and 32 bytes for the base layer. If
one uses tiles with a size of 128 × 128 pixels and stores only the
base layer of the feature vector, the storage needed for the
feature layer is 1/512 of the uncompressed size of one
image channel, assuming one byte per pixel (about 1/264 for the
enhancement layer).
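The storage figures can be checked with a small calculation. The sketch below assumes one byte per pixel for the uncompressed image channel; the function name `storage_ratio` is hypothetical.

```python
def storage_ratio(tile_size, descriptor_bytes, bytes_per_pixel=1):
    """Ratio of uncompressed tile size to descriptor size (both in bytes)."""
    tile_bytes = tile_size * tile_size * bytes_per_pixel
    return tile_bytes / descriptor_bytes


base = storage_ratio(128, 32)         # base layer, 32 bytes per tile
enhancement = storage_ratio(128, 62)  # enhancement layer, 62 bytes per tile
```

A tile of 128 × 128 pixels occupies 16384 bytes, so the base layer takes 1/512 of that and the enhancement layer roughly 1/264.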
3.3 Measurement of Similarity
The similarity d(R, J) between two images R and J can be
measured with the Euclidean distance in feature space. Once the
feature vector TD has been computed, the following similarity
measurements can be performed. In the following, TD_R is
the feature vector of the reference image R (in the domain of
image retrieval the term query image is more usual), TD_J
is the feature vector of another image J, and the index
k marks the k-th element of the feature vector.
d(R, J) = distance(TD_R, TD_J) = Σ_k | w(k) · (TD_R(k) − TD_J(k)) / α(k) |    (1)
The weighting factor w(k) of the k-th TD value and the
normalization values α(k) depend on the images used. In
(Man Ro et al., 2001) it is proposed to use values from a
reference database.
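A weighted and normalized distance of this kind can be sketched as follows. This is an illustration under stated assumptions, not the matching procedure of the standard: the weights w(k) and normalization values α(k) default to 1 here because the real values depend on a reference database, and the parameter `order` (hypothetical, like the function name `htd_distance`) selects the Euclidean distance named in the text (`order=2`) or a sum of absolute differences (`order=1`).

```python
import numpy as np


def htd_distance(td_r, td_j, w=None, alpha=None, order=2):
    """Distance between two feature vectors TD_R and TD_J.

    Each difference TD_R(k) - TD_J(k) is scaled by w(k) / alpha(k)
    before the norm is taken. w and alpha default to all ones,
    which is an assumption made only for this sketch.
    """
    td_r = np.asarray(td_r, dtype=float)
    td_j = np.asarray(td_j, dtype=float)
    if td_r.shape != td_j.shape:
        raise ValueError("feature vectors must have the same length")
    w = np.ones_like(td_r) if w is None else np.asarray(w, dtype=float)
    alpha = np.ones_like(td_r) if alpha is None else np.asarray(alpha, dtype=float)
    diff = w * (td_r - td_j) / alpha
    # ord=2 gives the Euclidean norm, ord=1 the sum of absolute values
    return float(np.linalg.norm(diff, ord=order))
```

A smaller value of `htd_distance` means the two image tiles are more similar in texture; identical feature vectors give a distance of zero.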