International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV. Part B2. Istanbul 2004
$D^{2}_{j+1} = H_r G_c C_j$ (4)
$D^{3}_{j+1} = G_r G_c C_j$ (5)
In practice, when texture and structural features are extracted
from a digital image, the operator $G_r H_c$ in equation 3
smoothes the column vectors and finds the differences that exist
between objects along the rows. In this way, $G_r H_c$ can detect
changes at object edges in the horizontal direction, the operator
in equation 4 can detect changes at object edges in the vertical
direction, and the operator in equation 5 can detect changes in
the diagonal direction.
To extract texture and structural features of ground objects in an
image with S. Mallat's pyramid decomposition algorithm, a
wavelet base must be selected along with the number of levels (N)
to be decomposed. In this particular case, Daubechies'
orthonormal wavelet (Daubechies 1998) was adopted as the
wavelet base, for the following reasons:
(a) Orthonormal, which has direct ratio to the size
of the support set (2N).
(b) Continuous degenerate matrix.
(c) Its smoothness and large degenerate matrix
mean that it is better at differentiating
frequencies. At the same time, like a low-
pass filter, it keeps the low frequency
component of the original image without
obvious blurring.
From the point of view of filtering, it is preferable to maximize
N, but in image decomposition this results in more boundary
effects: the higher the decomposition level, the greater the
boundary effects on the image. Moreover, computation time
grows with N. Through experimentation, N = 3 was selected, as
it gave the better results.
The application of a two-dimensional Discrete Wavelet
Transform (DWT) expands the image $I$ into a sum of four
components at N resolution levels. In essence, the wavelet
transform operation is separable and consists of two
one-dimensional operations along the rows and the columns of $I$:
i. From the first row to the $m$-th row in $I$, a 1D DWT is
performed to generate $I \otimes H_r$ and $I \otimes G_r$.
ii. From the first column to the $m$-th column in
$I \otimes H_r$ and $I \otimes G_r$, a 1D DWT is performed to
generate four components: $I \otimes (G_r H_c)$,
$I \otimes (H_r G_c)$, $I \otimes (G_r G_c)$
and $I \otimes (H_r H_c)$.
The original image is divided into four components after the 2D
DWT. These comprise one low frequency component
$I^{(1)}_{L} = \{ I \otimes (H_r H_c) \}$ and three high frequency
components:
$I^{(1)}_{H} = \{ I \otimes (G_r H_c),\ I \otimes (H_r G_c),\ I \otimes (G_r G_c) \}$.
The $I \otimes (G_r H_c)$ component contains horizontal edge
information, the $I \otimes (H_r G_c)$ component contains vertical
edge information, and the $I \otimes (G_r G_c)$ component contains
diagonal edge information. Steps i and ii are repeated with the
low frequency image $I \otimes (H_r H_c)$ to produce four
components at each new level. An N-level pyramid decomposition
therefore results in 3N+1 components: three high frequency
components per level plus the final low frequency component.
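The row-then-column procedure and its repetition on the low frequency image can be sketched as follows. This is only an illustration under stated assumptions: it uses the two-tap Haar filter pair instead of the Daubechies wavelet adopted in the paper, it assumes image sides that are divisible by 2 at every level, and all function names are hypothetical.

```python
import math

# Assumption: Haar analysis filters, chosen for brevity.
# The paper itself uses a Daubechies orthonormal wavelet.
S = 1.0 / math.sqrt(2.0)


def dwt1d(signal):
    """One-level Haar DWT of an even-length sequence: (low, high)."""
    low = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def dwt_rows(image):
    """Apply the 1D DWT to every row: (low-pass half, high-pass half)."""
    pairs = [dwt1d(row) for row in image]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def dwt_columns(image):
    """Apply the 1D DWT to every column by transposing around dwt_rows."""
    low, high = dwt_rows(transpose(image))
    return transpose(low), transpose(high)


def dwt2d(image):
    """Step i filters along rows; step ii filters both results along columns."""
    row_low, row_high = dwt_rows(image)
    ll, lh = dwt_columns(row_low)    # low/low and row-low, column-high
    hl, hh = dwt_columns(row_high)   # row-high, column-low and high/high
    return ll, hl, lh, hh


def pyramid(image, levels):
    """Repeat dwt2d on the low frequency component at each level."""
    low, high_components = image, []
    for _ in range(levels):
        low, hl, lh, hh = dwt2d(low)
        high_components.extend([hl, lh, hh])
    return [low] + high_components


# An 8x8 image whose intensity changes only along the row direction.
img = [[0, 0, 0, 1, 1, 1, 1, 1] for _ in range(8)]
print(len(pyramid(img, 3)))  # 10, i.e. 3N + 1 with N = 3
```

For this test image only the component that is high-pass along the rows and low-pass along the columns is non-zero at level 1, which matches the directional behaviour described above.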
At each level, there is one low frequency component and three
high frequency components. The three high frequency
components contain textural and structural information in the
horizontal, vertical and diagonal directions respectively at each
level of decomposition. Recognition features were constructed
from the components as follows: the original digital image was
decomposed into one low frequency component (E1) and three
high frequency components denoted as E2, E3 and E4.
Decomposing the low frequency component from level 1 created
another set of three high frequency components at level 2,
denoted E5, E6 and E7 respectively, along with one low
frequency component (now E1). Finally, the DWT was applied
to E1 at level 2, generating four components at level 3 (E1 and
three high frequency components denoted E8, E9 and E10
respectively). Using the 9 high frequency components,
recognition features in the diagonal, vertical and horizontal
directions were integrated as follows:
A1 = E4 / (E2 + E3) (6)
A2 = E7 / (E5 + E6) (7)
A3 = E10 / (E8 + E9) (8)
where A1, A2 and A3 represent, at each decomposition level, the
ratio of the energy of ground-object edges in the diagonal
direction to the sum of the energies of ground-object edges
in the horizontal and vertical directions at that scale. These
three recognition features are invariant under rotations of the
image by 90, 180 and 270 degrees, since such rotations at most
exchange the horizontal and vertical components, whose energies
enter only as a sum.
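The ratio and its rotation behaviour can be checked numerically with a small sketch. The assumptions here are not taken from the paper: "energy" is taken to be the sum of squared coefficients of a component, a one-level Haar transform stands in for the Daubechies wavelet, and the function names are hypothetical. Only the level-1 ratio (the analogue of A1) is computed. A2 and A3 would follow the same pattern on the deeper levels.

```python
def haar_high_energies(image):
    """Energies of the three one-level Haar high frequency components.

    Assumption: energy is the sum of squared coefficients. Each 2x2
    block [[a, b], [c, d]] contributes one coefficient per component.
    """
    e_h = e_v = e_d = 0.0
    for i in range(0, len(image), 2):
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            e_h += ((a - b + c - d) / 2) ** 2  # difference along rows
            e_v += ((a + b - c - d) / 2) ** 2  # difference along columns
            e_d += ((a - b - c + d) / 2) ** 2  # difference along both
    return e_h, e_v, e_d


def diagonal_ratio(image):
    """Diagonal energy divided by the sum of the other two energies."""
    e_h, e_v, e_d = haar_high_energies(image)
    denominator = e_h + e_v
    return e_d / denominator if denominator > 0 else 0.0


def rotate90(image):
    """Rotate a matrix by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


img = [[1, 2, 3, 4],
       [5, 7, 9, 2],
       [3, 1, 4, 1],
       [5, 9, 2, 6]]

rotated = img
for angle in (90, 180, 270):
    rotated = rotate90(rotated)
    # The two ratios agree up to floating point rounding.
    print(angle, diagonal_ratio(rotated), diagonal_ratio(img))
```

With the Haar filters, a 90 degree rotation swaps the two non-diagonal energies exactly, so the ratio is unchanged. For longer, asymmetric filters and real boundary handling the agreement may only be approximate.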
The Improved Backpropagation Neural Network Classifier
An Artificial Neural Network (ANN), also referred to as a
Neural Network, is ‘an interconnected assembly of simple
processing elements, units or nodes, whose functionality is
loosely based on the animal neuron. The processing ability of
the network is stored in the inter-unit connection strengths, or
weights, obtained by a process of adaptation to, or learning
from, a set of training patterns’ (Gurney 1997). Neural
Networks are often used for cluster analysis and image
classification.
ANN models include Back Propagation, Counter Propagation,
Hopfield Nets, Adaptive Resonance Theory (ART) nets,
Kohonen Self-Organization Feature Maps (SOFM), etc. In this
study, the Back Propagation (BP) model, based on a feedforward
multi-layer network, was adopted. The BP model is
applicable to a wide class of problems (Paola and Schowengerdt
1995). In the BP model, the training algorithm is based on
Back Propagation (Rumelhart et al. 1986), in which error
signals propagate backwards through the network from the
output nodes to the input nodes. In the training process, each
iteration is divided into two stages after the image data are
fed to the input layer.
The outline of the BP algorithm consists of the following steps: