Proceedings, XXth congress

Altan, M. Orhan
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV. Part B2. Istanbul 2004 
$D^{2}_{j+1} = H_r G_c C_j$ (4)
$D^{3}_{j+1} = G_r G_c C_j$ (5)
In practice, when texture and structural features are extracted
from a digital image, the operator $G_r H_c$ in equation 3
smooths the column vectors and picks out the differences that exist
between objects along the rows, so it can detect changes at the
edges of objects in the horizontal direction. Likewise,
the operator in equation 4 can detect changes at the edges of
objects in the vertical direction, while the operator in equation 5
can detect changes in the diagonal direction.
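The directional behaviour described above can be illustrated with a small sketch. It assumes the two-tap Haar filters as the simplest orthonormal stand-in for the Daubechies filters, and the helper names (`filt_rows`, `filt_cols`, `energy`) are hypothetical, not taken from the paper. A test image with a single horizontal edge should respond only in the component that takes differences down the columns.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar filter coefficient (assumed stand-in for Daubechies)

def filt_rows(img, sign):
    """Filter each row with (1, sign)/sqrt(2) and keep every second sample.
    sign=+1 acts as the low-pass H, sign=-1 as the high-pass G."""
    return [[(row[k] + sign * row[k + 1]) * S for k in range(0, len(row), 2)]
            for row in img]

def filt_cols(img, sign):
    """Apply the same filtering down each column (via transposition)."""
    transposed = [list(c) for c in zip(*img)]
    return [list(c) for c in zip(*filt_rows(transposed, sign))]

def energy(img):
    """Sum of squared coefficients (an assumed definition of energy)."""
    return sum(v * v for row in img for v in row)

# 4x4 test image with one horizontal edge between row 0 and row 1.
edge = [[0.0] * 4] + [[1.0] * 4 for _ in range(3)]

across_rows = filt_cols(filt_rows(edge, +1), -1)  # smooth along rows, difference down columns
across_cols = filt_cols(filt_rows(edge, -1), +1)  # difference along rows, smooth down columns
diagonal = filt_cols(filt_rows(edge, -1), -1)     # difference in both directions

print(energy(across_rows), energy(across_cols), energy(diagonal))
```

Only the first component carries energy for this image. Transposing the image moves the response to the second component, and a checkerboard pattern moves it to the third.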
To extract texture and structural features of ground objects in an
image with S. Mallat's pyramid decomposition algorithm, a
wavelet base is selected along with the number of levels (N) to
be decomposed. In this particular case, Daubechies's
orthonormal wavelet (Daubechies 1998) was adopted as the
wavelet base, for the following reasons:
(a) Orthonormal, which has direct ratio to the size 
of the support set (2N). 
(b) Continuous degenerate matrix. 
(c) The smoothness and large degenerate matrix
mean that it is better at differentiating
frequencies. At the same time, like a low-pass
filter, it keeps the low frequency
component of the original image without
obvious blurring.
From the point of view of filtering, it is preferable to maximize
N, but in image decomposition this results in more boundary
effects: the higher the decomposition level, the greater the
boundary effects on the image. Moreover, computation time
grows with N. Through experimentation, N = 3 was
selected, as it gave the better results.
The application of a two dimensional Discrete Wavelet
Transform (DWT) expands the image I into a sum of four
components at N resolution levels. In essence, the wavelet
transform operation is separable and consists of two one
dimensional operations along the rows and the columns of I:
i. From the first row to the m-th row in I, a 1D
DWT is performed to generate $I \otimes H_r$ and
$I \otimes G_r$.
ii. From the first column to the m-th column in
$I \otimes H_r$ and $I \otimes G_r$, a 1D DWT is performed to
generate four components
$I \otimes (G_r H_c)$, $I \otimes (H_r G_c)$, $I \otimes (G_r G_c)$
and $I \otimes (H_r H_c)$.
The original image is divided into four components after 2D
DWT. These comprise one low frequency component
$I^{(1)}_L = \{ I \otimes (H_r H_c) \}$ and three high frequency
components:
$I^{(1)}_H = \{ I \otimes (G_r H_c),\ I \otimes (H_r G_c),\ I \otimes (G_r G_c) \}$.
The $I \otimes (G_r H_c)$ component contains horizontal edge
information, the $I \otimes (H_r G_c)$ component contains vertical edge
information, and the $I \otimes (G_r G_c)$ component contains diagonal edge
information. Steps i and ii are repeated with the low frequency
image $I^{(1)}_L$ to produce four components at each new level.
So, N level pyramid decomposition results in 3N+1
components.
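The two-step procedure and the 3N+1 component count can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Haar filters in place of the Daubechies wavelet and an image whose side lengths are divisible by 2^N, and the function names (`dwt_1d`, `dwt_rows`, `dwt_2d`, `pyramid`) are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar coefficient (assumed stand-in for Daubechies)

def dwt_1d(signal):
    """One-level 1D Haar DWT: returns the low-pass (H) and high-pass (G) halves."""
    low = [(signal[k] + signal[k + 1]) * S for k in range(0, len(signal) - 1, 2)]
    high = [(signal[k] - signal[k + 1]) * S for k in range(0, len(signal) - 1, 2)]
    return low, high

def transpose(m):
    return [list(col) for col in zip(*m)]

def dwt_rows(image):
    """Apply the 1D DWT to every row; returns (low image, high image)."""
    pairs = [dwt_1d(row) for row in image]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def dwt_2d(image):
    """One level of the separable 2D DWT: returns (low, [three detail images])."""
    row_low, row_high = dwt_rows(image)                      # step i: along the rows
    # step ii: along the columns, done by transposing, filtering rows, transposing back
    ll, lh = (transpose(x) for x in dwt_rows(transpose(row_low)))
    hl, hh = (transpose(x) for x in dwt_rows(transpose(row_high)))
    return ll, [lh, hl, hh]

def pyramid(image, levels):
    """N-level pyramid: re-decompose the low frequency image at each level."""
    details, low = [], image
    for _ in range(levels):
        low, d = dwt_2d(low)
        details.append(d)
    return low, details

image = [[float((r * 8 + c) % 7) for c in range(8)] for r in range(8)]
low, details = pyramid(image, 3)
print(1 + sum(len(d) for d in details))  # number of components: 3N + 1 with N = 3
```

Because only the low frequency image is decomposed again, each level adds three detail images while the single low frequency image is replaced by a smaller one.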
At each level, there is one low frequency component and three
high frequency components. The three high frequency
components contain textural and structural information in the
horizontal, vertical and diagonal directions respectively at each
level of decomposition. Recognition features were constructed
from the components as follows: the original digital image was
decomposed into one low frequency component (E1) and three
high frequency components denoted as E2, E3 and E4.
Decomposing the low frequency component at level 1 created another
set of three high frequency components at level 2,
denoted as E5, E6 and E7 respectively, along with one low
frequency component (now E1). Finally, DWT was performed
on E1 at level 2, generating four components at level 3 (E1, and
three high frequency components denoted as E8,
E9 and E10 respectively). Making use of the 9 high frequency components,
recognition features in the diagonal, vertical and horizontal
directions were integrated as follows.
A1 = E4/(E2 + E3) (6)
A2 = E7/(E5 + E6) (7)
A3 = E10/(E8 + E9) (8)
where A1, A2 and A3 represent, at each decomposition level, the
ratio of the energy of the edges of ground objects in the diagonal
direction to the sum of the energies of the edges of ground objects
in the horizontal and vertical directions at that scale. These three
recognition features are invariant under rotations of
the image through 90, 180, and 270 degrees.
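The rotation invariance can be checked numerically with a small sketch. Several things here are assumptions rather than the paper's method: the Haar wavelet replaces the Daubechies wavelet (with Haar the invariance is exact, because a 90 degree rotation only swaps the two axis-aligned detail components), the energy of a component is taken to be the sum of its squared coefficients, and the names `haar_level`, `recognition_features` and `rot90` are hypothetical.

```python
def haar_level(img):
    """One 2D Haar level on 2x2 blocks: low image plus three detail images."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    low, d_row, d_col, d_diag = ([[0.0] * cols for _ in range(rows)] for _ in range(4))
    for i in range(rows):
        for j in range(cols):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            low[i][j] = (a + b + c + d) / 2.0
            d_row[i][j] = (a + b - c - d) / 2.0   # difference between the two rows
            d_col[i][j] = (a - b + c - d) / 2.0   # difference between the two columns
            d_diag[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return low, (d_row, d_col, d_diag)

def energy(comp):
    """Energy of a component, assumed to be the sum of squared coefficients."""
    return sum(v * v for row in comp for v in row)

def recognition_features(img, levels=3):
    """Per level: diagonal energy divided by the sum of the two axis-aligned energies."""
    feats, low = [], img
    for _ in range(levels):
        low, (d_row, d_col, d_diag) = haar_level(low)
        feats.append(energy(d_diag) / (energy(d_row) + energy(d_col)))
    return feats

def rot90(img):
    """Rotate a list-of-lists image by 90 degrees clockwise."""
    return [list(r) for r in zip(*img[::-1])]

img = [[float(r * c + 2 * r + (c % 3)) for c in range(8)] for r in range(8)]
print(recognition_features(img))
print(recognition_features(rot90(img)))  # agrees with the line above up to rounding
```

The denominator is symmetric in the two axis-aligned energies, which is why swapping them under rotation leaves each ratio unchanged. An image with no horizontal or vertical detail at some level would make the denominator zero, so a real implementation would need to guard against that case.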
The Improved Backpropagation Neural Network Classifier 
An Artificial Neural Network (ANN), also referred to as a
Neural Network, is ‘an interconnected assembly of simple
processing elements, units or nodes, whose functionality is 
loosely based on the animal neuron. The processing ability of 
the network is stored in the inter-unit connection strengths, or 
weights, obtained by a process of adaptation to, or learning 
from, a set of training patterns’ (Gurney 1997). Neural 
Networks are often used for cluster analysis and image 
classification. 
ANN models include Back Propagation, Counter Propagation,
Hopfield Nets, Adaptive Resonance Theory (ART) nets,
Kohonen Self-Organization Feature Maps (SOFM) etc. In this
study, the feedforward multi-layer network based Back
Propagation model (BP) was adopted. The BP model is
applicable to a wide class of problems (Paola and Schowengerdt
1995). In the BP model, the training algorithm to be developed
is based on Back Propagation (Rumelhart et al. 1986), in which
the error signals propagate backwards from the output nodes to the
input nodes through the network. In the training process each iteration is divided
into two stages after the image data are input to the input layer.
The outline of the BP algorithm consists of the following steps: