The network is trained by minimizing the error function E,

E = (1/2) Σ_k ( t_k − o_k )^2    (16)

where t_k is the target value and o_k the actual output of the k-th output node, the sum running over the number of output nodes. The weights and thresholds are then adjusted iteratively:

ΔW_ij(m) = η δ_j O_i + α ΔW_ij(m−1)    (18)

where η is the learning rate, ΔW_ij(m) is the weight change at the m-th iteration, and α is the momentum coefficient, which carries part of the previous weight change over into the present one. The thresholds θ_j are modified in the same way:

Δθ_j(m) = η δ_j + α Δθ_j(m−1)    (19)

W_ij(m+1) = W_ij(m) + ΔW_ij(m),   θ_j(m+1) = θ_j(m) + Δθ_j(m)    (20)

Here δ_j represents the change of the error at node j, so that the weights are adjusted in the direction in which the error decreases. The size of each adjustment is connected with the learning rate in (18) and (19) and with the momentum term accumulated over the iterations. In each cycle the learning process propagates the signal forward from input to output and the error function backward from output to input.
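As a minimal sketch of the update rule in Eqs. (18) and (19), assuming the conventional momentum form given above (the values of eta and alpha are illustrative, not taken from the paper):

import numpy as np

def momentum_step(W, grad, prev_dW, eta=0.5, alpha=0.9):
    # One update with momentum, Eq. (18): a gradient-descent step
    # plus a share (alpha) of the previous change, which damps oscillation.
    dW = -eta * grad + alpha * prev_dW
    return W + dW, dW

# Illustrative use on a 2 x 2 weight matrix:
W = np.zeros((2, 2))
dW = np.zeros_like(W)
grad = np.ones_like(W)          # stand-in for the gradient of E w.r.t. W
W, dW = momentum_step(W, grad, dW)

The same function applies unchanged to the thresholds, treating Δθ_j(m) of Eq. (19) as a one-dimensional weight change.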
The BP network trained in this way is driven by simple gradient descent, and the local minima of the error surface mean that it may not converge to the global optimum; it needs many trainings and lots of iterations, which weakens the learning effect. A way to globally optimize is the Simulated Annealing (SA) algorithm, put forward by S. Kirkpatrick et al. for the solution of combinatorial optimization problems. It borrows the ideas of energetics statistics and tries to simulate the annealing of a heated object so that we can find the global optimum solution.
Suppose S = {S_1, S_2, ..., S_n} is the set of all possible combinations (or states), and C: S → R is a non-negative objective function, so that C(S_i) ≥ 0 is the cost of the solution S_i. It is clear that the optimizing combination can then be described formally as finding S* ∈ S such that

C(S*) = min{ C(S_i) | S_i ∈ S }
Simulated Annealing proceeds mostly as follows:

Procedure SA
(1) S := S_0; k := 0; T_k := T_0;   /* S_0 is the initial state, T_0 is the initial value of the control parameter (temperature), C := C(S_0) */
(2) Repeat
(3)   Repeat
(4)     S_j := Generate(S);
(5)     If C(S_j) <= C(S) Then S := S_j;   /* S is the current state */
(6)     Else If Accept(j, S) Then S := S_j;
(7)   Until "inner-loop stop criterion"   /* the "inner-loop stop criterion" limits the number of iterations of the SA at the temperature T_k */
(8)   T_{k+1} := Update(T_k); k := k + 1;   /* Update(T_k) controls the velocity of the temperature's decline at each step */
(9) Until "final stop criterion"   /* the finish of SA */
In the above algorithm, Generate(S) in step (4) generates the next state S_j at random from the neighbourhood N of S. If C(S_j) <= C(S), then S_j is accepted as the new current state; otherwise S_j is accepted as the new current state only with some probability. The latter is the function of Accept(j, S).
Usually the function Accept proceeds as follows:

Procedure Accept(j, S)   /* called in step (6), only when C(S_j) > C(S) */
(1) If exp[ −(C(S_j) − C(S)) / T_k ] > random(0,1)
(2) Then Accept := True
    Else Accept := False
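As an illustration of Procedure SA and Procedure Accept together, the following Python sketch follows the steps above; the neighbourhood function, the geometric cooling schedule and the stop criteria are assumptions made for the example, not choices taken from the paper:

import math
import random

def simulated_annealing(s0, cost, generate, T0=1.0, cooling=0.95,
                        inner_loops=100, T_min=1e-4):
    s, c, T = s0, cost(s0), T0          # step (1): initial state and temperature
    while T > T_min:                    # step (9): "final stop criterion"
        for _ in range(inner_loops):    # step (7): "inner-loop stop criterion"
            s_j = generate(s)           # step (4): random neighbour of s
            c_j = cost(s_j)
            if c_j <= c:                # step (5): always accept an improvement
                s, c = s_j, c_j
            elif math.exp(-(c_j - c) / T) > random.random():
                s, c = s_j, c_j         # step (6): Metropolis acceptance
        T *= cooling                    # step (8): Update(T_k)
    return s

For instance, generate could perturb one component of the state at random; the exp term reproduces Procedure Accept exactly.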
The aim of using the Simulated Annealing algorithm is to achieve global optimization. In the error-reducing process, a certain degree of random disturbance can overcome the restriction of the local minima, while ensuring that the system stays clear of disturbance once it converges to the global optimum solution. This is precisely the problem that the Simulated Annealing algorithm settles.
For the aim of global optimization, the following function specifies the random disturbance:

W_i = W_i × ( 1 + Random(−1,+1) / loop_time )    (21)
where loop_time is the number of iterations and Random(−1,+1) is a random real number within the range −1 to +1. From the above statement, the Simulated Annealing algorithm can be seen as gradient descent with noise; when the temperature, which determines the noise intensity, is 0 (i.e. the number of iterations tends to infinity), it reduces to plain gradient descent.
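A direct transcription of Eq. (21) might look as follows; storing the weights as a NumPy array is an assumption made for the example:

import numpy as np

def disturb_weights(W, loop_time):
    # Eq. (21): multiplicative noise whose amplitude shrinks as the
    # number of iterations (loop_time) grows, so late training is calm.
    noise = np.random.uniform(-1.0, 1.0, size=W.shape)
    return W * (1.0 + noise / loop_time)

Applied after each ordinary BP update, this plays the role of the temperature-controlled noise described above.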
4. APPLICATION TO SAR IMAGE
Image classification belongs to the division of patterns in eigen space. If it is supposed that the existing samples x1, x2, x3, x4, ..., xn in an image belong to certain categories C1, C2, C3, C4, ..., Cm (m < n), it is possible to select n samples to extract the features of each ground target. The goal of establishing supervised samples is then to make use of multispectral features, couple them with texture and structural features, and use them for training the BP nets. In preparation for the classification of the entire 224 by 224 scene, six input fields comprising the training pixels were used: the three original channels (polarization combinations L-HH, L-HV and C-HV) and the three energy components (1, 2 and 3) derived from the wavelet decomposition. These are referred to as the target samples. Thus, the BP nets have six nodes in the input layer, and three nodes comprise the output layer based upon the desired classification (poplar, bushes and background).
Broadly speaking, the number of nodes in the hidden layer is arbitrary, although general guidelines exist, e.g. (Lippmann, 1987). In general, the more nodes in the hidden layer, the better the result of the image classification, but it takes a longer time for the network to learn the necessary knowledge for the classification, and often results in a reduction of the network's ability to generalize. The problem is therefore achieving a balance between accuracy and the time required for training.
Through experimentation, four nodes were defined for the hidden layer, since this was found to generate an optimal classification. In order to train the BP nets with the target samples, the input data were rescaled to comply with the limits of the Sigmoid activation function, set to 0.9 and 0.1 respectively. Table 1 shows the possible responses of the output layer processing element. Data for training the network were obtained by defining a 10 × 10 pixel window (100 samples of input data of each type) from the first three channels (polarization combinations L-HH, L-HV and C-HV). For the second three channels, the decomposed elements (1, 2 and 3), the window must contain sufficient resolution to preserve the essential information: considering the characteristics of texture and structure in high-resolution SAR images and the requirement of a three-level DWT, a window size of 32 × 32 pixels was adopted.
Class          Target Output
               O_1    O_2    O_3
Poplar Trees   0.9    0.1    0.1
Bushes         0.1    0.9    0.1
Background     0.1    0.1    0.9

Table 1. Response of the output layer processing element
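As a sketch of the data preparation described above (the exact rescaling formula is an assumption; the paper only states that the data were made to comply with the Sigmoid limits 0.9 and 0.1):

import numpy as np

def rescale(x, lo=0.1, hi=0.9):
    # Linearly map the input features into the working range of the
    # Sigmoid (assumes x is not constant).
    x = np.asarray(x, dtype=float)
    return lo + (hi - lo) * (x - x.min()) / (x.max() - x.min())

# Target vectors of Table 1, one per output class.
TARGETS = {
    "poplar trees": [0.9, 0.1, 0.1],
    "bushes":       [0.1, 0.9, 0.1],
    "background":   [0.1, 0.1, 0.9],
}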
In the training process, an iteration is divided into two stages after the data are input into the input layer. First, the vector of the hidden-layer neurons is computed by the Sigmoid activation function; then the vector of the output-layer neurons is computed by the Sigmoid activation function. Second, the error between the observed output and the desired output is calculated at the