$$\mathrm{MSE}(\hat{w}_{\gamma}) < \mathrm{MSE}(\hat{w}_{0}). \qquad (34)$$
However, it should be noted that the parameter $\gamma$ which minimizes the MSE depends on the unknown parameter $w$.
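To make these two points concrete, the following is a minimal numerical sketch (not from the paper; a scalar shrinkage estimator is assumed as the simplest stand-in for the regularized weight estimator $\hat{w}_{\gamma}$). It shows that some $\gamma > 0$ gives a smaller MSE than $\gamma = 0$, while the minimizing $\gamma$ changes with the unknown true $w$:

```python
import numpy as np

# Monte Carlo sketch of Eq. (34) for the assumed scalar shrinkage estimator
# w_hat(gamma) = y / (1 + gamma), with y = w + unit-variance noise. Some
# gamma > 0 yields a smaller MSE than gamma = 0, but the minimizing gamma
# (theoretically sigma^2 / w^2) depends on the unknown true w.
rng = np.random.default_rng(0)
sigma = 1.0
gammas = np.linspace(0.0, 3.0, 301)

for w_true in (1.0, 2.0):                        # two hypothetical true parameters
    y = w_true + sigma * rng.standard_normal(20_000)
    w_hat = y[None, :] / (1.0 + gammas[:, None])  # shrunken estimates per gamma
    mse = ((w_hat - w_true) ** 2).mean(axis=1)    # Monte Carlo MSE per gamma
    print(f"w={w_true}: MSE(0)={mse[0]:.3f}, "
          f"min MSE={mse.min():.3f} at gamma={gammas[mse.argmin()]:.2f}")
```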
5.2 Application of Tikhonov’s regularization to the training algorithm
By using Tikhonov’s regularization method, the generalization problem of an LNN can be solved analogously by optimizing the function $E + \gamma \cdot z(w)$. We adopt the square of the Euclidean norm of the weights $w_{ih}$, $w_{hj}$ as the smoothing function $z(w)$. We then obtain the modified error function
$$E(\gamma) = \frac{1}{2}\sum_{k=1}^{N}\sum_{j=1}^{J}\left[\, y_j(x_k, w) - d_j(x_k) \,\right]^2 + \frac{\gamma}{2}\left( \sum_{h=1}^{H}\sum_{i=1}^{I} w_{ih}^2 + \sum_{j=1}^{J}\sum_{h=1}^{H} w_{hj}^2 \right), \qquad (35)$$

where $N$ is the number of training samples.
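As a concrete reading of Eq. (35), the following minimal sketch evaluates the modified error for a single-hidden-layer network; the activation functions (logistic hidden units, linear outputs) are assumptions, since the surrounding text does not restate them:

```python
import numpy as np

def forward(X, W_ih, W_hj):
    # Assumed forward pass: logistic hidden units, linear outputs.
    # X: (N, I) inputs; W_ih: (I, H) input-to-hidden; W_hj: (H, J) hidden-to-output.
    hidden = 1.0 / (1.0 + np.exp(-X @ W_ih))
    return hidden @ W_hj                 # y_j(x_k, w) for every sample k

def regularized_error(X, D, W_ih, W_hj, gamma):
    # Eq. (35): squared-error data term plus (gamma/2) * ||w||^2 penalty.
    Y = forward(X, W_ih, W_hj)
    data_term = 0.5 * np.sum((Y - D) ** 2)
    penalty = 0.5 * gamma * (np.sum(W_ih ** 2) + np.sum(W_hj ** 2))
    return data_term + penalty
```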
The rule for changing the values of the weights in the back-propagation algorithm is also modified as (Amerikian and Nishimura (1995) and Mehrotra et al. (1997))

$$w_{ih}^{t+1} = w_{ih}^{t} + \eta \, \Delta w_{ih}, \qquad (36)$$

$$w_{hj}^{t+1} = w_{hj}^{t} + \eta \, \Delta w_{hj}, \qquad (37)$$

$$\Delta w_{ih} = -\frac{\partial E(\gamma)}{\partial w_{ih}} = -\frac{\partial E}{\partial w_{ih}} - \gamma \, w_{ih}, \qquad (38)$$

$$\Delta w_{hj} = -\frac{\partial E(\gamma)}{\partial w_{hj}} = -\frac{\partial E}{\partial w_{hj}} - \gamma \, w_{hj}, \qquad (39)$$

or (36), (37) and

$$\Delta w_{ih} = -\frac{\partial E_k(\gamma)}{\partial w_{ih}} = -\frac{\partial E_k}{\partial w_{ih}} - \gamma \, w_{ih}, \qquad (40)$$

$$\Delta w_{hj} = -\frac{\partial E_k(\gamma)}{\partial w_{hj}} = -\frac{\partial E_k}{\partial w_{hj}} - \gamma \, w_{hj}, \qquad (41)$$

where

$$E_k(\gamma) = \frac{1}{2}\sum_{j=1}^{J}\left[\, y_j(x_k, w) - d_j(x_k) \,\right]^2 + \frac{\gamma}{2}\left( \sum_{h=1}^{H}\sum_{i=1}^{I} w_{ih}^2 + \sum_{j=1}^{J}\sum_{h=1}^{H} w_{hj}^2 \right). \qquad (42)$$

A modified back-propagation algorithm based on Tikhonov’s regularization for improving the generalization ability of an LNN is called weight decay in some of the literature (Krogh and Hertz (1992)).

Needless to say, an ordinary back-propagation algorithm is a special case of this algorithm in which the penalty parameter $\gamma$ is zero. If we adopt an optimal regularization parameter $\gamma$, we obtain better weights in the sense that the expected residual sum of squares may be smaller. However, as was explained in the previous section, selecting the best regularization parameter depends on information about the value of the true parameter. Several methods have therefore been proposed to select an appropriate regularization parameter. One of the most popular is to determine the value of the regularization parameter by testing how it performs on a validation data set.
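The following sketch mirrors the update rules of Eqs. (36)-(39) and the validation-based selection of $\gamma$ described above; `backprop_grads`, `train_fn`, and `val_error_fn` are hypothetical routines, not the authors’ code:

```python
def weight_decay_step(W_ih, W_hj, X, D, eta, gamma, backprop_grads):
    # backprop_grads is a hypothetical routine returning dE/dW_ih and
    # dE/dW_hj of the unregularized error E via ordinary back-propagation.
    g_ih, g_hj = backprop_grads(X, D, W_ih, W_hj)
    W_ih = W_ih - eta * (g_ih + gamma * W_ih)   # Eqs. (36) and (38)
    W_hj = W_hj - eta * (g_hj + gamma * W_hj)   # Eqs. (37) and (39)
    return W_ih, W_hj

def select_gamma(candidates, train_fn, val_error_fn):
    # Train once per candidate gamma and keep the value with the lowest
    # error on a held-out validation set, as described above.
    scored = [(val_error_fn(train_fn(g)), g) for g in candidates]
    return min(scored)[1]
```

Using the per-sample errors $E_k(\gamma)$ of Eqs. (40)-(42) instead of $E(\gamma)$ corresponds to applying the same step pattern-by-pattern rather than over the whole training set.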
6 CASE STUDY
In this section, we design an LNN based on the suggested procedure and apply it to a practical land cover classification problem.
6.1 Study area and data used
The study area is located in Nagakute town, Aichi prefecture, Japan. Airborne Multi-Spectral Scanner (MSS) digital image data of 256 x 256 pixels were acquired in 12 bands, with a pixel size of 6.25 m by 6.25 m. We classified the area into seven land cover classes using different model sets and tested the practical applicability of the suggested techniques for the generalization problem.
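For per-pixel classification, such an image is naturally arranged as one 12-band feature vector per pixel; the following sketch shows one possible arrangement (the file name and array layout are assumptions, since the paper only states the image dimensions):

```python
import numpy as np

# Hypothetical file; assumed array layout (rows, columns, bands).
image = np.load("nagakute_mss.npy")            # assumed shape (256, 256, 12)
X = image.reshape(-1, 12).astype(np.float64)   # one 12-band vector per pixel
print(X.shape)                                 # (65536, 12)
```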
6.2 Experimental results
Figure 2 shows the LNN-based remotely sensed image classification system, with 12 nodes in the input layer and 7 nodes in the output layer. The number of hidden-layer nodes was chosen based on AIC.
[Figure 2: The LNN for remotely sensed image classification. The airborne MSS image (Band 1 ... Band 12) feeds the input layer; a hidden layer connects to an output layer of 7 classes, including Forest, Paddy, Bare soil, Urban, River, and Water.]
Each pixel consists of 12 spectral measurements to be assigned to
one of seven classes. Training data of 1500 pixels and test data of
400 pixels were extracted from digital ground truth data with the
same pixel size (6.25 by 6.25 m) for all models.
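As noted above, the number of hidden-layer nodes was selected with AIC, $\mathrm{AIC} = -2 \ln L + 2p$, where $L$ is the maximized likelihood and $p$ the number of free parameters. The sketch below shows one way this could be evaluated for each competing network size; the paper does not restate the likelihood it uses, so a multinomial likelihood over predicted class probabilities and bias terms in the weight count are assumptions:

```python
import numpy as np

def n_weights(n_in, n_hidden, n_out, with_bias=True):
    # Free parameters of a single-hidden-layer network (bias terms assumed).
    b = 1 if with_bias else 0
    return (n_in + b) * n_hidden + (n_hidden + b) * n_out

def loglik(probs, labels):
    # Assumed multinomial log-likelihood from predicted class probabilities;
    # probs has shape (n_pixels, n_classes), labels holds true class indices.
    return float(np.sum(np.log(probs[np.arange(len(labels)), labels])))

def aic(log_likelihood, n_params):
    # Akaike's criterion: the candidate size minimizing AIC is selected.
    return -2.0 * log_likelihood + 2.0 * n_params
```

For a 12-input, 7-output network, each candidate hidden size would be trained, scored with `loglik` on the training data, and compared via `aic`.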
Starting with a model in which the numbers of hidden nodes and output nodes are both 7, eight competing size sets of LNN were compared. In each competing size set, the connection weights were pruned based on AIC. The results are shown in Figure 3, which indicates that AIC is minimized when the number of hidden nodes is 4, and the difference in AIC between