where $y_i$ is the actual output of node $i$, and $f$ is the
activation function. The most commonly used activation
function $f$ is the sigmoid function. The sigmoid function
most often used for ANNs (and in this study) is the logistic
function given below:

    y_i = f(u_i) = \frac{1}{1 + \exp(-u_i)}                (5)
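As an illustration of equation (5), the logistic function can be sketched in a few lines of Python. The function name `logistic` and the sign-based branching (used only to avoid floating-point overflow for large negative net inputs) are assumptions of this sketch and are not taken from SNNS.

```python
import math

def logistic(u):
    """Logistic sigmoid f(u) = 1 / (1 + exp(-u)), as in equation (5)."""
    # Branch on the sign of u so that exp() never gets a large positive argument.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    z = math.exp(u)
    return z / (1.0 + z)

# The output always lies between 0 and 1, with f(0) = 0.5.
midpoint = logistic(0.0)
```

Because the output is bounded by 0 and 1, teaching outputs such as 0.0 and 0.9 lie within the range the output nodes can approach.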
The measure of error is the sum squared error (SSE) of the
learning function, calculated by equation (6), which should be
minimized:

    SSE = \sum_{p} \sum_{k} (d_{pk} - y_{pk})^2            (6)

where $d_{pk}$ and $y_{pk}$ are the teaching and actual outputs
of output node $k$ on pattern $p$, respectively. The back-
propagation algorithm attempts to minimize the SSE iteratively.
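Equation (6) can be checked on a small example. The sketch below is only illustrative: the function name `sum_squared_error` and the two patterns with three output nodes are made-up values, not data from this study.

```python
def sum_squared_error(targets, outputs):
    """SSE over all patterns p and output nodes k: sum of (d_pk - y_pk)^2."""
    total = 0.0
    for d_p, y_p in zip(targets, outputs):
        for d_pk, y_pk in zip(d_p, y_p):
            total += (d_pk - y_pk) ** 2
    return total

# Two hypothetical patterns with three output nodes each (made-up values).
targets = [[0.9, 0.0, 0.0], [0.0, 0.9, 0.0]]
outputs = [[0.8, 0.1, 0.0], [0.2, 0.7, 0.1]]
sse = sum_squared_error(targets, outputs)  # about 0.11
```

The first pattern contributes 0.01 + 0.01 + 0 and the second 0.04 + 0.04 + 0.01, so the SSE grows with both the number of patterns and the number of output nodes.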
2.3 The Method
The ANN simulator used in the present study is SNNS (Stuttgart
Neural Network Simulator), developed by the University of
Stuttgart (Martin et al., 1998). A feed-forward ANN with the
back-propagation learning algorithm, SSE for error estimation,
and the randomize-weights function (the weights and biases are
initialized with distributed random values) is used. To
investigate the best ANN architecture for the development of
the classification algorithm, the 400 patterns were randomly
divided into two parts, one for training and one for testing.
In all datasets, as indicated previously, the input brightness
temperatures are normalized between 0 and 1 using equation (2),
and for both the training and testing pattern files the
teaching outputs are represented by 0.0 and 0.9. If the output
value of pattern p for class l is >= 0.5, the pattern is
considered as class l; otherwise it is considered as not
classified.
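The decision rule above can be sketched as follows. Several points are assumptions of this sketch rather than statements from the text: the teaching vector is taken to hold 0.9 at the true class and 0.0 elsewhere, and when more than one output reaches the threshold the largest output is taken. The names `teaching_vector` and `decide` are hypothetical.

```python
THRESHOLD = 0.5  # decision threshold quoted in the text
N_CLASSES = 8    # eight weather classes

def teaching_vector(class_label, n_classes=N_CLASSES):
    """Target vector for a 1-based class label: 0.9 at that class, 0.0 elsewhere
    (assumed layout of the teaching outputs)."""
    return [0.9 if k + 1 == class_label else 0.0 for k in range(n_classes)]

def decide(outputs, threshold=THRESHOLD):
    """Return the 1-based class label, or None for 'not classified'."""
    # Assumption: the node with the largest output is the candidate class.
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    if outputs[best] >= threshold:
        return best + 1
    return None

label = decide([0.1, 0.2, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0])  # class 3
```

A pattern whose outputs all stay below 0.5 returns `None`, which corresponds to the "not classified" category.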
The network training consists of 20000 iterations through the
training patterns. At the beginning of training, all weights and
biases are set to independent random values between -1 and +1.
This procedure is repeated several times, each time with a
different set of random initial weights and biases and a
different learning parameter (between 0.2 and 0.9). Note that in
these runs the number of nodes in each hidden layer is kept
constant. The objective is to reach the global minimum, or the
deepest possible local minimum, of the error surface. To avoid
over-fitting and arrive at the optimal network architecture,
this procedure is repeated for different network architectures.
The architecture that yields the lowest SSE on the training and
testing datasets is considered optimal for the chosen number of
hidden nodes and hidden layers. In our case, the optimal network
configuration, whose error is the smallest overall, had 2 hidden
layers with 8 nodes in each layer (5-8-8-8).
IAPRS & SIS, Vol.34, Part 7, “Resource and Environmental Monitoring”, Hyderabad, India, 2002
The program was run for 20000 iterations, and learning was
carried out on all 200 training patterns. For overall
performance, the network was tested with 200 test patterns that
it had never seen before. In training, the SSE drops fairly
rapidly over the first few hundred iterations, after which the
performance improves more slowly; this gives some indication of
the effect of the number of iterations on the SSE. The SSE for
all 200 testing patterns is 43.4.
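The restart procedure described above can be sketched in plain Python. This is not the SNNS implementation: the online (per-pattern) weight update, the function names, the toy patterns and the small epoch count are all assumptions made for illustration, and the sketch selects on training SSE only, whereas the study also considers the testing SSE.

```python
import math
import random

def logistic(u):
    # Sign-based branching avoids overflow in exp() for large |u|.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    z = math.exp(u)
    return z / (1.0 + z)

def init_network(sizes, rng):
    """One (weights, biases) pair per layer, drawn uniformly from [-1, 1]."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Return the activations of every layer, the input included."""
    activations = [x]
    for weights, biases in layers:
        x = [logistic(sum(w * a for w, a in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        activations.append(x)
    return activations

def train_step(layers, x, target, rate):
    """One online back-propagation update for a single pattern."""
    acts = forward(layers, x)
    # Output delta for SSE with logistic units: (d - y) * y * (1 - y).
    delta = [(d - y) * y * (1.0 - y) for d, y in zip(target, acts[-1])]
    for i in reversed(range(len(layers))):
        weights, biases = layers[i]
        prev = acts[i]
        # Propagate the error backwards before the weights are changed.
        back = [sum(weights[k][j] * delta[k] for k in range(len(delta)))
                for j in range(len(prev))]
        for k, row in enumerate(weights):
            for j in range(len(row)):
                row[j] += rate * delta[k] * prev[j]
            biases[k] += rate * delta[k]
        # Delta for the layer below (unused after the first layer is updated).
        delta = [b * a * (1.0 - a) for b, a in zip(back, prev)]

def sse(layers, patterns):
    return sum((d - y) ** 2
               for x, target in patterns
               for d, y in zip(target, forward(layers, x)[-1]))

def best_of_restarts(sizes, patterns, rates, epochs, seed=0):
    """Train once per learning rate from fresh random weights; keep the lowest SSE."""
    rng = random.Random(seed)
    best = None
    for rate in rates:
        layers = init_network(sizes, rng)
        for _ in range(epochs):
            for x, target in patterns:
                train_step(layers, x, target, rate)
        error = sse(layers, patterns)
        if best is None or error < best[0]:
            best = (error, rate, layers)
    return best

# Hypothetical toy data: 5 normalized inputs, 8 teaching outputs per pattern.
patterns = [
    ([0.1, 0.2, 0.3, 0.4, 0.5], [0.9] + [0.0] * 7),
    ([0.9, 0.8, 0.7, 0.6, 0.5], [0.0, 0.9] + [0.0] * 6),
]
error, rate, layers = best_of_restarts(
    [5, 8, 8, 8], patterns, rates=[0.2, 0.5, 0.9], epochs=200)
```

The size list `[5, 8, 8, 8]` mirrors the 5-8-8-8 configuration: five inputs, two hidden layers of eight nodes, and eight output nodes. Repeating the call with other size lists would correspond to the architecture search.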
2.4 Results
The brightness temperatures at the five AMSU-B frequencies are
selected as inputs for the discrimination and classification of
the eight classes of weather features using the ANN. The test
performed with the ANN classifier resulted in 162 of 200 samples
being correctly classified. The classification accuracy and the
classification of the test patterns as a function of class are
presented in detail in Table 1 and Fig 2.
A significant portion of the improper classifications occurs
between class 3 (Rnf) and class 4 (Snf). For example, all 4
improperly classified patterns in class 4 are assigned to
class 3. The improper classification may be due to the large
similarity between the rainfall and snowfall classes in the
microwave region. Most of the improperly classified samples in
class 2 are also misclassified as class 3. It can be seen from
Fig 2 that more improper classifications appear for classes 2
to 5. Not classified patterns are also presented in Fig 2, as
class nine.
Table 1: The classes (col. 1), the corresponding test patterns of
each class per AMSU-B channel (col. 2), and the number of correct,
improper and not classified patterns (col. 3-5).
Classes   TTP (pattern numbers)   CC          NPC         NC
1. Trs    25 (1-26)               25          0           0
2. Hvr    25 (26-51)              14          6           5
3. Rnf    25 (51-76)              19          3           3
4. Snf    25 (76-101)             14          4           7
5. CII    25 (101-126)            16          3           6
6. Crl    25 (126-151)            24          1           0
7. Cls    25 (151-176)            25          0           0
8. Crs    25 (176-201)            25          0           0
Total     200                     162 (81%)   17 (8.5%)   21 (10.5%)
TTP = total test patterns, CC = correctly classified, NPC = not
properly classified, NC = not classified. The pattern numbers are
indicated in brackets (see Fig 2).
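The totals and percentages in Table 1 can be reproduced from the per-class counts. The sketch below copies the counts from the table; the names `TABLE_1`, `summarize` and `per_class` are hypothetical and used only for this check.

```python
# (class, correct, improperly classified, not classified), copied from Table 1.
TABLE_1 = [
    ("Trs", 25, 0, 0),
    ("Hvr", 14, 6, 5),
    ("Rnf", 19, 3, 3),
    ("Snf", 14, 4, 7),
    ("CII", 16, 3, 6),
    ("Crl", 24, 1, 0),
    ("Cls", 25, 0, 0),
    ("Crs", 25, 0, 0),
]

def summarize(rows):
    """Column totals and their share of all test patterns, in percent."""
    cc = sum(r[1] for r in rows)
    npc = sum(r[2] for r in rows)
    nc = sum(r[3] for r in rows)
    total = cc + npc + nc
    shares = tuple(100.0 * n / total for n in (cc, npc, nc))
    return total, (cc, npc, nc), shares

total, counts, shares = summarize(TABLE_1)

# Per-class accuracy: correct patterns over all test patterns of that class.
per_class = {name: cc / (cc + npc + nc) for name, cc, npc, nc in TABLE_1}
```

The column sums give 162, 17 and 21 out of 200 patterns, that is 81%, 8.5% and 10.5%, in agreement with the last row of the table. The per-class values show that classes 2 to 5 account for all but one of the improper classifications.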
Otherwise, our ANN results could only be compared with and
verified against other sources (IMO and meteorological
observations); no other model is available for comparison. The
results in Table 1 and Fig 2, however, indicate that the data
from the five AMSU-B frequencies alone are not sufficient to
fully justify the analysis for all eight reported IMO classes.
There are probably other input parameters, such as ground
temperature, that are essential for this purpose.