[The first part of this passage is garbled in the source; from the surviving fragments, it describes how the clusters obtained from the trained SOM are used to generate the Kohonen Class Map (KCM) that supports the feature extraction tasks of this phase.] The
KCM has some important characteristics that enable these
tasks:
• As the SOM performs a clustering of the training patterns, it is possible to visualize the spectral classes present in the original image through the clusters obtained in the KCM;
• The SOM property of preserving the topological relationships among the input data vectors is reflected in the KCM property of preserving these relationships among the clusters so obtained, in terms of the distances among them. Clusters that are close to each other in the KCM represent land cover classes with similar spectral features;
• The SOM property of preserving the probability distribution found in the input data can be verified in the KCM, where higher-frequency spectral classes in the input data are mapped onto larger regions in the KCM.
Therefore, the spectral classes and their samples, which will be used in the classification phase, are selected from the KCM and not directly from the original image, as is usually done.
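As an illustration of how such a map can be used, the sketch below labels each neuron of a trained SOM with the number of training patterns it wins. Plotted over the grid, frequent spectral classes show up as large regions, which is the density-preservation property listed above. The grid size, band count, and function names are our assumptions, not details from the original system.

    #include <float.h>

    enum { ROWS = 10, COLS = 10, DIM = 4 };  /* grid and band counts: illustrative */

    /* Index of the neuron whose weight vector is closest (Euclidean) to x. */
    static int winner(const float w[ROWS * COLS][DIM], const float x[DIM])
    {
        int best = 0;
        float dmin = FLT_MAX;
        for (int n = 0; n < ROWS * COLS; n++) {
            float d = 0.0f;
            for (int i = 0; i < DIM; i++) {
                float diff = w[n][i] - x[i];
                d += diff * diff;
            }
            if (d < dmin) { dmin = d; best = n; }
        }
        return best;
    }

    /* Count how many training patterns each neuron of the trained SOM wins. */
    void hit_map(const float w[ROWS * COLS][DIM],
                 const float (*x)[DIM], int npat, int hits[ROWS * COLS])
    {
        for (int n = 0; n < ROWS * COLS; n++) hits[n] = 0;
        for (int p = 0; p < npat; p++) hits[winner(w, x[p])]++;
    }

Class samples can then be drawn from the neurons inside a cluster of interest rather than directly from the image pixels.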
2.3 Parallel Implementation of SOM
The inherent parallelism of ANNs is well known. Efficient
parallel implementation of neural networks both in hardware
and in software is an active research field.
In this feature extraction module, the parallel implementation of the SOM was realized in software, aiming at improving performance in terms of the SOM training time, using a tool developed by the Oak Ridge National Laboratory: the Parallel Virtual Machine (PVM).
PVM is a software system that enables a collection of
heterogeneous computers to be used as a coherent and flexible
concurrent computational resource. The individual computers
may be shared- or local-memory multiprocessors, vector
supercomputers, specialized graphics engines, or scalar
workstations, that may be interconnected by a variety of
networks, such as Ethernet, FDDI. s
User programs written in C, C++ or Fortran access PVM
through library routines. Daemon programs provide
communication and process control between computers.
For the SOM, the basic idea for parallelizing the training algorithm was to allocate sets of neurons of the SOM to the processors, distributing the training patterns among them, so as to reduce the overall computation time.
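As a rough sketch of this scheme, the PVM worker below holds a slice of the SOM neurons, receives each training pattern, and returns its local winner to the master, which reduces the partial results to the global winner and broadcasts it back for the neighbourhood update. The message tags, slice size, pattern count, and overall protocol are our assumptions; only the PVM calls themselves are the library's actual C interface.

    #include <float.h>
    #include "pvm3.h"

    #define DIM   4       /* spectral bands (illustrative)            */
    #define LOCAL 25      /* neurons held by this worker (assumption) */
    #define NPAT  1000    /* training patterns (assumption)           */
    #define TAG_PATTERN 1
    #define TAG_WINNER  2

    int main(void)
    {
        float w[LOCAL][DIM];          /* local slice of the weight matrix */
        float x[DIM];
        int master = pvm_parent();    /* task id of the spawning master   */

        /* ... receive the initial local weights from the master ... */

        for (int p = 0; p < NPAT; p++) {
            pvm_recv(master, TAG_PATTERN);     /* next training pattern */
            pvm_upkfloat(x, DIM, 1);

            int best = 0;                      /* local winner search */
            float dmin = FLT_MAX;
            for (int n = 0; n < LOCAL; n++) {
                float d = 0.0f;
                for (int i = 0; i < DIM; i++) {
                    float diff = w[n][i] - x[i];
                    d += diff * diff;
                }
                if (d < dmin) { dmin = d; best = n; }
            }

            pvm_initsend(PvmDataDefault);      /* report (index, distance) */
            pvm_pkint(&best, 1, 1);
            pvm_pkfloat(&dmin, 1, 1);
            pvm_send(master, TAG_WINNER);

            /* ... the master broadcasts the global winner; the neurons of
               this slice inside its neighbourhood are then updated ... */
        }
        pvm_exit();
        return 0;
    }

Distributing the winner search this way divides the dominant cost of training, the distance computations, among the processors, at the price of one small message exchange per pattern.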
A comparison between the performance of the sequential and
the parallel training algorithm of SOM is shown in section 4.
3. MLP FOR CLASSIFICATION
Having selected the desired classes and their corresponding
samples from KCM, the objective of the second phase in our
proposed system is to perform the final classification of the
image using a Multilayer Perceptron (MLP) network.
MLP belongs to the class of feedforward neural networks,
consisting of a number of neurons which are connected by
weighted links. The units are organized in several layers,
namely an input layer, one or more hidden layers, and an
output layer. The input layer receives an external activation vector and passes it via weighted connections to the
units in the first hidden layer. These compute their activations
and pass them to neurons in succeeding layers.
The training of the MLP network is performed in a supervised
way, where the objective is to tune the weights in the network
such that the network performs a desired mapping of input to
output activations.
The MLP network in our system has one hidden layer (fig. 3). The number of neurons per layer varies according to the number of classes and the size of the samples selected for training.
Figure 3: MLP network, with an input layer, one hidden layer, and an output layer.
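To make this mapping concrete, the following is a minimal forward pass for the one-hidden-layer network of figure 3. The logistic activation and the layer sizes are typical choices, not values taken from the system described here.

    #include <math.h>

    enum { NIN = 4, NHID = 8, NOUT = 5 };   /* layer sizes: illustrative only */

    static float logistic(float a) { return 1.0f / (1.0f + expf(-a)); }

    /* Forward pass of a one-hidden-layer MLP: the external activation
       vector is propagated through weighted links, with a sigmoidal
       nonlinearity at the hidden and output layers. */
    void forward(const float w1[NHID][NIN + 1],   /* hidden weights (+bias) */
                 const float w2[NOUT][NHID + 1],  /* output weights (+bias) */
                 const float x[NIN], float y[NOUT])
    {
        float h[NHID];
        for (int j = 0; j < NHID; j++) {
            float a = w1[j][NIN];                 /* bias term */
            for (int i = 0; i < NIN; i++) a += w1[j][i] * x[i];
            h[j] = logistic(a);
        }
        for (int k = 0; k < NOUT; k++) {
            float a = w2[k][NHID];
            for (int j = 0; j < NHID; j++) a += w2[k][j] * h[j];
            y[k] = logistic(a);
        }
    }

In classification, one output neuron per land cover class is the usual arrangement, and each pixel is assigned to the class whose output unit responds most strongly.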
3.1 Training algorithm for MLP
Several adaptive learning algorithms for MLP neural networks have recently been proposed. Many of these algorithms are based on the gradient descent method, well known in optimization theory. They usually have a poor convergence rate and depend on parameters which have to be specified by the user, as no theoretical basis for choosing them exists. The values of these parameters are often crucial for the success of the algorithm. An example is the standard backpropagation algorithm (Rumelhart et al., 1986), which often behaves badly on large-scale problems and whose success depends on user-defined parameters such as the learning rate and the momentum constant (Møller, 1993). This is often the case in RS applications, which normally handle large, highly detailed images.
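For reference, the update rule behind these two parameters can be written as follows; the values in the comments are common defaults, not recommendations from this work.

    /* Standard backpropagation update for a single weight: the step mixes
       the current error gradient with the previous step (momentum).  Both
       eta and alpha must be tuned by hand, which is the drawback discussed
       above. */
    void update_weight(float *w, float grad, float *prev_dw,
                       float eta,    /* learning rate, e.g. 0.1 */
                       float alpha)  /* momentum,      e.g. 0.9 */
    {
        float dw = -eta * grad + alpha * *prev_dw;
        *w += dw;
        *prev_dw = dw;
    }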
In the search for alternatives to the poor performance of standard backpropagation, usually pointed out as the main obstacle to a broader use of MLPs in RS image classification, this work adopted an advanced training algorithm for the MLP network.
The MLP learning algorithm used here is an improved version of the Scaled Conjugate Gradient (SCG) algorithm presented in (Møller, 1993).
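The key idea of SCG is to replace the hand-tuned learning rate with a step size computed from the local curvature, which is estimated by a finite difference of two gradient evaluations and regularized by a scaling term lambda. The sketch below shows only this step-size computation; the full algorithm of Møller (1993) also adapts lambda and restarts the conjugate directions, and the names and dimensions here are placeholders.

    #include <math.h>

    #define N 16   /* number of network weights: illustrative */

    typedef void (*grad_fn)(const float w[N], float g[N]);

    /* Step size of one SCG iteration along the search direction p.
       s ~ Hp is approximated by a finite difference of two gradients,
       so neither an explicit Hessian nor a line search is required;
       lambda > 0 keeps the estimated curvature positive. */
    float scg_step_size(grad_fn grad, const float w[N], const float g[N],
                        const float p[N], float lambda)
    {
        float wp[N], gp[N];
        float pnorm2 = 0.0f, mu = 0.0f, delta = 0.0f;

        for (int i = 0; i < N; i++) pnorm2 += p[i] * p[i];

        float sigma = 1e-4f / sqrtf(pnorm2);    /* small probe step        */
        for (int i = 0; i < N; i++) wp[i] = w[i] + sigma * p[i];
        grad(wp, gp);                           /* gradient at w + sigma*p */

        for (int i = 0; i < N; i++) {
            float s = (gp[i] - g[i]) / sigma;   /* i-th entry of Hp   */
            delta += p[i] * s;                  /* p^T Hp             */
            mu    -= p[i] * g[i];               /* p^T r, with r = -g */
        }
        delta += lambda * pnorm2;               /* curvature scaling  */
        return mu / delta;                      /* step along p       */
    }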