GENERALIZATION TECHNIQUES FOR LAYERED NEURAL NETWORKS
IN THE CLASSIFICATION OF REMOTELY SENSED IMAGES
Eihan SHIMIZU and Morito TSUTSUMI
Department of Civil Engineering
University of Tokyo
Japan
shimizu@planner.t.u-tokyo.ac.jp, tsutsumi@planner.t.u-tokyo.ac.jp
Le Van TRUNG
Department of Civil Engineering
The National University of Hochiminh City
Vietnam
trung.lv@ut-hcmc.cinetvn.com
KEY WORDS: Layered neural network, generalization, Akaike’s Information Criterion, Tikhonov’s regularization.
ABSTRACT
In recent years, researchers have paid considerable attention to Layered Neural Networks (LNNs) as a non-parametric approach for the classification of remotely sensed images. This paper focuses on the generalization capability of LNNs, that is, how well an LNN performs with unknown data. First, we clarify its description from the point of view of information statistics. With this discussion, we provide a feasible technique to design the LNN in consideration of its generalization capability. Finally, we apply the proposed technique to a practical land cover classification using remotely sensed images, and demonstrate its potential.
1 INTRODUCTION
Among supervised classification methods for remotely sensed
data, Maximum Likelihood Classification (MLC) is presently the
most widely known and utilized (e.g. Curran and Hay (1986),
Yool et al. (1986)). MLC is often used as a standard classification
routine against which other classification algorithms are compared. This popularity is due to the fact that MLC is the optimal
classifier in the sense of minimizing Bayesian error. However,
MLC is a parametric classification method where the underlying
probability density function must be assumed a priori. We may
obtain a poor MLC performance if the true probability density
function is different from what is assumed in the model. In recent
years, researchers have attempted to provide non-parametric
classification methods to overcome this disadvantage of MLC,
and Layered Neural Networks (LNNs) have been proposed as
suitable for the efficient classification of remotely sensed images.
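The parametric assumption underlying MLC can be illustrated with a minimal sketch, assuming Gaussian class-conditional densities and using pure NumPy; all names and the synthetic two-band data are illustrative, not part of the study:

```python
import numpy as np

def fit_gaussian_mlc(X, y):
    """Estimate per-class mean and covariance (the assumed density)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def classify_mlc(x, params):
    """Assign x to the class maximizing the Gaussian log-likelihood."""
    def log_lik(mu, cov):
        d = x - mu
        return -0.5 * (np.log(np.linalg.det(cov))
                       + d @ np.linalg.inv(cov) @ d)
    return max(params, key=lambda c: log_lik(*params[c]))

# Two well-separated classes in a hypothetical 2-band feature space
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
params = fit_gaussian_mlc(X, y)
print(classify_mlc(np.array([5.0, 5.0]), params))  # class 1
```

If the true class-conditional densities depart from the assumed Gaussian form, the estimated parameters describe the wrong model, which is exactly the weakness the non-parametric LNN approach aims to avoid.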
When using an LNN classifier, however, users have often been
faced with a generalization problem; generalization is concerned
with how well an LNN model performs with input on which the
model has not been previously trained. That is, an LNN classifier
usually performs well on a set of training data, but it may not
guarantee good generalization over all unknown data during the
actual classification process.
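The gap between training performance and generalization can be made concrete with a hold-out sketch. The example below uses a 1-nearest-neighbour rule as an extreme memorizing classifier (not an LNN; chosen only because it scores perfectly on its own training set); the overlapping synthetic classes are illustrative:

```python
import numpy as np

def nn1_predict(x, X_train, y_train):
    """1-nearest-neighbour: a classifier that memorizes its training set."""
    return y_train[np.argmin(np.linalg.norm(X_train - x, axis=1))]

rng = np.random.default_rng(1)
# Two heavily overlapping classes: memorization cannot generalize well
X = np.vstack([rng.normal(0, 2, (100, 2)), rng.normal(1, 2, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
idx = rng.permutation(200)
tr, te = idx[:150], idx[150:]

def acc(Xs, ys):
    return np.mean([nn1_predict(x, X[tr], y[tr]) == t
                    for x, t in zip(Xs, ys)])

print(acc(X[tr], y[tr]))  # 1.0: perfect on the training data
print(acc(X[te], y[te]))  # noticeably lower on the held-out data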
This paper discusses LNN classifier generalization, a controversial and often vague term in the neural network literature (Wan (1990)), and clarifies its description. We introduce some techniques for generalization based on Akaike's Information Criterion and provide a feasible technique to design an LNN in consideration of its generalization capability. Finally, we apply the proposed technique to a practical land cover classification using remotely sensed images, and demonstrate its potential.
2 BASIC FORMULATION OF THE LAYERED NEURAL
NETWORK CLASSIFIER
It has been proved that a three-layered neural network, when an appropriate number of nodes is set in the hidden layer and the sigmoidal activation function is used in the hidden and output nodes, can approximate any continuous mapping (Gallant and White (1988) and Funahashi (1989)). Therefore, in this study, we focus only on three-layered neural networks, as shown in Figure 1.
Figure 1 Architecture of the three-layered neural network: input layer, hidden layer, output layer.
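The forward mapping of such a three-layered network can be sketched as follows; the layer sizes (e.g. four spectral bands in, three land-cover classes out) and the random weights are illustrative assumptions, not values from the study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Three-layered network: input -> sigmoidal hidden -> sigmoidal output."""
    h = sigmoid(W1 @ x + b1)      # hidden layer activations
    return sigmoid(W2 @ h + b2)   # output layer (one node per class)

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 6, 3   # e.g. 4 spectral bands, 3 classes
W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)

out = forward(rng.normal(size=n_in), n_hidden * 0 + W1, b1, W2, b2)
print(out.shape)  # (3,)
```

Each output node's sigmoidal value lies in (0, 1), so a pixel is typically assigned to the class whose output node responds most strongly; the number of hidden nodes is the free design choice that the generalization techniques in this paper address.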