OPTIMIZATION OF UNSUPERVISED LEARNING FOR HIGH RESOLUTION
REMOTE SENSING DATA
SHIMODA,HARUHISA; MATSUMAE,YOSHIAKI; NATENOPPARATH,AMNARJ;
FUKUE,KIYONARI; SAKATA,TOSHIBUMI
Tokai University Research & Information Center, Japan
ABSTRACT
Supervised learning methods have been widely used for land cover classification of
satellite imagery. However, these methods have two major problems. First, training areas
are selected arbitrarily by operators, although statistical theory requires random
sampling. Second, for high resolution images such as Landsat TM data, it is very
difficult to extract a sufficient number of training classes that are composed only of
spectrally pure objects.
Unsupervised learning with the aid of clustering can solve the above problems. However,
we do not have sufficient information to use clustering efficiently for high resolution
data, i.e. (1) how many samples are necessary and (2) how many clusters should be
generated. To answer these questions, classification experiments on Landsat TM data
were conducted.
From the experiments, the following results were obtained. (1) There exists an optimal
number of clusters that depends on the sample size. (2) The method used to assign
categories to each cluster dominates the classification accuracy.
KEY WORDS: Clustering, Unsupervised learning, High resolution data
1 INTRODUCTION
With the launch of second generation high
resolution sensors such as the Thematic
Mapper (TM) and HRV, many studies have
been carried out to evaluate the
capability of these sensors for land
use classification. Most of these
studies have shown that the
classification accuracies obtained with
these sensors are not as high as expected
when a conventional supervised maximum
likelihood classifier is applied using
only spectral information. These results
have led many researchers to study
spatial features such as texture, or more
sophisticated classifiers such as expert
systems, the Dempster-Shafer rule, or
fuzzy classifiers.
In addition, those results have also
shown the limitations of supervised
learning systems. As is well known,
training areas selected by an operator
cannot be guaranteed to be random
samples, on which all the statistical
methods are based. This problem was not
emphasized when dealing with low
resolution images such as MSS, mainly
because the image itself is composed of
rather homogeneous areas produced by the
averaging process.
In the case of high resolution sensors,
this problem becomes prominent. In many
land use classification studies,
classification accuracies estimated from
the confusion matrix of the training
data have been very high (usually
90 to 98%), while accuracies estimated
from independent samples have been very
low (typically 60 to 70%). These results
clearly show that the training data did
not actually represent the statistics of
their populations.
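
The gap described above can be illustrated with a small sketch that computes overall accuracy (correctly classified samples divided by all samples) from a confusion matrix. The two matrices are made-up illustrative values, not results from this study, and the function name `overall_accuracy` is hypothetical.

```python
import numpy as np

def overall_accuracy(confusion):
    """Overall accuracy: sum of the diagonal divided by the total count."""
    confusion = np.asarray(confusion, dtype=float)
    return np.trace(confusion) / confusion.sum()

# Hypothetical 3-class confusion matrices (rows: true class, columns: assigned class).
# These numbers are invented for illustration only.
train_cm = np.array([[95, 3, 2],
                     [2, 96, 2],
                     [1, 4, 95]])
test_cm = np.array([[70, 20, 10],
                    [15, 65, 20],
                    [10, 25, 65]])

train_acc = overall_accuracy(train_cm)  # accuracy measured on the training areas
test_acc = overall_accuracy(test_cm)    # accuracy measured on independent samples
print(f"training: {train_acc:.3f}, independent: {test_acc:.3f}")
```

A large difference between the two values is the symptom discussed in the text: the training areas do not represent the population from which the independent samples are drawn.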
Furthermore, in order to obtain high
classification accuracies, an operator
must select more than 50 classes for
high resolution sensor data. From the
standpoint of operational image
processing, this large increase in the
number of training classes makes the
process practically impossible.
For the above reasons, unsupervised
learning, or more precisely clustering,
becomes a very important tool for land
use classification of high resolution
data. However, most studies on
clustering have been concerned with low
resolution data such as MSS, and the
optimal conditions, or even the minimum
conditions, for applying clustering to
high resolution sensor data are not well
known. The purpose of this study is to
obtain fundamental knowledge about the
nature of clustering for high resolution
data, and to clarify the influence of
sample size and number of clusters on
the classification accuracy.
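
The two parameters under study, sample size and number of clusters, can be made concrete with a minimal sketch. It assumes plain k-means as the clustering algorithm, which this section does not specify, and it runs on a small synthetic image rather than the TM scene. The names `kmeans` and `cluster_image`, the 6-band shape, and all parameter values are hypothetical.

```python
import numpy as np

def kmeans(samples, n_clusters, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on an (n_samples, n_bands) array."""
    rng = np.random.default_rng(seed)
    # Initialise the centers from randomly chosen sample points.
    start = rng.choice(len(samples), size=n_clusters, replace=False)
    centers = samples[start].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(n_clusters):
            members = samples[labels == k]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[k] = members.mean(axis=0)
    return centers

def cluster_image(image, sample_size, n_clusters, seed=0):
    """Cluster a random pixel sample, then assign every pixel to its nearest center."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    rng = np.random.default_rng(seed)
    # Random sampling of pixels, as statistical theory requires.
    idx = rng.choice(len(pixels), size=sample_size, replace=False)
    centers = kmeans(pixels[idx], n_clusters, seed=seed)
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1).reshape(rows, cols), centers

# Synthetic stand-in for a multispectral image (values and shape are arbitrary).
rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(48, 64, 6))
cluster_map, centers = cluster_image(image, sample_size=500, n_clusters=8)
print(cluster_map.shape, centers.shape)
```

Each cluster would still have to be assigned a land use category before accuracy can be measured. That step is left out here because its method is not described in this section.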
2 DATA USED IN THE EXPERIMENTS
2.1 Image Data
The image data used in the experiments
are shown in Fig. 1, and their
specifications are listed below.
sensor : Landsat TM
date : 1984/Nov./4
path-row : 107-35
area : Hiratuka area, Japan
pixel size : 25m
image size : 512 x 480