FULLY AUTOMATED, HIGH-RESOLUTION CLASSIFICATION OF REMOTELY SENSED DIGITAL
MULTISPECTRAL RECORDINGS
by B.-S. Schulz
Institut für Angewandte Geodäsie, Frankfurt a.M., Germany, Com. III
ABSTRACT
The working programme set by the chairmen of WG III/3 has been pursued rigorously with regard to
topic 1 (Analysis of Multispectral Digital Recordings). The results obtained have been condensed
into a method of automatic classification. For example, multispectral homogeneity and a normal
distribution of grey values within so-called training areas ensure that the theoretical preconditions
of the Maximum Likelihood Method (MLM), which was used to discriminate them, are met. From this, a
method for searching for training areas automatically has been developed. The reliability of
classification results depends decisively on two conditions: statistically equal training areas must
be grouped before classification, and training areas that are neither statistically equal nor
significantly different must be grouped only after classification. The statistical test parameters
needed for this purpose are presented and their effects described, with confusion matrices and
classification results serving as examples.
Keywords: Classification, Feature Extraction, Landsat
1. INTRODUCTION
Expectations placed by different users on
remote sensing focus mainly on aspects such as
accuracy of content and geometry as well as on
a finely structured variety of land use
classifications.
The confusion matrix still serves as a source
of information on the reliability of assignment
of re-classified training areas and also,
inadmissibly, as a criterion of discrimination
for larger classified areas. The
achievable accuracies documented in this way
inspire little confidence in the
reliability of results. A hit rate of 60-70 %
must be considered a poor result unworthy of
discussion and should be sufficient reason for
investigating the causes. Item 1 of the working
programme of WG III/3 served that purpose.
1. Analysis of multispectral digital
recordings
1.1 Analysis of data with regard to systematic
and random errors and their effect on
classification
1.2 Possibilities of data preprocessing and
compression without loss of information
1.3 Statistical requirements on training areas
and their statistical analysis
1.4 Statistical analysis of clusters with
regard to separability of objects,
admissibility of their integration before
classification, and necessity for their
integration after classification.
1.5 Analysis of mainly used or self-developed
classifiers with regard to their
separation capability
1.6 Analysis and valuation of classification
algorithms and procedures as well as
of spectral resolution of different remote
sensing systems
1.7 Possibilities and limits of unsupervised
classification
1. FIRST RESULTS OBTAINED BY A NEW PROCEDURE
The causes of poor discrimination in
classification as well as first results of new
methodological developments have been published
(Schulz, 1990). Since objects and land uses can
mainly be acquired only indirectly via their
spectra, spectral homogeneity and normal
distribution are important factors. This means,
among other things, that a pre-set variance must
not be exceeded and that there must not be more
than one frequency maximum.
This comparatively simple requirement is in most
cases very difficult to meet if training areas
are pre-defined externally, particularly where
they are obviously inhomogeneous in land use and
hence also spectrally inhomogeneous, e.g. in
sparsely built-up areas, mixed forest, etc.
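The two criteria named above (a variance limit and a single frequency maximum) can be illustrated for the grey values of one spectral band. The following is a minimal sketch under stated assumptions, not the procedure published in (Schulz, 1990): the function names `count_modes` and `meets_criteria`, the histogram binning, and the rule for counting maxima are all hypothetical choices made for illustration.

```python
import numpy as np


def count_modes(counts):
    """Count strict local maxima in a 1-D array of histogram counts."""
    c = np.asarray(counts)
    # Collapse runs of equal counts so that a plateau is examined only once.
    keep = np.concatenate(([True], c[1:] != c[:-1]))
    c = c[keep]
    # Pad with zeros so that maxima at either end of the histogram are found.
    padded = np.concatenate(([0], c, [0]))
    mid = padded[1:-1]
    return int(np.sum((mid > padded[:-2]) & (mid > padded[2:])))


def meets_criteria(values, max_variance, bins=16, value_range=(0, 256)):
    """Check one band: variance within the limit and exactly one maximum.

    `max_variance`, `bins` and `value_range` are assumed tuning parameters.
    """
    values = np.asarray(values, dtype=float).ravel()
    if values.var() > max_variance:
        return False
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    return count_modes(counts) == 1


# Hand-made grey values: one compact sample and one two-peaked sample.
homogeneous = np.array([100, 101, 101, 102, 102, 102, 103, 103, 104])
mixed = np.array([20] * 5 + [200] * 5)
print(meets_criteria(homogeneous, max_variance=25.0))  # True
print(meets_criteria(mixed, max_variance=25.0))        # False
```

With real imagery, noisy histograms produce spurious maxima, so a coarser binning or some smoothing would probably be needed before counting; the sketch leaves that out.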
2. AUTOMATIC SEARCH AND GROUPING OF TRAINING
AREAS
The problem of training area definitions that
are inadmissible in the sense outlined above may
be solved by scanning the data without any
a priori definition of training areas, that is,
automatically and without an operator or
interpreter, for those image sections which
fulfil the above-mentioned distribution criteria
in all n spectral bands within a pre-set
sliding working matrix. Whenever the search for
a training area is successful, the parameters
representing it, i.e. the mean value vector and
the covariance matrix, are stored.
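The search described above can be sketched as a window moved over a multiband image, keeping the mean value vector and covariance matrix of every window accepted in all bands. This is only an illustrative sketch: the names `TrainingArea` and `find_training_areas`, the array layout (bands x rows x columns), the step size, and the variance-only acceptance test in the usage example are assumptions, not details given in the text.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class TrainingArea:
    row: int           # top-left row of the accepted window
    col: int           # top-left column of the accepted window
    mean: np.ndarray   # mean value vector, shape (n_bands,)
    cov: np.ndarray    # covariance matrix, shape (n_bands, n_bands)


def find_training_areas(image, window, step, band_is_acceptable):
    """Scan `image` (bands x rows x cols) with a square window.

    `band_is_acceptable` is a caller-supplied test applied to the grey
    values of each band; a window is kept only if every band passes.
    """
    n_bands, n_rows, n_cols = image.shape
    areas = []
    for r in range(0, n_rows - window + 1, step):
        for c in range(0, n_cols - window + 1, step):
            # One row per band, one column per pixel inside the window.
            pixels = image[:, r:r + window, c:c + window].reshape(n_bands, -1)
            if all(band_is_acceptable(band) for band in pixels):
                cov = np.atleast_2d(np.cov(pixels))
                areas.append(TrainingArea(r, c, pixels.mean(axis=1), cov))
    return areas


# Synthetic 2-band image: uniform left half, checkerboard right half.
image = np.empty((2, 8, 8))
image[0], image[1] = 50.0, 80.0
checker = np.indices((8, 8)).sum(axis=0) % 2
image[:, :, 4:] += 100.0 * checker[:, 4:]

# Placeholder criterion: variance limit only, with an assumed threshold.
areas = find_training_areas(image, window=4, step=4,
                            band_is_acceptable=lambda v: v.var() <= 25.0)
print([(a.row, a.col) for a in areas])  # [(0, 0), (4, 0)]
```

Passing the acceptance test in as a function keeps the scan independent of how homogeneity and normality are actually checked, so a stricter test can be substituted without changing the loop.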
The training areas found in this way do not
necessarily represent significantly different
spectral properties of land uses. For this reason
it is important to compare every training area
subsequently with every other one and to check
whether
- it is statistically equal to another one and
can thus be combined with it into one
training area already before classification,
- it is significantly different from the other
one and must hence enter into
classification as representing a completely
new type of land use, or
- it is, in the sense defined above,
neither equal to any other area nor
significantly different from it and must
therefore first be formally introduced into