Kaichang Di
and spatial relation. For example, overlay analysis is needed to determine which height zone an object falls into. This is a
feature selection step, which is characteristic of spatial data mining.
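The overlay step mentioned above can be sketched as follows. This is only an illustrative sketch: the zone boundaries, the zone names, the grid layout and the function names are assumptions made for the example and do not come from the experiment.

```python
from bisect import bisect_right

# Hypothetical elevation-zone boundaries in metres and zone names
# (assumptions for illustration, not values from the paper).
ZONE_BREAKS = [50.0, 200.0, 500.0]
ZONE_NAMES = ["plain", "low hill", "hill", "mountain"]


def height_zone(elevation):
    """Return the name of the zone whose elevation range contains `elevation`."""
    return ZONE_NAMES[bisect_right(ZONE_BREAKS, elevation)]


def overlay_height_zone(objects, dem, cell_size):
    """Attach a 'zone' attribute to each point object by overlaying it on a DEM.

    objects:   list of dicts with 'x' and 'y' in map units. The origin is
               assumed to be the upper-left corner of the DEM, with y
               increasing downwards.
    dem:       elevation grid as a list of rows, dem[row][col].
    cell_size: size of one grid cell in map units.
    """
    result = []
    for obj in objects:
        row = int(obj["y"] // cell_size)
        col = int(obj["x"] // cell_size)
        result.append({**obj, "zone": height_zone(dem[row][col])})
    return result


# Example with a hypothetical 2 x 2 elevation grid and 100 m cells.
dem = [[10.0, 120.0], [300.0, 800.0]]
objects = [{"id": 1, "x": 50.0, "y": 50.0}, {"id": 2, "x": 150.0, "y": 150.0}]
zoned = overlay_height_zone(objects, dem, cell_size=100.0)
```

The derived zone then becomes one more column in the learning table, alongside the non-spatial attributes.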
Figure 1 is the flow diagram of inductive learning in spatial databases. Generally, learning samples are selected
randomly from the spatial database. When the data volume is not very large, we can choose the whole data set as learning data.
After the learning granularity and attributes are determined, the learning data are organized into a tabular form as the input to
the C5.0 algorithm. C5.0 generates two kinds of output: decision trees and production rules. We choose production rules as
the output because they are easy to understand and use.
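To make the step from a learning table to production rules concrete, the sketch below uses a much simpler ID3-style learner as a stand-in. It is not C5.0, which adds pruning, continuous attributes, rule simplification and more. The attribute names and sample rows are hypothetical.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Shannon entropy of the class column `target` over `rows`."""
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def best_attribute(rows, attrs, target):
    """Pick the attribute whose split leaves the lowest weighted entropy."""
    def remainder(attr):
        groups = {}
        for r in rows:
            groups.setdefault(r[attr], []).append(r)
        return sum(len(g) / len(rows) * entropy(g, target) for g in groups.values())
    return min(attrs, key=remainder)


def learn_rules(rows, attrs, target, conditions=()):
    """Return production rules as (conditions, class) pairs.

    Each `conditions` value is a tuple of (attribute, value) tests that
    must all hold for the rule to fire.
    """
    classes = Counter(r[target] for r in rows)
    majority = classes.most_common(1)[0][0]
    if len(classes) == 1 or not attrs:
        return [(conditions, majority)]
    attr = best_attribute(rows, attrs, target)
    remaining = [a for a in attrs if a != attr]
    rules = []
    for value in sorted({r[attr] for r in rows}):
        subset = [r for r in rows if r[attr] == value]
        rules += learn_rules(subset, remaining, target, conditions + ((attr, value),))
    return rules


# Hypothetical learning table: one row per sample, categorical attributes only.
table = [
    {"height_zone": "low", "near_river": "yes", "land_use": "vegetable field"},
    {"height_zone": "low", "near_river": "no", "land_use": "vegetable field"},
    {"height_zone": "high", "near_river": "yes", "land_use": "forest"},
    {"height_zone": "high", "near_river": "no", "land_use": "forest"},
]
rules = learn_rules(table, ["height_zone", "near_river"], "land_use")
for conditions, label in rules:
    tests = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF {tests} THEN land_use = {label}")
```

Each root-to-leaf path of the implicit tree becomes one IF-THEN rule, which is the readable form preferred in the text.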
[Flow diagram: GIS database -> select learning samples, determine learning granularity and attributes -> learning data
-> generate production rules -> knowledge base]
Figure 1. Flow diagram of inductive learning in spatial database
3 REMOTE SENSING IMAGE CLASSIFICATION BASED ON INDUCTIVE LEARNING
In the field of remote sensing, Bayes classification (or maximum likelihood classification) is the most widely used method. It
achieves the minimum classification error under the assumption that the spectral data of each class are normally distributed.
Generally, there is much spectral confusion between classes. That is, the same class may show different spectra, and different
classes may show the same spectrum. The Bayes method by itself cannot solve the problem of spectral confusion. In addition, because
of its requirement on the statistical distribution, auxiliary data cannot be incorporated into Bayes classification.
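For reference, under the normality assumption the decision rule is commonly written with the discriminant function below. The notation is the standard textbook one and is an assumption here, not the paper's: x is the spectral vector of a pixel, mu_i and Sigma_i are the mean vector and covariance matrix of class omega_i estimated from training areas, and P(omega_i) is the prior probability of the class.

```latex
g_i(\mathbf{x}) = \ln P(\omega_i)
  - \tfrac{1}{2} \ln \lvert \Sigma_i \rvert
  - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^{\mathsf{T}} \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i),
\qquad
\mathbf{x} \in \omega_k \ \text{where} \ k = \arg\max_i g_i(\mathbf{x})
```

Because g_i depends only on the spectral vector and the class statistics, attributes such as elevation or location, which need not follow a normal distribution, have no natural place in it.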
For most multi-spectral remote sensing data, the Bayes method classifies coarse classes correctly, such as water,
residential area, green patches, etc. However, land use classification in China usually requires a more detailed
classification. For example, water should be subdivided into river, lake, reservoir and pond; green patches should be subdivided
into vegetable field, garden, forest, etc. These subclasses involve much spectral confusion. To subdivide water, shape
information and spatial association knowledge should be used. To subdivide green patches, their spatial distribution
and the slight differences between them should be used. In the following experiment, two kinds of knowledge are discovered from
land use and elevation data and are applied to subdivide water and green patches respectively.
Pixel granularity is adopted for learning the knowledge used to subdivide green patches. We propose an approach that combines
inductive learning with the Bayes classification method by using the class probabilities from Bayes classification as learning
attributes. First, the image is classified by the Bayes method, and the probabilities of each pixel belonging to every class are retained.
Then inductive learning is conducted with the probability values, location and elevation as the learning attributes. Since
the probability is derived from the spectral information of a pixel and the statistical information of a class, learning with
probability values makes use of both kinds of information simultaneously. Comparative experiments show that using
probability values generates more accurate learning results than using the pixel values directly. This indicates that the
approach of combining inductive learning with the Bayes method is effective.
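The construction of pixel-level learning records from class probabilities can be sketched as below. This is a minimal sketch under stated assumptions: each class is modelled with a diagonal-covariance Gaussian (a simplification of the full-covariance Bayes classifier), and the class names, statistics and function names are hypothetical.

```python
import math


def gaussian_log_likelihood(pixel, mean, var):
    """Log density of `pixel` under a Gaussian with independent bands.

    The diagonal covariance is a simplifying assumption of this sketch.
    """
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (p - m) ** 2 / (2 * v)
        for p, m, v in zip(pixel, mean, var)
    )


def class_posteriors(pixel, classes):
    """Return {class name: posterior probability} for one pixel.

    classes: {name: (prior, per-band means, per-band variances)}.
    """
    logs = {
        name: math.log(prior) + gaussian_log_likelihood(pixel, mean, var)
        for name, (prior, mean, var) in classes.items()
    }
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def learning_record(pixel, x, y, elevation, classes, label=None):
    """Build one row of the learning table: probabilities, location, elevation."""
    record = {f"p_{name}": p for name, p in class_posteriors(pixel, classes).items()}
    record.update({"x": x, "y": y, "elevation": elevation})
    if label is not None:
        record["class"] = label  # known class, taken from the GIS land use data
    return record


# Hypothetical two-band class statistics estimated from training areas.
classes = {
    "garden": (0.5, [60.0, 90.0], [25.0, 25.0]),
    "forest": (0.5, [40.0, 110.0], [25.0, 25.0]),
}
row = learning_record([58.0, 92.0], x=120, y=340, elevation=35.0,
                      classes=classes, label="garden")
```

Rows of this form, one per sampled pixel, make up the tabular input to the inductive learner. The probabilities replace the raw band values as attributes.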
Polygon granularity is adopted to subdivide water. Knowledge about general geometric features and spatial distribution
patterns is discovered from polygons of different water types. To apply the knowledge, the remote sensing image is
classified first, the water areas in the classified image are converted from pixels to polygons by raster-to-vector
conversion, and then the location and shape features of these polygons are calculated. Finally, the polygons are
subdivided into river, lake, reservoir and pond by deductive reasoning based on the knowledge. Here inductive learning and
Bayes classification are combined only loosely.
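The polygon-level step can be illustrated as follows. The features shown (area, perimeter, centroid, compactness) are common shape descriptors chosen for the example, and the rule thresholds are entirely hypothetical. The rules actually discovered in the experiment are not reproduced here.

```python
import math


def polygon_features(vertices):
    """Compute location and shape features of a simple polygon.

    vertices: list of (x, y) pairs in order, first vertex not repeated.
    Assumes a non-degenerate polygon (non-zero area).
    """
    n = len(vertices)
    twice_area = 0.0  # signed, from the shoelace formula
    perimeter = 0.0
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    return {
        "area": area,
        "perimeter": perimeter,
        "centroid": (cx / (3.0 * twice_area), cy / (3.0 * twice_area)),
        # 1.0 for a circle, close to 0 for long, thin shapes
        "compactness": 4.0 * math.pi * area / perimeter ** 2,
    }


def classify_water(features):
    """Apply hypothetical production rules (thresholds are illustrative only)."""
    if features["compactness"] < 0.2:
        return "river"
    if features["area"] < 1000.0:
        return "pond"
    return "lake or reservoir"


square = polygon_features([(0, 0), (10, 0), (10, 10), (0, 10)])
strip = polygon_features([(0, 0), (100, 0), (100, 1), (0, 1)])
labels = [classify_water(square), classify_water(strip)]
```

Separating lakes from reservoirs would need further attributes, such as location or elevation, which this sketch leaves out.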
Figure 2 shows the diagram of remote sensing image classification with inductive learning. GIS data are used in selecting training
areas for Bayes classification, in generating learning data at the two granularities, and in selecting test areas for classification
accuracy evaluation. The ground control points (GCPs) for image rectification are also chosen from GIS data. Therefore, GIS plays
an important role in remote sensing image classification from beginning to end.
240 International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000.