Kaichang Di
[Flow diagram. Recoverable box labels: GIS database; Bayes classification; polygon granularity learning data; pixel granularity learning data; inductive learning in polygon granularity; inductive learning in pixel granularity; knowledge base; deductive reasoning; test area; final classification; evaluation of classification accuracy.]
Figure 2. Flow diagram of remote sensing image classification with inductive learning
The knowledge discovered by the C5.0 algorithm is a group of classification rules and a default class; each rule carries a confidence value between 0 and 1. As shown in Figure 2, the final classification results are obtained by postprocessing the initial classification results with deductive reasoning. The attributes used in deductive reasoning are the same as those used in inductive learning, except for the class label attribute. The following strategies are adopted in deductive reasoning: (1) if only one rule is activated, meaning that the attribute values match the conditions of that rule, the final class is the class of that rule; (2) if several rules are activated, the final class is the class of the rule with the maximum confidence; (3) if several rules are activated and their confidence values are the same, the final class is the class of the rule that covers the most learning samples; (4) if no rule is activated, the final class is the default class.
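The four strategies above can be sketched in a few lines of Python. This is only an illustrative sketch, not the programs used in the experiment: the `Rule` structure, the `classify` function, the attribute names (`elevation`, `band1`, `band2`) and the class labels are all hypothetical, and ties in both confidence and coverage are resolved by rule order, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]  # attribute name -> attribute value (hypothetical layout)


@dataclass
class Rule:
    condition: Callable[[Record], bool]  # True when the attribute values match the rule
    label: str                           # class assigned by the rule
    confidence: float                    # confidence value between 0 and 1
    coverage: int                        # number of learning samples the rule covers


def classify(record: Record, rules: List[Rule], default_class: str) -> str:
    """Apply strategies (1)-(4) to one record."""
    activated = [r for r in rules if r.condition(record)]
    if not activated:
        return default_class  # strategy (4): no rule is activated
    # Strategies (1)-(3): highest confidence wins; equal confidence is
    # broken by coverage. A remaining tie falls to the first rule listed
    # (an assumption, not stated in the text).
    best = max(activated, key=lambda r: (r.confidence, r.coverage))
    return best.label


# Hypothetical rules, for illustration only.
rules = [
    Rule(lambda r: r["elevation"] > 100, "forest", 0.90, 40),
    Rule(lambda r: r["band1"] < 50, "water", 0.90, 75),
    Rule(lambda r: r["band2"] > 200, "urban", 0.95, 10),
]

one_rule = classify({"elevation": 150, "band1": 80, "band2": 10}, rules, "cropland")
max_conf = classify({"elevation": 150, "band1": 30, "band2": 250}, rules, "cropland")
tie_conf = classify({"elevation": 150, "band1": 30, "band2": 10}, rules, "cropland")
no_rule = classify({"elevation": 50, "band1": 80, "band2": 10}, rules, "cropland")
```

Sorting on the pair (confidence, coverage) handles the single-rule case, the maximum-confidence case and the equal-confidence case with one comparison, so only the no-rule case needs a separate branch.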
The way GIS information is used in data-mining-based image classification differs substantially from the conventional way. Conventionally, GIS data are used directly in the pre- or post-processing of image classification. In the data-mining-based scheme, it is the knowledge mined from the data that is used. Knowledge is generally more generalized, condensed and reliable than data, and easier to understand, and a group of rules can represent very complex non-linear knowledge. Therefore, utilizing knowledge is likely to improve remote sensing image classification more than utilizing GIS data directly.
4 LAND USE CLASSIFICATION EXPERIMENT
To verify the feasibility and effectiveness of data-mining-based image classification, a land use classification experiment was performed in the Beijing area using a SPOT multi-spectral image and a 1:100,000 land use database. The original image, acquired in 1996, has three bands and measures 2412 by 2399 pixels. The land use database, built before 1996, has land use, contour, road and annotation layers. The original image was stretched and rectified to the GIS data. After rectification the image is 2834 by 2824 pixels (see Fig. 3); this rectified image was used as the source image for classification. We used ArcView 3.0a, ENVI 3.0 and See5 1.10, which RuleQuest Research developed on the basis of the C5.0 algorithm. We also developed several programs for data processing and format conversion using Microsoft C++ 5.0.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000. 241