LAND USE CLASSIFICATION OF REMOTE SENSING IMAGE WITH GIS DATA
BASED ON SPATIAL DATA MINING TECHNIQUES
Deren LI, Kaichang DI, Deyi LI*
(School of Information Engineering, Wuhan Technical University of Surveying and Mapping,
No. 129 Luoyu Road, Wuhan, P. R. China, 430079)
(*Institute of China Electronic System Engineering, No.6, Wanshou Road, Beijing, P. R. China, 100036)
Email: dli@dns.wtusm.edu.cn, kcdi@public3.bta.nat.cn
Key Words: Data Mining, Knowledge Discovery, Land Use Classification, Inductive Learning, Learning Granularity
ABSTRACT
Data mining techniques are studied to discover knowledge from GIS databases and remote sensing image data in order to
improve land use classification. Two learning granularities are proposed for inductive learning from spatial data:
spatial object granularity and pixel granularity. The characteristics and application scope of the two granularities
are discussed. We also present an approach that combines inductive learning with conventional image classification
methods by using the class probabilities of Bayes classification as learning attributes. A land use classification
experiment is performed in the Beijing area using a SPOT multi-spectral image and GIS data. Rules about spatial
distribution patterns and shape features are discovered by the C5.0 inductive learning algorithm, and the image is then
reclassified by deductive reasoning. Compared with the result produced by Bayes classification alone, the overall
accuracy increased by 11 percent and the accuracy of some classes, such as garden and forest, increased by about 30 percent.
The results indicate that inductive learning can resolve the problem of spectral confusion to a great extent. Combining
the Bayes method with inductive learning not only improves classification accuracy greatly, but also extends the
classification by subdividing some classes with the discovered knowledge.
1 INTRODUCTION
The integration of remote sensing and GIS is a topic of general interest in the fields of photogrammetry, remote sensing
and GIS. It mainly contributes to two kinds of applications: GIS database updating from remote sensing images, and
remote sensing image analysis supported by GIS data. These two aspects complement each other and keep the
GIS databases continually updated.
It has long been acknowledged that GIS data can be used as auxiliary information to improve remote sensing image
classification. In previous studies, GIS data were often used in training area selection and in post-processing of
classification results, or acted as additional bands. Generally, this is accomplished in a statistical or interactive
manner, so it is difficult to use the auxiliary data automatically and intelligently. If the classifier does not require
the data to have certain statistical characteristics, using the auxiliary data as additional bands is a simple and
feasible approach. But if the classifier requires certain statistical characteristics, the additional band method cannot
be used because most auxiliary data do not meet those requirements.
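The additional band method can be illustrated with a minimal sketch. It is not taken from the experiment in this paper: the pixel values, the elevation layer, and the function names stack_bands and nearest_neighbour are assumptions made for illustration, and a nearest-neighbour rule is used only because it places no statistical requirement on the data.

```python
from math import dist


def stack_bands(spectral, auxiliary):
    """Append one auxiliary GIS value to each pixel's spectral vector."""
    if len(spectral) != len(auxiliary):
        raise ValueError("layers must cover the same pixels")
    return [tuple(pixel) + (aux,) for pixel, aux in zip(spectral, auxiliary)]


def nearest_neighbour(train_vectors, train_labels, vector):
    """Label a pixel with the class of its closest training vector (1-NN)."""
    best = min(range(len(train_vectors)),
               key=lambda i: dist(train_vectors[i], vector))
    return train_labels[best]


# Hypothetical training pixels: three band values each, spectrally very similar.
train_spectral = [(52, 40, 110), (50, 41, 108)]
train_labels = ["garden", "forest"]
# Hypothetical auxiliary layer: elevation in metres from a GIS database.
train_elevation = [35.0, 420.0]

unknown_spectral = (51, 40, 109)
unknown_elevation = 400.0

# Spectral bands only: the pixel falls closest to the garden sample.
spectral_only = nearest_neighbour(train_spectral, train_labels, unknown_spectral)

# With elevation stacked as a fourth band, the forest sample is closest.
train_stacked = stack_bands(train_spectral, train_elevation)
unknown_stacked = stack_bands([unknown_spectral], [unknown_elevation])[0]
with_auxiliary = nearest_neighbour(train_stacked, train_labels, unknown_stacked)
```

In this toy setting the elevation values dominate the distance because the bands are not rescaled; a practical use would need to normalize the layers first. A classifier that assumes, for example, normally distributed bands could not treat such a layer in the same way.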
On the other hand, expert system techniques have been incorporated into remote sensing image classification to make use
of domain knowledge and logical reasoning. But building an expert system is very difficult because of the “knowledge
acquisition bottleneck”. Traditionally, the knowledge engineer talks with the domain expert, then represents the
knowledge and enters it into the computer in a formal format. This is usually a long, iterative process in which some
information is inevitably lost. Consequently, it is very difficult to put an expert system into
practical use in remote sensing image classification.
In fact, large amounts of knowledge that can be used in image classification are hidden in GIS databases. Some
knowledge is “shallow” and can be extracted by GIS queries, for example, “Is there any river in an area?” or “What are
the maximum and minimum widths of the roads?”. Other knowledge is “deep”, such as spatial distribution
rules, spatial association rules, and shape discrimination rules; it is not stored explicitly in the database but can be
mined by computation and learning.
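The difference between the two kinds of knowledge can be sketched in a few lines. The records, field names, elevation threshold, and the function distribution_rules below are hypothetical, and the simple frequency count is only a stand-in for a learning step; it is not the C5.0 algorithm used in this paper.

```python
from collections import Counter, defaultdict

# Hypothetical GIS records; names and values are illustrative only.
roads = [
    {"name": "R1", "width_m": 7.5},
    {"name": "R2", "width_m": 24.0},
    {"name": "R3", "width_m": 12.0},
]
parcels = [
    {"land_use": "forest", "elevation_m": 450},
    {"land_use": "forest", "elevation_m": 380},
    {"land_use": "garden", "elevation_m": 60},
    {"land_use": "garden", "elevation_m": 85},
    {"land_use": "forest", "elevation_m": 90},
]

# Shallow knowledge: answered by a direct query on stored attributes.
min_width = min(road["width_m"] for road in roads)
max_width = max(road["width_m"] for road in roads)


def distribution_rules(records, threshold_m=100, min_confidence=0.6):
    """Derive 'elevation zone -> land use' rules by counting classes per zone."""
    counts = defaultdict(Counter)
    for rec in records:
        zone = "high" if rec["elevation_m"] >= threshold_m else "low"
        counts[zone][rec["land_use"]] += 1
    rules = {}
    for zone, counter in counts.items():
        land_use, hits = counter.most_common(1)[0]
        confidence = hits / sum(counter.values())
        if confidence >= min_confidence:
            rules[zone] = (land_use, confidence)
    return rules


# Deep knowledge: not stored in any record, but derived by computation.
rules = distribution_rules(parcels)
```

The road widths are read straight from the stored attributes, whereas a rule such as "high-elevation parcels are mostly forest" exists in no single record and appears only after the records are aggregated.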
Spatial data mining and knowledge discovery (SDMKD) is the extraction of implicit, interesting spatial or non-spatial
patterns and general characteristics. In [Li, et al., 1997], we proposed a theoretical and technical framework of spatial
data mining and knowledge discovery. Spatial data mining is expected to be used in two aspects: one is intelligent
This research was supported by the Ph.D. program foundation of the Ministry of Education of China and a research grant
(WKL(97)0302) from the National Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing.
238 International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000.