Full text: XIXth congress (Part B3,1)

Kaichang Di 
  
analysis of GIS data, the other is to support knowledge-driven interpretation and analysis of remote sensing images.
SDMKD provides a new way of acquiring knowledge for remote sensing image classification, and some researchers have
done valuable work in this field. Eklund et al. extracted knowledge from TM images and geographic data for soil salinity
analysis using the inductive learning algorithm C4.5 [Eklund et al., 1998]. Huang et al. also used C4.5 to extract
knowledge from GIS data and a SPOT multispectral image for wetland classification [Huang et al., 1997]. In both studies,
geographic data were converted from vector to raster format with a sampling size equal to the image pixel size. The
implementation of data mining techniques in spatial databases, especially inductive learning methods, and the
combination or integration of inductive learning with traditional image classification methods still need
further study.
In this paper, data mining techniques are studied to discover knowledge from GIS databases and remote sensing data in
order to improve land use classification of images. The paper is organized as follows. Section 2 describes the
implementation of inductive learning in spatial databases. Section 3 presents the methods of inductive learning in
remote sensing image classification. Section 4 describes a land use classification experiment on a SPOT multispectral
image. Finally, we come to a conclusion.
2 INDUCTIVE LEARNING AND ITS IMPLEMENTATION IN SPATIAL DATABASE
Many methods can be used in spatial data mining [Li, et al., 1997]; among them, inductive learning is one of the most
important. There are many inductive learning algorithms, which mainly come from the field of machine learning,
for example, AQ11 and AQ15 by Michalski, AE1 and AE9 by Jiarong Hong, CLS by Hunt, ID3, C4.5 and C5.0 by
Quinlan, CN2 by Clark, etc. [Hong, 1997]. The ID3 series, including ID3, C4.5 and C5.0, is the most famous and influential.
ID3, which is a decision tree algorithm, adopts a strategy of "divide and conquer". It selects classification
attributes recursively based on information entropy [Quinlan, 1993]. ID3 runs fast in both learning and classification,
which makes it effective for large databases. One shortcoming of ID3 is that a decision tree is not as clear as production
rules; especially when a decision tree is large, it is very difficult to understand what the tree means. The other
shortcoming is that ID3 can only deal with discrete attributes and it is restricted to two-class problems. C4.5, which is an
extension of ID3, can convert a decision tree to equivalent production rules and can deal with multi-class problems with
continuous attributes. These new features make C4.5 practical and the most popular algorithm in the fields of artificial
intelligence and machine learning. C5.0 is a further improved version of C4.5, which runs much faster on very large databases.
Therefore, we study the implementation of inductive learning in spatial databases using the C5.0 algorithm.
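The entropy-based attribute selection described above can be illustrated with a minimal sketch. It shows only the information gain criterion of ID3 for discrete attributes, not C4.5 or C5.0, and it is not the implementation used in this paper. The function names and the toy land use records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one discrete attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """Attribute with the largest information gain (the ID3 split choice)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical toy records; attribute names and values are illustrative only.
rows = [
    {"soil": "clay", "slope": "flat"},
    {"soil": "clay", "slope": "steep"},
    {"soil": "sand", "slope": "flat"},
    {"soil": "sand", "slope": "steep"},
]
labels = ["paddy", "paddy", "dry", "dry"]

print(best_attribute(rows, labels, ["soil", "slope"]))
```

In this toy data, "soil" separates the two classes completely while "slope" does not, so the split is made on "soil". A full decision tree learner would apply the same choice recursively to each subset.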
C5.0, like many other inductive learning algorithms, requires that the training data be composed of tuples, each of
which has several attributes, one of which is the class label. If we treat records as tuples and fields as attributes, these
algorithms are very suitable for learning in relational databases. Spatial data structures are more complex than the tables
in ordinary relational databases. Besides tabular data, there are vector and raster graphic data in a spatial database, and
generally the features of graphic data are not explicitly stored in the database. Therefore, selecting the tuples and
attributes of the training data is more difficult in a spatial database than in an ordinary relational database.
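As a small illustration of treating records as tuples and fields as attributes, the sketch below turns the rows of an attribute table into (attribute values, class label) pairs. The table, its field names and the helper `to_training_tuples` are hypothetical and serve only to show the required shape of the training data.

```python
def to_training_tuples(records, attribute_fields, class_field):
    """Turn table records into (attribute values, class label) tuples."""
    return [
        (tuple(rec[f] for f in attribute_fields), rec[class_field])
        for rec in records
    ]

# Hypothetical attribute table of a polygon layer.
records = [
    {"id": 1, "area_ha": 12.5, "soil": "clay", "landuse": "paddy"},
    {"id": 2, "area_ha": 3.1, "soil": "sand", "landuse": "dry land"},
]

training = to_training_tuples(records, ["area_ha", "soil"], "landuse")
for attributes, label in training:
    print(attributes, "->", label)
```

Geometric features and spatial relations do not appear in such a table; they would first have to be derived from the graphic data and added as further fields.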
We regard the selection of learning tuples as a problem of determining the learning granularity. Two learning
granularities are proposed for inductive learning from spatial data: one is spatial object granularity, the other is pixel
granularity. Spatial objects are the area, line and point objects in a graphic database, or the area and linear features
extracted from remote sensing images. Pixels are simply the pixels of remote sensing images or the cells of raster graphic
data. Learning at spatial object granularity can discover knowledge concerning location, shape, spatial relations, etc. The
discovered knowledge is generalized and can be used in intelligent spatial data analysis and also in remote sensing image
classification. When the discovered rules are applied to image classification, the image must be clustered or
pre-classified into area or linear features before the rules are used. Learning at pixel granularity, on the other hand, can
discover knowledge about spectral values, location, elevation, etc. The discovered rules are more specialized and suitable
for image classification, but not suitable for spatial data analysis and decision support. Both granularities have
their own shortcomings as well. Learning at pixel granularity cannot utilize shape information, and it is difficult to
utilize spatial association information. Learning at spatial object granularity cannot utilize the detailed information
within an object; for example, learning at polygon granularity cannot utilize the accurate elevation and slope values
within a polygon, and can only use an average or sample value. The granularity should be selected according to the
application, or the two may be used together.
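The difference between the two granularities can be sketched on a toy raster. The example below builds one training tuple per cell (pixel granularity) and one per polygon (spatial object granularity), where the polygon tuple keeps only a pixel count and a mean elevation. The rasters, the class names and both functions are hypothetical; a real system would also derive shape and spatial relation attributes for each object.

```python
from collections import defaultdict

def pixel_tuples(elevation, classes):
    """Pixel granularity: one tuple per cell, (row, col, elevation) -> class."""
    return [
        ((r, c, elevation[r][c]), classes[r][c])
        for r in range(len(elevation))
        for c in range(len(elevation[0]))
    ]

def object_tuples(elevation, polygon_ids, polygon_class):
    """Object granularity: one tuple per polygon, (pixel count, mean elevation) -> class."""
    cells = defaultdict(list)
    for r, row in enumerate(polygon_ids):
        for c, pid in enumerate(row):
            cells[pid].append(elevation[r][c])
    # Per-cell elevation detail is lost here; only an average is kept.
    return {
        pid: ((len(values), sum(values) / len(values)), polygon_class[pid])
        for pid, values in cells.items()
    }

# Hypothetical 2 x 2 rasters: elevation in metres and a polygon id per cell.
elevation = [[10, 12], [30, 40]]
polygon_ids = [[1, 1], [2, 2]]
polygon_class = {1: "paddy", 2: "forest"}
classes = [[polygon_class[pid] for pid in row] for row in polygon_ids]

by_pixel = pixel_tuples(elevation, classes)
by_object = object_tuples(elevation, polygon_ids, polygon_class)
print(len(by_pixel), "pixel tuples,", len(by_object), "object tuples")
```

The pixel tuples keep every elevation value but carry no shape information, while the object tuples summarize each polygon and discard the detail inside it, which mirrors the trade-off described above.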
After the learning granularity is determined, the learning attributes should be determined. In ordinary relational
databases, the attributes can be fields that are explicitly stored or fields derived by mathematical or logical operations.
In contrast, geometric features and spatial relations are not stored explicitly in a spatial database, but are hidden in the
multi-layer graphic data. Spatial analysis and spatial operations must be performed to extract the attributes about shape
  
International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000. 239 
 
	        