ISPRS Workshop on Service and Application of Spatial Data Infrastructure, XXXVI (4/W6), Oct.14-16, Hangzhou, China
landscape distribution in this area using the knowledge based
classification model.
2.2 Data Source
Since 1999 we have carried out several field surveys
(1999, 2003, and 2004) and collected over 500 sample points.
We also obtained field data from the Third National Giant Panda
Survey, containing about 1500 sample points. In this study we
selected 750 points in the study area as the reference data; all of
the sample points have a positional error of no more than 10 m (Figure 1).
According to the field survey, nine landscape types were
specified: conifer forest (CF), mixed broadleaf and conifer
forest (MBC), broadleaf forest (BF), bamboo (BAM),
shrub/grass/herb (SGH), farmland (FAR), settlements (SET),
water (WA), and rock and bare land (RB). To check the result of
the classification, we used 375 points for classification
and the other 375 for accuracy analysis.
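The text does not say how the 750 reference points were divided, so the following is only a minimal sketch that assumes a simple random split into two halves. The function name `split_reference_points`, the integer point ids, and the fixed seed are all hypothetical and are not taken from the study.

```python
import random


def split_reference_points(point_ids, n_train, seed=0):
    """Randomly split point ids into a training set and a validation set.

    Assumption: a plain random split; the study may have used another scheme
    (for example, a split stratified by landscape type).
    """
    shuffled = list(point_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    return shuffled[:n_train], shuffled[n_train:]


# 750 reference points: 375 for classification, 375 for accuracy analysis.
train_ids, test_ids = split_reference_points(range(750), n_train=375)
```

Keeping the two halves disjoint is what allows the second half to serve as an independent check of the classification.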
An ETM+ satellite image acquired on May 22, 2001,
containing 7 bands (1, 2, 3, 4, 5, 7, and 8), was used as the main
classification data in this study (Figure 1). We also collected
other spatial information, such as the NDVI distribution, DEM,
slope and aspect data, and distance to roads and river
distribution in the reserve (Table 1). All the sample points and
spatial data were integrated into the ArcGIS environment (UTM
projection, WGS84 datum).
Figure 1. The study area of our research. The Foping NR, founded
for giant panda conservation, lies on the southern slope of the
Qinling Mountains. We selected a 10 × 10 km area in the
middle of the reserve as our study sample. The main data
source is an ETM+ image acquired on May 22, 2001 (this
map shows its combination of bands 3, 2, and 1). We selected 750
sample points as the reference data.
2.3 C5.0 Decision Tree
The decision tree (DT) learning model has been used more and more
often in RS classification in recent years (Huang and John, 1997;
Eric et al., 2003; Liu et al., 2005). The advantages that DTs
offer include an ability to handle data measured on different
scales, no assumptions concerning the frequency distributions
of the data in each of the classes, flexibility, and an ability to
handle non-linear relationships between features and classes
(Friedl and Brodley, 1997). DTs can be trained quickly and
are rapid in execution. In addition, knowledge can be acquired
with high accuracy through the DT learning model (Shi,
2002).
Data          | Description                                                    | Scale/Precision
--------------|----------------------------------------------------------------|------------------------------------------------------
Sample points | Gathered from the study since 2003                             | 10 m
RS image      | ETM+ (bands 1, 2, 3, 4, 5, 7, and 8), acquired on May 22, 2001 | 28.5 m (bands 1, 2, 3, 4, 5, and 7); 14.25 m (band 8)
NDVI          | Derived from the RS image: (TM4-TM3)/(TM4+TM3)                 | 28.5 m
DEM           | Digitized from the paper map (1:50000)                         | 25 m
Aspect        | Calculated from DEM                                            | 25 m
Slope         | Calculated from DEM                                            | 25 m
GIS data      | Distance to roads and rivers distribution in the reserve       | 10 m
Table 1. The data sources used in this research. We collected
sample points as reference information. An ETM+ image and
the calculated NDVI were used as the main data set. We also
acquired the DEM, aspect and slope data, and road and river data to
construct the classification data set.
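The NDVI formula listed in Table 1, (TM4-TM3)/(TM4+TM3), can be sketched in a few lines. This is an illustration only: the function names `ndvi` and `ndvi_grid`, the nested-list raster layout, and the choice to return 0.0 where both bands are zero are assumptions, not details from the study.

```python
def ndvi(tm4, tm3):
    """NDVI for one pixel: (TM4 - TM3) / (TM4 + TM3).

    tm4 is the near-infrared band value, tm3 the red band value.
    """
    total = tm4 + tm3
    if total == 0:
        return 0.0  # assumption: pixels with no signal in either band get 0
    return (tm4 - tm3) / total


def ndvi_grid(band4, band3):
    """Apply ndvi() cell by cell to two equally sized 2-D grids (lists of rows)."""
    return [
        [ndvi(nir, red) for nir, red in zip(row4, row3)]
        for row4, row3 in zip(band4, band3)
    ]


# Tiny made-up example: one row with two pixels.
example = ndvi_grid([[60, 30]], [[20, 30]])
```

Because NDVI is a ratio of band values, it keeps the 28.5 m resolution of the ETM+ bands it is derived from, as the table indicates.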
A DT uses a multi-stage, or sequential, approach to the problem of
label assignment. Sets of decision sequences form the branches
of the DT, with tests being applied at the nodes. The leaves (or
branch termini) represent class labels (Figure 2). In this study a
See5 DT model based on the C5.0 algorithm was used to acquire
knowledge from the data sets. The C5.0 algorithm is a kind
of univariate DT improved from the ID3 algorithm; it
selects the branch feature according to the reduction in
information uncertainty, calculated by equation 1 (Quinlan,
1993):
$$H(X/a) = -\sum_{j} p(a = a_j) \sum_{i} p(C_i / a = a_j) \log p(C_i / a = a_j) \qquad (1)$$

where a_j = the j-th value of feature a
C_i = the i-th class label
H(X/a) = information uncertainty of feature a

The feature with the minimal H(X/a) is selected as the
branch feature.
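The selection criterion can be illustrated with a short sketch, reading equation 1 as the conditional entropy of the class label given a feature (a sum over feature values j and classes i, with a leading minus sign). The function names, the toy samples, the base-2 logarithm, and the restriction to discrete feature values are all assumptions made for illustration; this is not the See5 implementation.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(feature_values, labels):
    """H(X/a) = -sum_j p(a=a_j) * sum_i p(C_i/a=a_j) * log2 p(C_i/a=a_j)."""
    n = len(labels)
    groups = defaultdict(list)  # feature value a_j -> class labels seen with it
    for value, label in zip(feature_values, labels):
        groups[value].append(label)

    h = 0.0
    for group_labels in groups.values():
        p_value = len(group_labels) / n  # p(a = a_j)
        for count in Counter(group_labels).values():
            p_class = count / len(group_labels)  # p(C_i / a = a_j)
            h -= p_value * p_class * math.log2(p_class)
    return h


def select_branch_feature(features, labels):
    """Return the name of the feature with the minimal H(X/a)."""
    return min(features, key=lambda name: conditional_entropy(features[name], labels))


# Made-up toy data: four samples, two candidate features.
labels = ["CF", "CF", "BF", "BF"]
features = {
    "elevation_band": ["high", "high", "low", "low"],  # separates the classes
    "aspect": ["N", "S", "N", "S"],                    # carries no information
}
best = select_branch_feature(features, labels)
```

In this toy case `elevation_band` leaves no uncertainty about the class (H = 0), while `aspect` leaves one full bit, so `elevation_band` is chosen for the branch.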
2.4 Rule-based knowledge classification
Rule-type knowledge can be used for classification
more effectively than tree-type knowledge, so the classification
tree was converted into rules after DT learning in See5 was finished.
In this study the Knowledge Engineer in the ERDAS 8.7
environment was used to build the knowledge base. In the Knowledge
Engineer, all the classes are treated as hypotheses, each of which
is concluded from several conditions on variables according to
the rules we obtained from the DT (Figure 3).
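The idea of concluding a hypothesis (class) from several conditions on variables can be sketched as follows. This is not the ERDAS Knowledge Engineer or See5 rule format. The rule list, the variable names `ndvi` and `elevation`, every threshold, and the first-match policy are hypothetical and serve only to show the structure of rule-based classification.

```python
import operator

# Comparison operators allowed in rule conditions.
OPS = {"<=": operator.le, ">": operator.gt}

# Each rule: (hypothesis class, [(variable, operator, threshold), ...]).
# All variables and thresholds below are invented for illustration.
RULES = [
    ("WA", [("ndvi", "<=", 0.0)]),
    ("CF", [("ndvi", ">", 0.0), ("elevation", ">", 2500)]),
    ("BF", [("ndvi", ">", 0.0), ("elevation", "<=", 2500)]),
]


def classify(pixel, rules, default=None):
    """Return the class of the first rule whose conditions all hold.

    Assumption: the first matching rule wins; real rule sets may instead
    weigh rules by confidence.
    """
    for label, conditions in rules:
        if all(OPS[op](pixel[var], threshold) for var, op, threshold in conditions):
            return label
    return default


water = classify({"ndvi": -0.1, "elevation": 1500}, RULES)
conifer = classify({"ndvi": 0.4, "elevation": 2800}, RULES)
broadleaf = classify({"ndvi": 0.4, "elevation": 1800}, RULES)
```

Each path from the root of a decision tree to a leaf corresponds to one such rule: the tests along the path become the conditions, and the leaf label becomes the hypothesis.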