ISPRS, Vol.34, Part 2W2, "Dynamic and Multi-Dimensional GIS", Bangkok, May 23-25, 2001
MINING SEQUENTIAL PATTERN FROM GEOSPATIAL DATA
Yin SHAN
Joint Lab. For Geoinformation Science & Department of Geography
The Chinese University of Hong Kong
shanyin@cuhk.edu.hk
Tao CHENG
Department of Geography, University of Leicester, Leicester LE1 7RH, United Kingdom
tcheng@le.ac.uk
Hui LIN
Joint Lab. For Geoinformation Science, The Chinese University of Hong Kong
huilin@cuhk.edu.hk
KEYWORDS: Spatio-Temporal Data Mining, Sequential Pattern, GIS
ABSTRACT
Spatio-temporal data mining has recently begun to be introduced into the Geographical Information Systems (GIS) community. In this
paper we present a method for mining sequential patterns from geographical data. The WINEPI algorithm is employed to discover
frequent episodes in spatio-temporal data.
1. INTRODUCTION
Our capability to gather and manage data has improved
considerably in recent years, and traditional data analysis
cannot make sense of the resulting volume of data. Data
mining, a term that appeared in the late 1980s, is now an
active research field that facilitates the extraction of novel,
interesting and potentially useful knowledge from these
massive datasets. Data mining, which originated in extracting
knowledge from transaction databases, has now been extended
to process spatial, temporal, and even spatio-temporal data.
Spatio-temporal data mining has recently begun to be
introduced into the Geographical Information Systems (GIS)
community (Koperski 1999; Buttenfield, Gahegan et al. 2000).
Because of its multidimensional and dynamic characteristics,
GIS, as a science and a collection of techniques for handling
geo-referenced data, is not only a suitable application
domain for data mining, but can also pose fresh questions
to the data mining community. More importantly, GIS, as a
powerful tool for managing geographical data, already holds
a large amount of stored data. Since geographical data are
costly to collect, it is all the more necessary and urgent to
make the most of these data. Data mining, especially
spatio-temporal data mining, should play a notable role in
this process.
The tasks of data mining include segmentation,
dependency analysis, deviation and outlier analysis, trend
detection, generalization and so on. Its main techniques
include cluster analysis, Bayesian classification, decision
trees, neural networks, association rules, outlier detection,
attribute-oriented induction, etc. (Miller and Han 2001).
In this paper we present a method for mining sequential
patterns from geographical data. The discovery of frequent
sequences is an important domain of temporal mining, in
which correlations are discovered among events ordered on the
time axis. In many applications, sequence mining is used to
determine which events tend to precede an event of interest.
In the context of GIS, the sequential pattern of spatial changes
of geographical objects depicts the correlation, and possibly
a causal relation, among the changes. These correlations can
not only be used for prediction, but can also motivate
scientists to look for the explanation and mechanism behind
them, which will ultimately enrich our knowledge.
The rest of this paper is organized as follows. In the next
section our method for mining sequential patterns from
geospatial data is presented. The three sections that follow
discuss the three steps of our method separately. Finally,
conclusions and future research issues are given.
2. A METHOD FOR MINING SEQUENTIAL PATTERN
FROM GEOSPATIAL DATA
Generally speaking, spatio-temporal data mining is an active
but immature field. Spatial generalisation, spatial clustering
and spatial associations can be extended to deal with temporal
information (Abraham and Roddick 1998). In order to handle
time as well as space, some new rule types, such as
meta-rules and evolution rules, have also been proposed
(Abraham 1999). In this paper, we propose a different method.
We introduce the WINEPI algorithm (Mannila, Toivonen et al.
1997) to discover frequent episodes in spatio-temporal data.
Abstractly, data with temporal attributes can be viewed as a
sequence of events, where each event has an associated time
of occurrence. One basic problem in analyzing such a
sequence is to find frequent episodes, i.e., collections of
events occurring frequently together (Mannila, Toivonen et al.
1997). We believe this analysis can be applied to
geospatial data to find sequential patterns among
spatial changes.
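
The notion of a frequent episode can be illustrated with a small Python sketch. It is a simplified illustration of window-based frequency counting, not the WINEPI implementation of Mannila, Toivonen et al. (1997). The function names, the integer time stamps, the event labels ("expand", "split") and the window width are all assumptions made for this example. The sketch assumes that windows may extend past both ends of the sequence, so that every event falls into the same number of windows, and it handles only serial (ordered) episodes.

```python
from typing import Iterator, List, Optional, Sequence, Tuple

Event = Tuple[str, int]  # (event type, integer time of occurrence)


def sliding_windows(events: Sequence[Event], win: int) -> Iterator[List[Event]]:
    """Yield the events inside each half-open window [start, start + win)."""
    times = [t for _, t in events]
    t_start, t_end = min(times), max(times) + 1  # sequence spans [t_start, t_end)
    # Assumed convention: windows overlap the sequence by at least one time unit.
    for start in range(t_start - win + 1, t_end):
        yield [(e, t) for e, t in events if start <= t < start + win]


def occurs_serial(episode: Sequence[str], window: Sequence[Event]) -> bool:
    """True if the event types of `episode` occur in this order,
    at strictly increasing times, inside the window."""
    pos: int = 0
    last_t: Optional[int] = None
    for etype, t in sorted(window, key=lambda ev: ev[1]):
        if pos == len(episode):
            break
        # Greedily match the earliest admissible event for each position.
        if etype == episode[pos] and (last_t is None or t > last_t):
            pos, last_t = pos + 1, t
    return pos == len(episode)


def frequency(events: Sequence[Event], episode: Sequence[str], win: int) -> float:
    """Fraction of sliding windows in which the serial episode occurs."""
    wins = list(sliding_windows(events, win))
    hits = sum(occurs_serial(episode, w) for w in wins)
    return hits / len(wins)


# Hypothetical change events recorded for one geographical object.
events = [("expand", 1), ("split", 2), ("expand", 4), ("split", 5)]
print(frequency(events, ("expand", "split"), win=3))
```

With a window width of 3 there are 7 windows over this toy sequence, and the episode "expand followed by split" occurs in 4 of them. An episode would be called frequent if this fraction reaches a user-chosen threshold.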