FUZZY EVIDENCE THEORETIC APPROACHES FOR KNOWLEDGE DISCOVERY
IN SPATIAL UNCERTAINTY DATA SETS
Binbin He, Tao Fang, Dazhi Guo
Institute of Image Processing & Pattern Recognition, Shanghai Jiao Tong University, No.1954 Huashan Road,
Shanghai, China 200030, binb_he@163.com, tfang@sjtu.edu.cn
Department of Environment & Spatial Informatics, China University of Mining and Technology, Xuzhou, JiangSu,
China 221008, guodazhi@pub.xz.jsinfo.net
KEY WORDS: Data mining, Reasoning, Algorithms, Transformation, Representation, Visualization
ABSTRACT:
Although uncertainties exist in spatial knowledge discovery, they have received little attention. In past years, most
research on spatial knowledge discovery has focused on data mining methods and algorithms. In this paper, uncertainty in
spatial data and its propagation are first discussed and analysed. Then, uncertainties at the various stages of spatial knowledge
discovery are briefly analysed, including data selection, data preprocessing, data mining, knowledge representation and uncertain
reasoning. Thirdly, a method of spatial knowledge discovery in conjunction with uncertain reasoning by means of fuzzy evidence
theory is proposed. A framework for uncertainty handling in spatial knowledge discovery is constructed, whose fundamental issues
include soft discretization of spatial data, fuzzy transformation between quantitative data and qualitative concepts, reasoning under
uncertainty and uncertain knowledge representation.
1. INTRODUCTION
Spatial Knowledge Discovery (SKD) extracts hidden,
implicit, valid, novel and interesting spatial or non-spatial
patterns, rules and knowledge from large, incomplete,
noisy, fuzzy, random, real-world spatial databases; it
includes spatial data mining and uncertain reasoning. In recent
years, the combined term "spatial data mining and knowledge
discovery" (SDMKD) has come into common use, in which data
mining is a key step or technique in the course of spatial
knowledge discovery. With the rapid improvement of
technologies for automatic spatial data acquisition, the amount of
data in spatial databases has grown exponentially.
But the lack of analysis functions in geographic
information systems (GISs) creates a sharp contradiction
between the massive volume of data and the acquisition of useful
knowledge; in other words, "The spatial data explode but
knowledge is poor" (Li, 2002). At present, spatial knowledge
discovery research concentrates mainly on the principles and
methods of data mining. Another important issue, uncertainty in
spatial knowledge discovery, has received little attention.
On the one hand, spatial data are themselves uncertain; on
the other hand, many further uncertainties are introduced during the
spatial knowledge discovery process, and are even propagated and
accumulated, which leads to the production of uncertain knowledge.
Traditional spatial data mining and knowledge discovery has
not considered these characteristics, and has regarded the
knowledge discovered as entirely useful and certain.
The role that uncertainty plays in
spatial knowledge discovery is probably more significant than
in many other research fields, because of the nature of
knowledge discovery (which is to find hidden knowledge
patterns in data). It is convenient to study spatial
knowledge discovery by starting from perfect spatial data and
expecting perfect results. However, spatial data are usually far
from perfect, and the spatial knowledge discovery process itself
is full of various kinds of uncertainty. Incorporating uncertainty
into spatial knowledge discovery is important because it places
the study in a more realistic setting.
Research on uncertainty in spatial knowledge discovery has
therefore become a very important issue.
Furthermore, uncertain reasoning, a traditional research area
of artificial intelligence, aims at developing effective
reasoning methods under uncertainty, namely, to derive what
lies behind the data even when the data are incomplete,
inconsistent, or otherwise flawed. Many uncertain reasoning
methods, such as fuzzy set theory, evidence theory, and neural
networks, are powerful computational tools for data analysis and
have good potential for data mining as well. But traditional spatial
data mining and knowledge discovery has not paid attention to
these methods. In this paper, on the basis of an analysis of
uncertainty in spatial data, the uncertainties at the various stages of
spatial knowledge discovery are analysed briefly. In particular,
a method of spatial knowledge discovery in conjunction with
uncertain reasoning by means of fuzzy evidence theory is
proposed.
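As background for the evidence theory mentioned above, the following is a minimal sketch of the classical Dempster rule of combination from Dempster-Shafer evidence theory. It is not the fuzzy evidence method proposed in this paper. The function names, the two-class frame ("forest", "farmland") and the mass values are hypothetical and chosen only for illustration.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to a mass; the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is the conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {h: v / norm for h, v in combined.items()}


def belief(m, hypothesis):
    """Total mass of all focal sets contained in the hypothesis."""
    return sum(v for h, v in m.items() if h <= hypothesis)


def plausibility(m, hypothesis):
    """Total mass of all focal sets that intersect the hypothesis."""
    return sum(v for h, v in m.items() if h & hypothesis)


# Hypothetical example: two sources classify one land parcel.
FOREST = frozenset({"forest"})
FARMLAND = frozenset({"farmland"})
FRAME = FOREST | FARMLAND  # mass on the whole frame expresses ignorance

source_1 = {FOREST: 0.6, FRAME: 0.4}
source_2 = {FOREST: 0.5, FARMLAND: 0.3, FRAME: 0.2}

fused = combine_dempster(source_1, source_2)
bel_forest = belief(fused, FOREST)
pl_forest = plausibility(fused, FOREST)
print(fused, bel_forest, pl_forest)
```

The interval between belief and plausibility gives a simple way to report how uncertain a discovered fact remains after the sources are combined, which is the kind of information that is lost when discovered knowledge is treated as entirely certain.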
2. UNCERTAINTIES OF SPATIAL DATA
2.1 The Types and Origins of Uncertainty in Spatial Data
Uncertainty within spatial data is regarded as a major
component of the evaluation of spatial data quality.
Spatial data quality includes lineage, accuracy, completeness,
logical consistency, semantic accuracy and currency (FGDC,
1998). All types of spatial data are subject to uncertainty,
since it is impossible to create a perfect representation of the
infinitely complex real world (Goodchild, 2003). Error refers
to the discrepancy between an observed result and the true value,
and has statistical characteristics. Uncertainty is a broader
extension of the concept of error; it measures the degree of
discrepancy in our knowledge of the surveyed objects. Uncertainties in
spatial data can be classified into error, vagueness, ambiguity and
discord (Fisher, 2003).