A MULTISCALE APPROACH TO DETECT SPATIAL-TEMPORAL OUTLIERS
Tao Cheng
Zhilin Li
Department of Land Surveying and Geo-Informatics
The Hong Kong Polytechnic University
Hung Hom, Kowloon,
Hong Kong
Email: {lste; Iszlli}@polyu.edu.hk
WG: THS9 Uncertainty, Consistency and Accuracy of Data and Imagery
Keywords: Detection, Dynamic, Multitemporal, Multiresolution, Thematic, Quality
ABSTRACT
A spatial outlier is a spatially referenced object whose non-spatial attribute values are significantly different from those of other
spatially referenced objects in its spatial neighborhood. It represents a location that is significantly different from its
neighborhood even though it may not be significantly different from the entire population. Here we adapt this definition to the
spatio-temporal domain and define a spatial-temporal outlier (STO) as a spatially and temporally referenced object whose thematic
attribute values are significantly different from those of other spatially and temporally referenced objects in its spatial and/or
temporal neighborhood. Identification of STOs can lead to the discovery of unexpected, interesting, and implicit knowledge, such
as local instability. Many methods have recently been proposed to detect spatial outliers, but the detection of temporal outliers
and spatial-temporal outliers has seldom been discussed. In this paper we propose a multiscale approach that detects STOs by
evaluating the change between consecutive spatial and temporal scales.
1. INTRODUCTION
Outliers are data that appear inconsistent with respect to the
remainder of the database (Barnett and Lewis, 1994). While in
many cases these can be anomalies or noise, sometimes these
represent rare or unusual events to be investigated further. In
general, there are three direct approaches for outlier detection:
distribution-based, depth-based and distance-based.
Distribution-based approaches use standard statistical
distributions; depth-based techniques map data objects into an
m-dimensional information space (where m is the number of
attributes); and distance-based approaches calculate the
proportion of database objects that lie at a specified distance
from a target object (Ng, 2001).
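The distance-based idea can be illustrated with a minimal sketch. It assumes one common formulation, in which an object is flagged when at least a given fraction of the other objects lie farther from it than a chosen distance. The function name, the parameters `radius` and `min_fraction`, and the sample points are hypothetical and are not taken from the cited work.

```python
import math


def distance_based_outliers(points, radius, min_fraction):
    """Return indices of points for which at least `min_fraction` of the
    other points lie farther away than `radius` (assumed formulation)."""
    outliers = []
    n = len(points)
    for i, p in enumerate(points):
        # Count the other objects that lie beyond the chosen distance.
        far = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) > radius
        )
        if far / (n - 1) >= min_fraction:
            outliers.append(i)
    return outliers


# Hypothetical data: four clustered points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.0)]
result = distance_based_outliers(points, radius=1.0, min_fraction=0.9)
print(result)
```

The brute-force pairwise comparison is quadratic in the number of objects; it is used here only to keep the sketch short.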
A spatial outlier is a spatially referenced object whose non-
spatial attribute values are significantly different from those
of other spatially referenced objects in its spatial
neighborhood. It represents a location that is significantly
different from its neighborhood even though it may not
be significantly different from the entire population (Shekhar
et al., 2003). Identification of spatial outliers can lead to the
discovery of unexpected, interesting, and implicit knowledge,
such as local instability.
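The distinction between a local and a global difference can be made concrete with a small sketch. It assumes a simple test, chosen here for illustration only: each object's attribute value is compared with the mean of its neighbors, and the resulting differences are standardized. The names `values`, `neighbors` and `threshold`, and the sample data, are hypothetical.

```python
from statistics import mean, pstdev


def neighborhood_differences(values, neighbors):
    """Attribute value minus the mean attribute value of the neighbors."""
    return {
        k: values[k] - mean(values[n] for n in neighbors[k])
        for k in values
    }


def spatial_outliers(values, neighbors, threshold=2.0):
    """Flag objects whose standardized neighborhood difference is large."""
    diffs = neighborhood_differences(values, neighbors)
    mu = mean(diffs.values())
    sigma = pstdev(diffs.values())
    if sigma == 0:
        return []
    return sorted(
        k for k, d in diffs.items() if abs((d - mu) / sigma) > threshold
    )


# Hypothetical sites along a line with a steady trend; site 2 breaks the
# local pattern although its value is unremarkable for the whole data set.
values = {0: 10.0, 1: 20.0, 2: 70.0, 3: 40.0, 4: 50.0,
          5: 60.0, 6: 70.0, 7: 80.0, 8: 90.0, 9: 100.0}
neighbors = {i: [j for j in (i - 1, i + 1) if j in values] for i in values}

local_result = spatial_outliers(values, neighbors)
global_z = (values[2] - mean(values.values())) / pstdev(values.values())
print(local_result, round(global_z, 2))
```

In this example site 2 is flagged by the neighborhood test, while its standardized score against the whole population stays well below the threshold.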
Many methods have recently been proposed to detect spatial
outliers by the distribution-based approach. These methods
can be broadly classified into two categories, namely 1-D
(linear) outlier detection methods and multi-dimensional
outlier detection methods (Shekhar et al., 2003). The 1-D
outlier detection algorithms consider the statistical
distribution of non-spatial attribute values, ignoring the spatial
relationships between items. The main idea is to fit the data
set to a known standard distribution, and develop a test based
on distribution properties (Barnett and Lewis, 1994; Johnson,
1992). Multi-dimensional outlier methods can be further
grouped into two categories, namely homogeneous multi-
dimensional metric based methods and spatial methods. The
homogeneous multi-dimensional metric based methods do not
distinguish between attribute dimensions and geo-spatial
dimensions, and use all dimensions for defining neighborhood
as well as for comparison. In the spatial methods, spatial
attributes are used to characterize location, neighborhood, and
distance, and non-spatial attribute dimensions are used to
compare a spatially referenced object to its neighbors. Among
others, Shekhar et al. (2003) developed a unified modeling
framework and identified efficient computational structures and
strategies for detecting spatial outliers based on a single non-
spatial attribute from a data set.
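The 1-D approach described in this paragraph can be sketched as follows. The sketch assumes that the attribute values are approximately normally distributed and uses a z-score test; the threshold and the sample values are hypothetical. Location plays no part in the test, which is exactly the limitation of 1-D methods noted above.

```python
from statistics import mean, stdev


def one_d_outliers(attribute_values, threshold=2.0):
    """Flag indices whose z-score under a fitted normal distribution
    exceeds `threshold`; spatial relationships are ignored."""
    mu = mean(attribute_values)
    sigma = stdev(attribute_values)
    if sigma == 0:
        return []
    return [
        i
        for i, v in enumerate(attribute_values)
        if abs((v - mu) / sigma) > threshold
    ]


# Hypothetical attribute values; only the last one departs from the rest.
attribute_values = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8, 10.1, 25.0]
flagged = one_d_outliers(attribute_values)
print(flagged)
```

Because the mean and standard deviation are estimated from the same data, a single extreme value inflates both, so the threshold has to be chosen with the sample size in mind.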
Depth-based techniques are also applied extensively, in the
form of clustering, to spatial outlier detection: the
neighborhood of an object is identified from spatial
relationships, and proximity is the main basis for
deciding whether an object is an outlier with respect to
neighboring objects or to a cluster. The limitation of these
approaches is that they ignore the influence of underlying
spatial objects that may differ between spatial locations
despite their close proximity, i.e., the semantic relationship is
not considered in the clustering. An exception is Adam et
al. (2004), who identified spatial outliers by taking into account
both the spatial and the semantic relationships among the objects.
Ng (2001) used distance-based measures to detect unusual
paths in two-dimensional space traced by individuals through
a monitored environment. These measures allow the
identification of unusual trajectories based on entry/exit
points, speed and geometry; these trajectories may correspond
to unwanted behaviors such as theft. Other methods used in
data mining, such as classification and aggregation, are also
applied in spatial outlier detection (Miller, 2003).