International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B4. Istanbul 2004
representations appear like a DEM, but instead of terrain
height the corresponding data quality attribute is displayed.
The remainder of this paper is structured as follows: section
two reviews previous work on spatial data quality, the
theory behind visualization methods and their combination,
the use of visualization to convey data quality, and
existing projects on data quality visualization.
Section three discusses the selection of the quality
attributes used in our approach. In section four we
discuss effective visualization methods and introduce the
visualizations that we have developed. Section six provides
conclusions and future work.
2. RELATED WORK
In the last two decades data quality has become an important
research topic. Scientists argued that users of spatial data
should have access to data quality information
(McGranaghan, 1993; Buttenfield and Beard, 1994; Beard,
1997). Soon it became obvious that the nature of spatial data
lends itself perfectly to the communication of quality
parameters by visualization in the form of images and
graphics. As a result, the call for visualization of data
quality surfaced (Beard and Mackaness, 1993; van der Wel et
al., 1994).
Since the early nineties researchers have taken formal approaches
to the visualization of spatial data quality (Clapham, 1992).
The National Center for Geographic Information and
Analysis devoted considerable effort to exploring this area and
spearheaded a research initiative on "Visualization of the
Quality of Spatial Information" (Beard et al., 1991). Results
from this initiative are presented in Buttenfield and Beard
(1991).
In the remaining part of our literature review we present
various terms used to describe data quality aspects, and we
discuss related visualization approaches and past project
efforts.
2.1 Discussion of Terminology
In the literature a substantial number of expressions are used
to describe data quality, namely quality, error, reliability,
uncertainty, validity, accuracy, vagueness, precision and
fitness for use.
The term quality is used as an umbrella term that covers all
aspects of the issue. It is used by practically everybody in
the field (Beard, 1997; Veregin, 1999). The term
error is also widely used, and there is broad agreement on what
the word describes when used for image data, namely the
difference between the true value and the value stored in the
database (Hunter and Goodchild, 1995; Buttenfield, 1993).
Reliability can be defined as the level of confidence a data
provider has that the data are correct (Evans, 1997).
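As a minimal sketch of the error definition given above (the difference between the true value and the value stored in the database), the following computes a per-pixel error grid. The function name, the sign convention (stored minus true), and the sample values are assumptions made for illustration only; in practice the true value is rarely known and must be estimated from a reference source.

```python
def pixel_errors(true_grid, stored_grid):
    """Return stored minus true for every pixel of two equally sized grids."""
    return [
        [stored - true for true, stored in zip(true_row, stored_row)]
        for true_row, stored_row in zip(true_grid, stored_grid)
    ]


# Hypothetical 2x2 example values, not taken from any real data set.
true_grid = [[10, 20], [30, 40]]
stored_grid = [[12, 20], [29, 40]]
errors = pixel_errors(true_grid, stored_grid)
```

A grid of this kind is one possible input for the quality displays discussed later in the paper.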
The term uncertainty is used in various ways, one being that
the resolution of the data does not allow a user to make an
assured decision about the content of the data. For example,
pixels in remotely sensed images might contain uncertain
information because of sub-pixel mixing or sensor sampling
bias (Bastin et al., 2002). Worboys and Duckham (2004) use
the term uncertainty to describe the doubt that users have
about the right use of data. In this sense it is a measure that
describes the state of mind of the user.
Other terms that are used to describe different outlooks on
data quality are validity (Goodchild et al., 1994), and
accuracy (Veregin, 1999). Vagueness describes the
impossibility of determining the exact location or boundary of
an object in space (Duckham et al., 2001). For example, ‘the
East of Maine’ is a vague area in that its boundaries are not
exactly determinable. Precision denotes the exactness with
which the measurement is made that led to the entry in the
database (MacEachren, 1992). An overall phrase that is used
frequently is fitness for use. It indicates whether the data have
the specifications that users need to solve their task
(Paradis and Beard, 1994).
2.2 Visualizations
Beard and Buttenfield (1999) listed the following
challenges in the visualization of data: graphic design,
metadata, error analysis, and user satisfaction. In this
research we concentrate on the graphic design issues. For the
combined display of data and data quality three possible
forms are mentioned in the literature (MacEachren, 1994;
Beard and Buttenfield, 1999). First, there are side-by-side
images, where one picture shows the data and the other one
the quality of the data. The second approach is to generate
composite images that display data quality superimposed
on the visualization of the data. Thirdly, sequenced images
of data and data quality can be presented, either allowing
the user to toggle between the displays or providing an
animation (Evans, 1997).
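The three display forms described above can be illustrated with a small sketch on nested lists standing in for raster images. All function names and the pairing used for the composite are hypothetical choices for illustration; they are not the implementations used in the cited works.

```python
from itertools import cycle


def side_by_side(data, quality):
    """Place the quality image to the right of the data image, row by row."""
    return [d_row + q_row for d_row, q_row in zip(data, quality)]


def composite(data, quality):
    """Combine both images into one by pairing each data value with its quality."""
    return [
        [(d, q) for d, q in zip(d_row, q_row)]
        for d_row, q_row in zip(data, quality)
    ]


def sequenced(data, quality):
    """Yield the data image and the quality image alternately, as in a toggle."""
    return cycle([data, quality])


# Hypothetical 1x2 images used only to show the three forms.
data = [[1, 2]]
quality = [[9, 8]]
frames = sequenced(data, quality)
```

In a real viewer the paired values of the composite would be mapped to a combined visual variable, and the sequenced frames would be driven by a user toggle or an animation timer.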
The following two visualization approaches have also been
discussed: variation in color hue and saturation to convey
the quality of data (Schweizer and Goodchild, 1992; Howard
and MacEachren, 1996), and showing quality attributes as
the z-axis in a 3D elevation model, which was mentioned as a
worthwhile endeavour by van der Wel et al. (1994) without
any follow-up projects implementing the idea. We take up
the concept of the latter approach and incorporate it in our
3D visualizations.
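Both approaches mentioned above can be sketched briefly. The first function maps a quality score to color saturation, and the second turns a quality grid into points whose z value is the quality attribute. The normalization of quality to the range 0 to 1, the choice that low quality means low saturation, and all names and parameters are assumptions for illustration; they do not describe the visualizations developed in this paper.

```python
import colorsys


def quality_to_rgb(hue, quality):
    """Map a quality score in [0, 1] to saturation at a fixed hue and full brightness."""
    saturation = min(max(quality, 0.0), 1.0)  # clamp out-of-range scores
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)


def quality_surface(quality_grid, z_scale=1.0):
    """Return (x, y, z) points where z is the scaled quality value of each cell."""
    return [
        (col, row, q * z_scale)
        for row, values in enumerate(quality_grid)
        for col, q in enumerate(values)
    ]
```

For example, `quality_surface([[0.5, 1.0]], z_scale=2.0)` produces one point per cell, with heights proportional to quality, which a 3D renderer could display in the same way as an elevation model.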
2.3 Previous Projects
The following works concentrate on the communication of
quality of geospatial data. The R-VIS project introduces a
model which shows the reliability of water quality data
(Howard and MacEachren, 1996). A visualization of
uncertainty in meteorological forecast models was also
developed showing the discrepancy of multiple weather
forecasts over time (Fauerbach et al., 1996). Various graphs,
bivariate images and animations are used in the FLIERS
project to visualize uncertainty in multi-spectral remotely
sensed imagery (Bastin et al., 2002). Davis and Keller (1997)
offer quality information for risk management decisions.
Spatial data uncertainty was also communicated using
animation (Ehlschlaeger et al., 1997).
3. DATA QUALITY ATTRIBUTES
Metadata contain a wealth of information about the data at
hand. From the attributes that are typically described by
metadata we selected the ones that convey data
quality, and more specifically, those which pertain to
geospatial image quality. Our goal has been to display the
optimum number of essential data attributes, avoiding
redundancies which could confuse the user. We based our
selection on the US Spatial Data Transfer Standard's (SDTS)
section on data quality (NIST, 1992), which is quite