Such a program has to cover a cycle of quality control
procedures that guarantees that the specified quality is
maintained. Each phase of quality control assessment addresses
one aspect of the quality requirements for the underlying
information model. Thus, quality management should be the
initial phase in data processing, data analysis, maintenance or
homogenization of different data sets, so as to ensure a
well-defined result in any of these processes (Busch and
Willrich, 2002).
Many studies have been carried out on the quality of geodata.
In one of them, Thakkar et al. (2007) described a framework to
support quality-driven, large-scale geospatial data integration
(QGM). The key contributions of their framework are: (1) the
ability to automatically estimate the quality of data provided
by a source, using information from a source of known quality;
(2) a declarative representation of both the content and the
quality of the geospatial data provided by sources; and (3) a
quality-driven query answering technique for geospatial data.
Their experimental evaluation over more than 1200 real-world
sources shows that QGM not only provides better quality data
than traditional data integration systems, but also has a lower
response time.
Spatial data quality is well known to both academia and
industry, but usually in different contexts. Research on spatial
data quality has identified several issues of practical use,
such as descriptive information as metadata, fulfilment of
spatial relationships among data, integrity measures and
geometric constraints. Industry and data producers realize them
in three stages: pre-, co- and post-data capturing. The pre-data
capturing stage covers semantic modelling, data definition,
cataloguing, modelling, data dictionary and schema creation
processes. The co-data capturing stage covers general rules of
spatial relationships as well as data- and model-specific rules,
such as topology and model-building relationships, geometric
thresholds, data extraction guidelines, and the object-object,
object-belonging-class, object-non-belonging-class and
class-class relationships to be taken into account during data
capturing. The post-data capturing stage covers specified QC
benchmarks and checks of compliance with the general and
specific rules.
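To make the co-data capturing rules more concrete, the following minimal Python sketch checks one such data-specific rule, a geometric threshold on segment length, against a captured polyline. The threshold value and function names are illustrative assumptions, not values taken from any particular product specification.

import math

# Hypothetical geometric threshold; real projects take such values
# from the product specification, not from this sketch.
MIN_SEGMENT_LEN = 0.5  # map units, assumed

def check_polyline(vertices, min_len=MIN_SEGMENT_LEN):
    """Report segments of a captured polyline that violate the threshold."""
    problems = []
    for i, (p, q) in enumerate(zip(vertices, vertices[1:])):
        d = math.hypot(q[0] - p[0], q[1] - p[1])
        if d == 0.0:
            problems.append((i, "duplicate consecutive vertex"))
        elif d < min_len:
            problems.append((i, "segment shorter than threshold (%.3f)" % d))
    return problems

# Example: the second segment is degenerate, the third is too short.
line = [(0.0, 0.0), (10.0, 0.0), (10.0, 0.0), (10.2, 0.1)]
for idx, msg in check_polyline(line):
    print("segment %d: %s" % (idx, msg))

In a production workflow such checks would run during extraction, so that violations can be corrected while the operator still has the source material at hand.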
Vector data represents one major category of data managed by
GIS. Based on geo-spatial data standards and integrity rules,
GIS vendors and data producers build QC and QA guidelines and
apply them in the production workflow. The vector data quality
criteria differ between the views of producers and those of
users, but these criteria are generally driven by the needs,
expectations and feedback of the users.
This paper presents a practical method that closes the gap
between theory and practice. Turning spatial data quality
concepts into development and application requires the existence
of a conceptual, a logical and, most importantly, a physical
data model, together with rules and knowledge of their
realization in the form of geo-spatial data. The applicable
metrics and thresholds are determined on this concrete basis.
This study discusses the application of geo-spatial data quality
issues and QC procedures in topographic data production. First,
we introduce the MGCP data profile of the NATO DFDD, the
requirements of the data owner, the view of the data producers
on both data capturing and QC, and finally the quality assurance
performed to fulfil user needs. Then our new, practical
approach, which divides quality control into three phases, is
introduced. Finally, the
implementation of our approach to realize the metrics, measures
and thresholds of the quality definitions is discussed. In this
paper, we focus in particular on geometric and semantic quality
and on the quality control procedures that can be performed by
the producers.
2. QUALITY ASSURANCE OF VECTOR DATA
2.1 What is Vector Data?
Vector data provide a way to characterize real-world features
within the GIS environment. A feature is anything you can see in
the landscape: imagine standing at a high vantage point in the
field. When you look down, you can see many features such as
forests, houses, roads, trees and rivers. Each of these becomes
a feature when we characterize it in a GIS program. Vector
features have attributes, which consist of textual or numerical
information describing the features.
A vector feature has its shape represented by a geometry. The
geometry is made up of one or more interconnected vertices. A
vertex describes a position in space using an x, a y and
optionally a z coordinate. Geometries whose vertices carry a z
coordinate are often referred to as 2.5D, since they describe
height or depth at each vertex, but not both. When a feature's
geometry consists of only a single vertex, it is referred to as
a point feature. Where the geometry consists of two or more
vertices and the first and last vertices are not equal, a
polyline feature is formed. Where four or more vertices are
present and the last vertex is equal to the first, an enclosed
polygon feature is formed (www1, 2011).
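The point/polyline/polygon distinction above can be captured in a few lines of code. The following Python sketch classifies a geometry purely from its vertex list; vertices are (x, y) or (x, y, z) tuples, and the function name is illustrative rather than taken from any GIS library.

def classify_geometry(vertices):
    """Classify a vertex list as 'point', 'polyline' or 'polygon'."""
    if len(vertices) == 1:
        return "point"
    # Compare only x and y; an optional z does not affect closure here.
    first, last = vertices[0][:2], vertices[-1][:2]
    if len(vertices) >= 4 and first == last:
        return "polygon"   # closed ring: the last vertex repeats the first
    return "polyline"      # open chain of two or more vertices

print(classify_geometry([(5, 3)]))                          # point
print(classify_geometry([(0, 0), (1, 1), (2, 0)]))          # polyline
print(classify_geometry([(0, 0), (1, 0), (1, 1), (0, 0)]))  # polygon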
2.2 Quality of Vector Data
The quality elements described in ISO (International
Organization for Standardization) 19113 are completeness,
logical consistency, positional accuracy, temporal accuracy and
thematic accuracy. They can also be expressed as measures of
quality such as closeness to the actual value; spatial accuracy
of the position, shape and size/area of features; "currentness"
of the data; and completeness of attribute values. Another
aspect of quality is the absence of contradictions in the data
and the conformance of the data to rules.
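Conformance to rules can be illustrated with a small attribute-domain check in Python. The feature and the domain definitions below are hypothetical; in practice such domains come from the product's data dictionary.

# Hypothetical attribute domains for a road feature class.
DOMAIN = {
    "surface_type": {"paved", "unpaved", "unknown"},
    "lane_count": range(1, 17),  # assumed valid range
}

def check_attributes(feature):
    """Return rule violations for one feature's attribute dictionary."""
    violations = []
    for attr, allowed in DOMAIN.items():
        value = feature.get(attr)
        if value is None:
            violations.append("missing attribute '%s'" % attr)
        elif value not in allowed:
            violations.append("'%s' value %r outside domain" % (attr, value))
    return violations

road = {"surface_type": "gravel", "lane_count": 2}
print(check_attributes(road))  # ["'surface_type' value 'gravel' outside domain"]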
There are also other classifications of vector data quality. In
one of these, by Subbiah et al. (2007), the quality parameters
are defined as accuracy, resolution, completeness and types.
Accuracy of geospatial data is defined in terms of a tuple
(attribute, value), where the attribute refers to a geographic
concept/object and the value is its measurement. It is assumed
that geospatial service providers provide data that conform to
such tuples and that there is an objective assessment of all
concept values. Resolution refers to the amount of detail that
can be discerned in space, time or theme. Vector data can be
represented at either fine or coarse density; map scale can be
thought of as resolution. The coarser the data, the less
information is available about the vector points of an object's
shape. Resolution is also related to accuracy, because the level
of resolution affects the database specification against which
accuracy is assessed. Completeness refers to the absence of
omissions in a provider database. Completeness is distinct from
accuracy in that the errors resulting in a lack of completeness
are not incorrect encodings of object values.
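As a worked example of the completeness parameter, the following Python sketch counts omissions (reference features missing from the provider database) and commissions (provider features absent from the reference). Matching by feature identifier is an assumption made for illustration; real assessments usually match features geometrically.

def completeness(provider_ids, reference_ids):
    """Completeness rate plus the omitted and committed feature sets."""
    provider, reference = set(provider_ids), set(reference_ids)
    omissions = reference - provider
    commissions = provider - reference
    rate = 1.0 - len(omissions) / len(reference) if reference else 1.0
    return rate, omissions, commissions

rate, om, com = completeness({"b1", "b2", "b4"}, {"b1", "b2", "b3", "b4"})
print("completeness: %.0f%%, omitted: %s, committed: %s" % (100 * rate, om, com))
# completeness: 75%, omitted: {'b3'}, committed: set()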