1) positional quality;
2) attribute quality;
3) lineage;
4) completeness; and,
5) logical consistency.
The first two refer to data describing INDIVIDUAL real world entities,
with 4) and 5) referring to SETS of real world entities. The concept of
"lineage" includes but also goes beyond "temporal data" (or "when data")
to include information on how data were generated and what processing
they have undergone (or "how data"), and also "by whom data". A "lineage"
could apply to an individual real world entity as represented in the
database, but more usefully to a set of such entities.

Concerning "completeness", if a GIS purports, e.g., to record all
storm-drain inspection covers in a city, completeness indicates the
percentage of these covers actually recorded. Concerning "logical
consistency", specific data processing tasks (e.g. route selection, or a
land-parcel ownership query) assume certain characteristics of the data
(e.g. that all connected road segments meet at common nodes, that all
parcels have a unique identifier), and it should be known to the GIS user
to what extent these characteristics are met.

2.1 Storage of position and attribute quality parameters

We have proposed methods for storing positional and attribute quality by
data item (see [RAMLAL, 1991] [DRUMMOND, 1991]) and Lineage,
Completeness, and Logical Consistency Reports by data set. We have also
proposed a Processing Model Quality Report [RAMLAL, 1991], but as will be
indicated in section 2.2 this may not always be useful.

As already mentioned, an entity's attributes may be recorded by
continuous variables or by discontinuous variables. A quality parameter
associated with a continuous variable is, of course, Standard Deviation,
while quality parameters associated with discontinuous variables are
certainty statistics such as Probability or Certainty Factor. Influenced
by the ideas of [GUPTILL, 1989], we propose that all attributes should
have a quality parameter stored with them, as shown in TABLE 1. Such an
approach is extremely flexible, as such database tables are already
accessed in the information generation procedures used in many GISs.

POLYNR  PASD  PNSD  VALUE1  DIS1  QUAL1  UNIT1  VALUE2  DIS2  QUAL2  UNIT2  DESCRIPT
1254    25    16    Has     T     0.76          1.55    F     0.10   M      memo
1255    25    16    Hn33    T     0.65          1.90    F     0.10   M      memo
1256    25    16    Hn31    T     0.82          1.70    F     0.10   M      memo
1257    25    16    E235    T     0.87          2.00    F     0.10   M      memo
1258    25    16    7835    T     0.76          1.390   F     0.10   M      memo
1259    25    16    Hn35    T     0.58          1.55    F     0.10   M      memo
1260    25    16    2335    T     0.76          1.60    F     0.10   M      memo

The 'MEMO' (or Data Dictionary) associated with the SOILPOLS relation is:

POLYNR  unique identifier for the soil polygon
PASD    standard deviation of arc coordinates, x and y
PNSD    standard deviation of node coordinates, x and y
VALUE1  Dutch Soil Classification System soil class
DIS1    T to indicate a discontinuous variable (discontinuous is true)
QUAL1   probability associated with the soil class being correct
UNIT1   no units for soil class
VALUE2  soil rooting depth
DIS2    F to indicate a continuous variable (discontinuous is false)
QUAL2   Standard Deviation of soil rooting depth measurement
UNIT2   units for soil rooting depth (meters)

Table 1 - The relation (or database table) SOILPOLS

Continuous variables record position and the quality parameter for such
variables is Standard Deviation. Although each position can have x and y
(and z) standard deviations it is likely that certain types of points
(e.g. node points for roads) within a large data set could have common x
and y standard deviations associated with them, and others (e.g. arc
points for rivers) different x and y standard deviations. This proposal
is easily implemented in an 'off-the-shelf' GIS, as such standard
deviations can also be stored in a relation such as TABLE 1 above. There
is no need to 'break into' the sometimes proprietary structure of
coordinate files - at least in the preliminary stages of developing a
quality subsystem. Temporal data could also be stored in TABLE 1 -
although we have not done so, believing that this can be handled from the
Lineage Report [RAMLAL, 1991].
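As an illustration of storing a quality parameter with every attribute, the sketch below models one row of the SOILPOLS relation in Python. It is a minimal sketch under stated assumptions, not the implementation used in the work described here: the class names, field names and the POINT_CLASS_SD lookup are hypothetical, the polygon values are illustrative and modelled on a Table 1 row, and the class-level positional standard deviations are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class QualifiedAttribute:
    """One VALUE/DIS/QUAL/UNIT group, as in Table 1 (names are hypothetical)."""
    value: Union[str, float]
    discontinuous: bool    # DIS: True for a class, False for a measurement
    quality: float         # QUAL: probability if discontinuous, else standard deviation
    unit: Optional[str]    # UNIT: None when the attribute has no units

    def __post_init__(self) -> None:
        # A probability and a standard deviation have different valid ranges.
        if self.discontinuous and not 0.0 <= self.quality <= 1.0:
            raise ValueError("a probability must lie between 0 and 1")
        if not self.discontinuous and self.quality < 0.0:
            raise ValueError("a standard deviation cannot be negative")

    def describe(self) -> str:
        if self.discontinuous:
            return f"{self.value} (probability {self.quality:.2f})"
        unit = f" {self.unit}" if self.unit else ""
        return f"{self.value}{unit} (standard deviation {self.quality:.2f}{unit})"


@dataclass(frozen=True)
class SoilPolygon:
    """One row of the relation: positional quality plus qualified attributes."""
    polynr: int
    pasd: float            # standard deviation of arc coordinates, x and y
    pnsd: float            # standard deviation of node coordinates, x and y
    soil_class: QualifiedAttribute
    rooting_depth: QualifiedAttribute


# Hypothetical class-level positional standard deviations (x, y), shared by
# all points of one kind instead of being stored with every coordinate.
POINT_CLASS_SD: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("road", "node"): (2.0, 2.0),
    ("river", "arc"): (5.0, 5.0),
}

# Illustrative values only.
polygon = SoilPolygon(
    polynr=1255,
    pasd=25.0,
    pnsd=16.0,
    soil_class=QualifiedAttribute("Hn33", True, 0.65, None),
    rooting_depth=QualifiedAttribute(1.90, False, 0.10, "m"),
)

print(polygon.soil_class.describe())     # Hn33 (probability 0.65)
print(polygon.rooting_depth.describe())  # 1.9 m (standard deviation 0.10 m)
print(POINT_CLASS_SD[("road", "node")])  # (2.0, 2.0)
```

Keeping the discontinuous flag next to each quality value lets one routine interpret the value correctly as either a probability or a standard deviation, which is what makes a single relation usable for both kinds of variable.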
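The data-set level notions of completeness and logical consistency can likewise be reduced to simple checks. The following is a hedged sketch only: the function names are hypothetical, the counts are invented, and it assumes that the number of entities actually existing in the field is known from an independent source such as a sample survey.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def completeness_percent(recorded: int, existing: int) -> float:
    """Percentage of real world entities of one type that the data set records."""
    if existing <= 0:
        raise ValueError("existing must be positive")
    if not 0 <= recorded <= existing:
        raise ValueError("recorded must lie between 0 and existing")
    return 100.0 * recorded / existing


def duplicated_identifiers(identifiers: Iterable[Hashable]) -> List[Hashable]:
    """Identifiers used more than once, in first-seen order.

    An empty result means the 'unique identifier' assumption holds.
    """
    counts = Counter(identifiers)
    return [ident for ident, n in counts.items() if n > 1]


# Invented figures: 940 of 1000 inspection covers recorded.
print(completeness_percent(recorded=940, existing=1000))  # 94.0
print(duplicated_identifiers([1254, 1255, 1255, 1256]))   # [1255]
```

A check of this kind reports to what extent an assumed characteristic is met; it does not repair the data.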