2.2 Model Quality Parameters
Processing models used in a GIS may be
deliberately simplified or even wrong. In
established GIS, models are usually
entered at the time of use rather than stored.
The interface of the GIS could prompt the user for
information on the model, requesting information
on variables to be processed and their functional
relationships. But such an interface could also
prompt the user for information on the model
quality, if a Processing Model Quality Report does
not already exist. Models can be checked to
determine their quality, and commonly this
involves fieldwork (see [DRUMMOND,1991]).
2.3 Manipulation of Position and Attribute Quality Parameters
Error Propagation techniques have long been used
by surveyors and photogrammetrists in their
techniques of pre-analysis, to estimate the most
probable error of a data capturing task. These
techniques involve manipulating the standard
deviations of independent variables to obtain an
estimate of the dependent variable, and so are
more correctly called Variance Propagation. In any
general approach to handling data and information
quality in GIS it is proposed that the term Error
Propagation be used when estimates of the quality
of the result of a procedure are being obtained,
and the term Variance Propagation be reserved only
for the situation when the error propagation has
followed the methods outlined below (and well
established in surveying and photogrammetry, but
not so much in GIS!). Thus variance propagation
[MIKHAIL, 1976] may be used to estimate the
quality of GIS generated information, when a
mathematical model is being used. A mathematical
model processes continuous variables and constants
to provide new information, and the mean and
standard deviation of such continuous variables
can be stored in a relation such as TABLE 1,
columns VALUE2 and QUAL2.
As a brief reminder, consider variance propagation
for the mathematical model:
\[ a = f(b, c) \tag{1} \]
where values of 'b' and 'c' are stored in the
database tables of a GIS and the new information
'a' can be computed. If the values of the
Standard Deviations (SDs) of 'b' and 'c' are also
stored in the database, the SD of 'a' can be
estimated:
\[ (SD_a)^2 = (SD_b)^2 \left(\frac{\partial a}{\partial b}\right)^2 + (SD_c)^2 \left(\frac{\partial a}{\partial c}\right)^2 + 2\,SD_{bc}\,\frac{\partial a}{\partial b}\,\frac{\partial a}{\partial c} \tag{2} \]
the last term, which involves the covariance
SDbc of b and c, being omitted if there is no
correlation between b and c.
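As an illustration only, suppose the model is the ratio a = b/c. This model is an assumption chosen for the example and is not taken from the text. Applying equation (2) term by term would then give:

\[
\frac{\partial a}{\partial b} = \frac{1}{c}, \qquad
\frac{\partial a}{\partial c} = -\frac{b}{c^{2}}, \qquad
(SD_a)^2 = \frac{(SD_b)^2}{c^{2}} + \frac{b^{2}\,(SD_c)^2}{c^{4}} - \frac{2\,b\,SD_{bc}}{c^{3}}
\]

The sign of the last term follows from the negative partial derivative with respect to c, and the term vanishes when b and c are uncorrelated.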
As indicated, equations (1) and (2) above
use information available from a database table
such as TABLE 1. However, both the user-required
information ('a') and the partial derivatives
(e.g. ∂a/∂b) used in equation (2) need the
model (i.e. equation (1)) to be provided.
To perform a variance propagation, partial
derivatives must be supplied either by the user or
by the system, and for the general GIS user help
must be given. Useful commercial subroutines exist
which can determine these derivatives, and can be
incorporated into a GIS.
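A minimal sketch of how a system might supply the partial derivatives itself, assuming they are estimated numerically by central differences rather than by a commercial subroutine. The function names (`partial`, `propagate_sd`), the product model and the numeric values are all hypothetical and are not taken from TABLE 1.

```python
import math


def partial(f, args, i, rel_step=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to args[i], evaluated at args."""
    h = rel_step * max(abs(args[i]), 1.0)
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def propagate_sd(f, means, sds, cov_bc=0.0):
    """Estimate the SD of a = f(b, c) using equation (2).

    means  -- (b, c) values stored in the database
    sds    -- (SDb, SDc) standard deviations stored in the database
    cov_bc -- covariance of b and c (0.0 if they are uncorrelated)
    """
    da_db = partial(f, means, 0)
    da_dc = partial(f, means, 1)
    var_a = ((sds[0] * da_db) ** 2
             + (sds[1] * da_dc) ** 2
             + 2.0 * cov_bc * da_db * da_dc)
    return math.sqrt(var_a)


# Hypothetical model: a is the product of two stored variables.
def model(b, c):
    return b * c


sd_a = propagate_sd(model, means=(100.0, 50.0), sds=(0.5, 0.2))
```

For this hypothetical model the partial derivatives are simply c and b, so the numerical result can be checked by hand against the square root of (0.5 x 50)^2 + (0.2 x 100)^2.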
Set Theory can be used to process quality
information when a Logical model is used. Such a
model is, for example:
Grazing Suitability 1 arises when
soil class is Hn33 and when
rooting depth is 1.50m to 2.00m
With such a model, error propagation may exploit
Crisp Set Theory. Considering parcel 1255
of TABLE 1, the probability that the soil class is
Hn33 is 0.65. The probability that the rooting
depth is between 1.50m and 2.00m is 0.98. Thus,
assuming the model is perfect (i.e. the
probability of the model holding is 100%),
the probability of the soil polygon being Grazing
Suitability 1 is 0.64 (or 0.65 x 0.98). If the
probability of the model holding is only 80%, then
the probability of the soil polygon being Grazing
Suitability 1 is 51% (or 0.64 x 0.80). This is a
problem of Set Theory Intersection, more fully
described in texts on Probability (e.g.
[BHATTACHARYYA and JOHNSON, 1971]).
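The intersection arithmetic above can be sketched as follows. This is an illustrative sketch only: the function name `joint_probability` is hypothetical, and multiplying the probabilities assumes that the conditions (and the model holding) are independent of one another, which the worked figures imply but the text does not state.

```python
def joint_probability(*probabilities):
    """Probability that all conditions hold together, assuming
    the conditions are independent (set intersection)."""
    result = 1.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
        result *= p
    return result


# Parcel 1255: soil class is Hn33 (0.65), rooting depth in class (0.98).
p_perfect_model = joint_probability(0.65, 0.98)

# Same parcel when the model itself holds with probability 0.80.
p_imperfect_model = joint_probability(0.65, 0.98, 0.80)
```

Rounded to two decimal places, the two results correspond to the 0.64 and 51% quoted in the text.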
The 98% probability that the soil polygon had a
rooting depth in the class 1.50m - 2.00m
was obtained by the technique of estimation by
confidence intervals, which makes certain
assumptions about error, the most significant
being that:
1. error associated with a measurement is normally
distributed about the mean of that measurement;
and,
2. the function describing a normal distribution
of error can be used to determine the
percentage of the total area under that error
distribution curve between any two values of x.
These assumptions lead to FIGURE 1, which shows a
normal distribution of Rooting Depth measurement
error about a mean of 1.80m, when a Standard
Deviation of 0.10m has been achieved. From this
diagram it can be seen that the bottom edge of the
Rooting Depth Class 1.50 - 2.00 is 3xSD below the
mean (1.80m) while the top edge of the class is
2xSD above the mean. This accounts for 98% of the
area under the error distribution curve of FIGURE
1, leading to the estimate that the probability
of a soil polygon, whose mean rooting depth is
1.80m, being in the Rooting Depth Class 1.50-2.00m
is 0.98. Such a computation can be triggered by
the presence of 'F' in a DISn column of a database
table such as TABLE 1, and such a capability is an
essential part of an uncertainty subsystem.
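The class probability described above can be sketched as follows, assuming normally distributed measurement error as stated in assumptions 1 and 2. The function names (`normal_cdf`, `class_probability`) are hypothetical; the numeric values are those given in the text for FIGURE 1.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def class_probability(lower, upper, mean, sd):
    """Area under the normal error curve between the class limits."""
    return normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd)


# Rooting Depth Class 1.50m - 2.00m, mean 1.80m, SD 0.10m:
# the limits lie 3 SD below and 2 SD above the mean.
p_class = class_probability(1.50, 2.00, mean=1.80, sd=0.10)
```

Rounded to two decimal places, `p_class` agrees with the 0.98 quoted in the text.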
Crisp Set Theory uses probabilities which have
been derived through objective and repeatable
procedures. Fuzzy (Sub-) Set Theory, on the other
hand, although also based on probability theory,
has been developed to use Certainty Factors. These
may be probabilities, but (as implemented in some
expert systems) gut feelings, hunches, or other
types of unrepeatable (or non-objective) expertise
are also encouraged as acceptable sources. Certainty
Factors range from 0.0 to 1.0, adopting Kaufmann's
approach [KAUFMANN, 1975]. Using the probabilities
discussed above, but treating them as Certainty
Factors, we have:
Certainty Factor that soil class is Hn33 = 0.65
Certainty Factor that the rooting depth
is in the class 1.50m to 2.00m = 0.98
Certainty Factor of the model holding =
Thus,
being (
This si
Sub-Set
in [KAI
Propos:
Land
comple
or rei
land |
partit.
holdin;
guaran!
certaii
holdin,
includ
suitab
model
3.1 Gr.
The m
1991]
The th
1. soi
2. soi
and
3. top
Figur