In a GIS the link between questions and answers has to
be made, as discussed in chapter 2. Given a question,
the hypotheses are possible answers for which a truth
value or a likelihood has to be determined.
In RS and GIS applications we have to solve classification
problems and problems of estimating the best
parameters of radiometric and geometric models.
Concentrating for the moment on the estimation of the likelihood
of class membership: the hypotheses are about the
class an object belongs to and the evidence is derived
from the data, so P(H|E) -> P(Cl|x). (In a similar way
the parameters of, say, a metric model can be estimated:
P(parameter|x) under a minimum cost / maximum likelihood
criterion.)
3.2 Models for reasoning.
Forward reasoning : usually one starts with a data vector
x followed by the evaluation of the posterior probability
for each class given the value of the data vector. With
equal cost functions for all classes (cost of misclassification)
the minimum cost classification rule is the same
as the maximum a posteriori rule, the MAP rule; with equal
prior probabilities it is also the maximum likelihood rule.
When data are
missing, or when there are too many data, the data
driven approach tends to fail. In such cases it is better to
start with the most probable (a priori) hypothesis and
search for data supporting or negating each of the
hypotheses.
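
The forward (data driven) rule can be sketched as follows. This is a minimal illustration, assuming that the prior P(Class) and the likelihood P(x|Class) are already available for one observed data vector x; the class names and all numbers are invented for the example.

```python
def posteriors(priors, likelihoods):
    """Bayes rule: P(Class|x) = P(x|Class) * P(Class) / P(x)."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # P(x), summed over all classes
    return {c: joint[c] / evidence for c in joint}


def map_class(priors, likelihoods):
    """Return the class with the maximum posterior probability (MAP rule)."""
    post = posteriors(priors, likelihoods)
    return max(post, key=post.get)


# Hypothetical values of P(Class) and P(x|Class) for one observed x.
priors = {"water": 0.2, "forest": 0.5, "urban": 0.3}
likelihoods = {"water": 0.10, "forest": 0.40, "urban": 0.30}

post = posteriors(priors, likelihoods)
best = map_class(priors, likelihoods)
```

With equal misclassification costs, choosing `best` in this way is the minimum cost decision.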
Backward reasoning: starting with a hypothesis, in the
case of one source of data with occasionally missing
data, the expression P(Class|x1,x2,x3,..) can be evaluated
even when one of the vector elements xi is
missing.
3.3 Nominal, GIS data.
The role of data already stored in a model (say a GIS) is
to provide the best possible prior probability for the
class of the object under consideration, P(Class(time)).
The link between P(Class(time-1)|xi) and P(Class(time))
is defined through the Markov and Bayes relations.
Markov : P(Class(time+1)) = function(Class(time),context).
Bayes : P(Class|x) · P(x) = P(x|Class) · P(Class).
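
A minimal sketch of how the two relations can be chained is given below. It assumes that the Markov function can be written as a table of transition probabilities P(Class(time)|Class(time-1)); in the text this function also depends on context, which is left out here. The class names, the transition table and the likelihoods are hypothetical.

```python
def predict_prior(previous, transition):
    """Markov step: P(Class(t)) = sum_j P(Class(t)|Class_j(t-1)) * P(Class_j(t-1))."""
    classes = list(previous)
    return {
        c: sum(transition[j][c] * previous[j] for j in classes)
        for c in classes
    }


def bayes_update(prior, likelihood):
    """Bayes step: P(Class|x) = P(x|Class) * P(Class) / P(x)."""
    joint = {c: likelihood[c] * prior[c] for c in prior}
    evidence = sum(joint.values())  # P(x)
    return {c: joint[c] / evidence for c in joint}


# Hypothetical stored result P(Class(time-1)|x) for one object.
previous = {"forest": 0.8, "clearcut": 0.2}
# Hypothetical transition table: transition[old][new].
transition = {
    "forest": {"forest": 0.9, "clearcut": 0.1},
    "clearcut": {"forest": 0.3, "clearcut": 0.7},
}
# Hypothetical P(x|Class) for the newly observed data.
likelihood = {"forest": 0.2, "clearcut": 0.6}

prior = predict_prior(previous, transition)   # P(Class(time))
posterior = bayes_update(prior, likelihood)   # P(Class(time)|x)
```

The stored posterior of the previous epoch thus becomes, after the Markov step, the prior for the current one.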
3.4 Missing data.
If the components of x = [x1,x2,...] were independent,
then P(x|Class) = P(x1|Class) · P(x2|Class) · ....
Independence of the data components is one of the aims
of feature extraction from data, but it does not solve the
problem of missing data.
The estimation of the interdependence of x1,x2,....
given the class can be done in a non-parametric way,
which is optimal in terms of minimum error, or in a
parametric way, which is minimal in terms of effort for
the human brain.
Within a parametric approach, assuming a Gaussian
distribution of frequency(x1,x2,..,Class=constant), the
parameters of the distribution are mean(x) and the covariance
matrix(x). The MAP decision rule is equivalent
to a minimum Mahalanobis distance rule in an
anisotropic measurement (feature) space.
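
The relation between the two rules can be written out as follows. The symbols \mu_C, \Sigma_C, n and d_C are notation introduced here for the class mean, the class covariance matrix, the number of components of x and the Mahalanobis distance; they do not appear in the text above.

```latex
\[
p(x \mid C) = (2\pi)^{-n/2}\,|\Sigma_C|^{-1/2}
              \exp\!\left(-\tfrac{1}{2}\,d_C^{2}(x)\right),
\qquad
d_C^{2}(x) = (x-\mu_C)^{T}\,\Sigma_C^{-1}\,(x-\mu_C)
\]
\[
-2\ln\bigl[\,p(x \mid C)\,P(C)\,\bigr]
  = d_C^{2}(x) + \ln|\Sigma_C| - 2\ln P(C) + n\ln 2\pi
\]
```

Maximising the posterior is the same as minimising the right-hand side, so the MAP rule reduces to the minimum Mahalanobis distance rule when the term \ln|\Sigma_C| - 2\ln P(C) is the same for all classes.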
The effect of e.g. x2 missing from [x1,x2,x3] is a
projection of a 3 dimensional cluster onto a 2
dimensional subspace. To keep the missing data from
influencing the likelihood for a class, it is sufficient to
substitute mean(x2,class) for x2, for every class.
In the above scheme the data interdependence is
taken care of by the covariance matrix while the
missing data gets a default value per class. The
dependence of the data repair operation on the class
under consideration indicates a backward chaining
mode.
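
The substitution can be sketched as below. This is only an illustration of the repair step as described, using the standard library and a small linear solver; the function names, class labels, means and covariance matrices are hypothetical. A missing component is marked with None and is replaced by the mean of the class under consideration, so that its deviation from that mean is zero.

```python
def solve(a, b):
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (m[i][n] - s) / m[i][i]
    return y


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x to one class.

    Missing components (None) are replaced by the class mean,
    which makes their deviation zero.
    """
    d = [0.0 if xi is None else xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, d)  # y = cov^-1 @ d
    return sum(di * yi for di, yi in zip(d, y))


# Hypothetical class statistics (mean vector and covariance matrix).
classes = {
    "A": {"mean": [1.0, 5.0, 2.0],
          "cov": [[1.0, 0.2, 0.0], [0.2, 1.0, 0.0], [0.0, 0.0, 1.0]]},
    "B": {"mean": [3.0, 0.0, 4.0],
          "cov": [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]},
}

x = [1.0, None, 2.0]  # x2 is missing
dist = {c: mahalanobis_sq(x, p["mean"], p["cov"]) for c, p in classes.items()}
nearest = min(dist, key=dist.get)
```

Because the substituted value differs per class, the repair has to be repeated for every hypothesis. Note that with correlated components the distance obtained in this way need not be identical to the distance computed in the projected subspace.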
The inference procedure is then:
for all possible classes for the object under consideration do:
- look up the prior probability for that object
  and that class -> P(Class)
- evaluate the data vector x; if components are missing
  then replace them with the most likely xi -> max P(xi | Class)
- update P(Class | x), store it in the GIS.
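
The procedure can be sketched as a loop over the hypotheses. To keep the example short it uses discrete lookup tables for P(xi|Class) and treats the components as independent given the class; both are simplifying assumptions made here, not the Gaussian model with a covariance matrix described above. The function name, the table layout, the object identifier and all numbers are hypothetical.

```python
def infer(priors, cond_tables, x):
    """Evaluate P(Class|x) for every class, repairing missing data per class.

    priors      -- {class: P(Class)}, e.g. looked up in the GIS
    cond_tables -- {class: [{value: P(x_i = value | Class)}, ...]}
    x           -- observed data vector, None marks a missing component
    """
    joint = {}
    for cls, prior in priors.items():
        tables = cond_tables[cls]
        # Replace each missing x_i by the most likely value for this class.
        repaired = [
            max(t, key=t.get) if xi is None else xi
            for xi, t in zip(x, tables)
        ]
        # Assumption: components are independent given the class.
        likelihood = 1.0
        for xi, t in zip(repaired, tables):
            likelihood *= t.get(xi, 0.0)
        joint[cls] = prior * likelihood
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("data vector has zero probability under all classes")
    return {cls: j / evidence for cls, j in joint.items()}


# Hypothetical priors and conditional tables for two components.
priors = {"grass": 0.6, "crop": 0.4}
cond_tables = {
    "grass": [{"low": 0.7, "high": 0.3}, {"wet": 0.6, "dry": 0.4}],
    "crop": [{"low": 0.2, "high": 0.8}, {"wet": 0.1, "dry": 0.9}],
}

x = ["high", None]  # second component is missing
posterior = infer(priors, cond_tables, x)

gis = {}  # stand-in for the GIS store
gis["object_17"] = posterior
```

The missing component is filled in differently for each class ("wet" for grass, "dry" for crop), which is the backward chaining element of the procedure.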
3.5 Multiple data sources, multiple models.
With multiple data sources the critical part is in the
feature extraction by model inversion. One of the aims
in feature extraction is to get non-redundant, statistically
independent clusters for the classes. As in reality
most processes are coupled, it is very likely to find