Fig. 3. Paradigm of Bayesian information and knowledge fusion.

… models M₁ and M₂. The paradigm of Bayesian information and knowledge fusion is presented in Fig. 3.
In the second paradigm, we mainly address the fusion of information for the physical characterization of scenes, e.g. the estimation of terrain heights derived jointly from image intensities and SAR interferometric phases (Nicco et al., 1998).
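Schematically, and assuming for illustration that the image intensity I and the interferometric phase φ are conditionally independent given the terrain height h (an assumption added here, not a formulation taken from the cited work), such a fusion combines the two likelihoods with a prior terrain model:

p(h \mid I, \phi) \propto p(I \mid h)\, p(\phi \mid h)\, p(h)

The posterior p(h | I, φ) then supports, for example, a maximum a posteriori estimate of the heights.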
3. STOCHASTIC MODELS
A “universal” model for stochastic processes is not mathematically tractable. In applications, we are faced either with simple cases, where the data is precisely described by a low-complexity stochastic model, as in laser speckle images, or with data of high complexity, for which we use an approximate model. A challenging research task is to find a “quasi-complete” family of models for a certain class of signals, for example all images provided by one sensor.
In this spirit, we concentrate on the following stochastic models: Gibbs random fields, multidimensional stochastic processes, and cluster-based probability models.
3.1. Gibbs random fields
In many situations signals satisfy predetermined constraints. We can restrict the modelling of these signals by considering only probability measures that fulfil these constraints. Here, we have to choose the appropriate probability measure among those satisfying the set of constraints. Applying a maximum uncertainty principle, the chosen measure should be the one that, while satisfying all relevant constraints, maximizes our uncertainty about what is unknown to us. The probability measure p(x) resulting from such a principle is a Gibbs distribution:
p(x) = \frac{1}{Z} \exp\left( -\frac{H(x)}{T} \right)

H(x) = \sum_{\text{all cliques}} V_{\text{clique}}(x)

\theta = \{ \alpha_0, \alpha_1, \ldots, \alpha_4, \beta_1, \ldots, \beta_4, \gamma_0 \}     (7)
where α₀, …, α₄, β₁, …, β₄, γ₀ are the model parameters associated with the corresponding cliques, V_clique is the potential function characterizing the interaction between the samples of the random field inside a clique, H is the energy function for the corresponding neighbourhood, Z is the partition function, and T is the temperature (Datcu et al., 1998; Geman and Geman, 1984; Schroder et al., 1998).

Images and other multidimensional signals satisfy the local requirement that neighbouring sites have related intensities. On the other hand, a model should also be able to represent long-range interactions, which is a global requirement. Gibbs random fields are able to satisfy both requirements.

The equivalence of Gibbs and Markov random fields provides the appropriate mathematical techniques to treat both behaviours. A pragmatic problem is to optimally fit a Gibbs random field model to real data. One would expect a maximum likelihood estimate to give the desired result, but this is generally not possible because it requires evaluating the partition function.

Several alternative solutions have been proposed, such as the coding method and maximum pseudo-likelihood, but none of these is efficient. Recently, a consistent solution for the maximum likelihood was introduced: Monte Carlo maximum likelihood. This algorithm requires no direct evaluation of the partition function; it is consistent and converges to the maximum likelihood estimate with probability one (Cowles and Carlin, 1996; Li, 1995).
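To make the role of the local conditionals concrete, the following minimal sketch (our illustration, not an algorithm given in this paper) simulates a binary pairwise Gibbs random field with a Gibbs sampler; the single interaction parameter beta stands in for the clique parameters of Eq. (7), and the partition function Z is never evaluated because it cancels in the local conditional probabilities.

import numpy as np

def gibbs_sample_grf(shape=(64, 64), beta=0.7, n_sweeps=50, seed=None):
    """Simulate a binary (Ising-like) Gibbs random field by Gibbs sampling.

    The local conditional at one site depends only on the pairwise
    cliques touching that site, so the partition function Z cancels
    and never has to be evaluated. beta is a stand-in for the clique
    parameters of Eq. (7) (an illustrative simplification).
    """
    rng = np.random.default_rng(seed)
    x = rng.choice(np.array([-1, 1]), size=shape)
    rows, cols = shape
    for _ in range(n_sweeps):
        for i in range(rows):
            for j in range(cols):
                # Sum over the 4-neighbourhood (periodic boundaries).
                s = (x[(i - 1) % rows, j] + x[(i + 1) % rows, j]
                     + x[i, (j - 1) % cols] + x[i, (j + 1) % cols])
                # P(x_ij = +1 | neighbours) = 1 / (1 + exp(-2*beta*s)).
                if rng.random() < 1.0 / (1.0 + np.exp(-2.0 * beta * s)):
                    x[i, j] = 1
                else:
                    x[i, j] = -1
    return x

field = gibbs_sample_grf(beta=0.7, seed=0)   # one realization of the field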
3.2. Cluster based probability models
Non-parametric modelling allows more flexibility, but results in more complex algorithms and requires large training datasets. From the class of non-parametric models, the kernel estimate has lately played an important role in signal processing. Kernel estimation works under the hypothesis of a smooth probability density function, using a generalization of the training dataset.
p(x) = \frac{1}{N} \sum_{n=1}^{N} k(x - x_n)     (8)
where p is the probability density function, k is a kernel, and x is the data vector. The kernel radiates probability from each vector x_n in the learning sample, and N is the number of kernels, one per training vector. In this way the learning sample is generalized.
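As a minimal sketch of Eq. (8), assuming an isotropic Gaussian kernel with bandwidth h (both the kernel choice and the bandwidth are our illustrative assumptions):

import numpy as np

def kernel_density(x, samples, h=1.0):
    """Evaluate Eq. (8): p(x) = (1/N) * sum_n k(x - x_n).

    samples: (N, d) array of training vectors x_n; x: (d,) query point.
    Uses an isotropic Gaussian kernel of bandwidth h (an assumption).
    """
    n, d = samples.shape
    diff = x - samples                           # (N, d) differences x - x_n
    sq = np.sum(diff ** 2, axis=1) / h ** 2      # squared scaled distances
    norm = (2.0 * np.pi * h ** 2) ** (-d / 2.0)  # Gaussian normalization
    return np.mean(norm * np.exp(-0.5 * sq))

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 2))                # learning sample
print(kernel_density(np.zeros(2), train, h=0.5))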
A combined technique that uses the generalization property of kernel estimation and the summarizing behaviour of cluster analysis has been proposed. The technique requires clustering the training data and fitting separable Gaussians to each of the resulting regions.
p(x) = \sum_{m=1}^{N} w_m \prod_{i=1}^{d} k_{m,i}(x_i)     (9)

where w_m is a weight given by the number of points in cluster m, N is the number of clusters (centres of action), and d is the dimension of the data vectors.
Cluster-based estimation first finds the centres of action (clustering) and then uses a single kernel in each cell. This method is successful for treating high-dimensional data, is able to capture high-order non-linear relationships, can be applied in a multiscale algorithm, and gives a good representation of the tails of the distribution (Popat and Picard, 1997).
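A hedged sketch of this cluster-based estimate, using plain k-means clustering followed by one separable (diagonal-covariance) Gaussian per cluster; the clustering routine, the number of clusters, and the variance floor are our illustrative choices, not specified by the paper:

import numpy as np

def fit_cluster_model(data, n_clusters=8, n_iter=20, seed=None):
    """Fit Eq. (9): weights w_m and one separable Gaussian per cluster."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):                   # plain k-means iterations
        d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)            # nearest centre of action
        for m in range(n_clusters):
            pts = data[labels == m]
            if len(pts):
                centers[m] = pts.mean(axis=0)
    w = np.bincount(labels, minlength=n_clusters) / len(data)  # weights w_m
    sigma = np.ones_like(centers)             # per-cluster, per-dimension std
    for m in range(n_clusters):
        pts = data[labels == m]
        if len(pts) > 1:
            sigma[m] = pts.std(axis=0) + 1e-6  # variance floor (assumption)
    return w, centers, sigma

def cluster_density(x, w, mu, sigma):
    """Evaluate Eq. (9): p(x) = sum_m w_m * prod_i N(x_i; mu_mi, sigma_mi^2)."""
    z = (x - mu) / sigma
    per_dim = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return float(np.sum(w * per_dim.prod(axis=1)))

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 2))
w, mu, sigma = fit_cluster_model(data, n_clusters=4, seed=0)
print(cluster_density(np.zeros(2), w, mu, sigma))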
3.3. Stochastic pyramids
The wavelet transformation of images in its operator formalism
suggests the decomposition of the signal into two components: