International Archives of Photogrammetry and Remote Sensing, Vol. 32, Part 7-4-3 W6, Valladolid, Spain, 3-4 June, 1999

Fig. 3. Paradigm of Bayesian information and knowledge fusion.

models M1 and M2. The paradigm of Bayesian information and knowledge fusion is presented in Fig. 3.

In the second paradigm, we mainly address the fusion of information for the physical characterization of scenes, e.g. the estimation of terrain heights derived jointly from image intensities and SAR interferometric phases (Nicco et al., 1998).

Here Z denotes the partition function and T the temperature of the Gibbs distribution (Datcu et al., 1998; Geman and Geman, 1984; Schroder et al., 1998).

Images and other multidimensional signals satisfy the local requirement that neighbouring sites have related intensities. On the other hand, a model should also be able to represent long-range interactions; this is a global requirement. Gibbs random fields are able to satisfy both requirements.

The equivalence of Gibbs and Markov random fields provides the appropriate mathematical techniques to treat both behaviours. A pragmatic problem is to optimally fit a Gibbs random field model to real data. One would expect a maximum likelihood estimate to give the desired result, but this is not generally feasible because it requires the evaluation of the partition function.

Several alternative solutions have been proposed, notably the coding method and the maximum pseudo-likelihood estimate; however, none of these is efficient. Recently, a consistent solution for the maximum likelihood was introduced: Monte Carlo maximum likelihood. This algorithm requires no direct evaluation of the partition function, is consistent, and converges to the maximum likelihood estimate with probability one (Cowles and Carlin, 1996; Li, 1995).
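As an illustrative sketch only (not the implementation referred to above), maximum pseudo-likelihood estimation for a first-order Ising-type Gibbs field can be written as follows. The lattice size, number of sweeps, "true" parameter value and grid search are invented for the example; the point is that each site's conditional likelihood is evaluated given its neighbours, so the partition function never appears.

```python
import numpy as np

rng = np.random.default_rng(0)

def neigh_sum(x, i, j):
    # Sum of the 4-neighbourhood of site (i, j), free boundary conditions.
    n, m = x.shape
    s = 0
    if i > 0:
        s += x[i - 1, j]
    if i < n - 1:
        s += x[i + 1, j]
    if j > 0:
        s += x[i, j - 1]
    if j < m - 1:
        s += x[i, j + 1]
    return s

def gibbs_sample(beta, shape=(16, 16), sweeps=200):
    # Draw a sample from the Ising-type Gibbs field by site-wise Gibbs sampling.
    x = rng.choice(np.array([-1, 1]), size=shape)
    for _ in range(sweeps):
        for i in range(shape[0]):
            for j in range(shape[1]):
                s = neigh_sum(x, i, j)
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * s))
                x[i, j] = 1 if rng.random() < p_plus else -1
    return x

def neg_pseudo_loglik(beta, x):
    # Negative log pseudo-likelihood: sum over sites of -log P(x_ij | neighbours).
    # Each conditional is a two-state distribution, so Z is never evaluated.
    total = 0.0
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            s = neigh_sum(x, i, j)
            total += np.log(1.0 + np.exp(-2.0 * beta * x[i, j] * s))
    return total

x = gibbs_sample(beta=0.3)                      # field with "true" parameter 0.3
betas = np.linspace(0.0, 1.0, 101)              # crude grid search, no optimizer
beta_hat = betas[np.argmin([neg_pseudo_loglik(b, x) for b in betas])]
```

A grid search is used here only to keep the sketch self-contained; in practice the one-dimensional pseudo-likelihood would be maximized with a standard numerical optimizer.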

3. STOCHASTIC MODELS

A “universal” model for stochastic processes is not mathematically tractable. In applications, we face either simple cases, where the data is precisely described by a low-complexity stochastic model, as in laser speckle images, or data of high complexity, for which we must use an approximate model. A challenging research task is to find a “quasi-complete” family of models for a certain class of signals, for example all images provided by one sensor.

In this spirit, we concentrate on the following stochastic models: Gibbs random fields, multidimensional stochastic processes, and cluster-based probability modelling.

3.1. Gibbs random fields

In many situations signals satisfy predetermined constraints. We can restrict the modelling of these signals by considering only probability measures that fulfil these constraints. Here, we have to choose the appropriate probability measure: the one satisfying the set of constraints. Applying a maximum uncertainty principle, the probability measure that satisfies all relevant constraints should be the one that maximizes our uncertainty about what is unknown to us. The probability measure P(x) resulting from such a principle is a Gibbs distribution:

P(x) = \frac{1}{Z} e^{-H(x)/T}

H(x) = \sum_{\text{all cliques}} V_{\text{clique}}(x)

\theta = \{ \alpha_0, \alpha_1, \ldots, \alpha_4, \beta_1, \ldots, \beta_4, \gamma_0 \}    (7)
where α_0, …, α_4, β_1, …, β_4, γ_0 are the model parameters associated with the corresponding cliques, V is the potential function characterizing the interaction between the samples of the random field inside a clique, and H represents the energy function for the corresponding neighbourhood.
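To make the role of the partition function concrete, a brute-force evaluation of the Gibbs distribution above for a tiny binary field is sketched below; the 3×3 lattice, the pair-clique potential and the parameter value are invented for the illustration. Even at this size, Z already sums over 2^9 = 512 configurations, which shows why direct evaluation is hopeless for image-sized fields.

```python
import itertools
import numpy as np

T = 1.0          # temperature
beta = 0.5       # illustrative pair-clique parameter (invented)

def energy(x):
    # H(x): sum over horizontal and vertical pair cliques of V = -beta * x_s * x_t
    h = -beta * np.sum(x[:, :-1] * x[:, 1:])
    h += -beta * np.sum(x[:-1, :] * x[1:, :])
    return h

# all 2**9 = 512 configurations of a 3x3 binary field with values in {-1, +1}
states = [np.array(s).reshape(3, 3) for s in itertools.product([-1, 1], repeat=9)]

Z = sum(np.exp(-energy(x) / T) for x in states)       # the partition function
p = [np.exp(-energy(x) / T) / Z for x in states]      # Gibbs probabilities
```

For a 256 × 256 binary image the same sum would have 2^65536 terms, which is why sampling-based estimators such as Monte Carlo maximum likelihood are needed.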
3.2. Cluster based probability models

Non-parametric modelling allows more flexibility, but results in more complex algorithms and requires large training datasets. Among the non-parametric models, the kernel estimate has recently played an important role in signal processing. Kernel estimation assumes smooth probability density functions and works by generalizing the training dataset:

p(X) = \frac{1}{N} \sum_{n=1}^{N} k(X - X_n)    (8)

where p is the probability density function, k a kernel, and X the data vector. The kernel k radiates probability from each vector in the learning sample, and N is the number of kernels, one per training vector; in this way the learning sample is generalized.
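A minimal sketch of the kernel estimate above, assuming a Gaussian kernel; the bandwidth h and the learning sample are invented for the example.

```python
import numpy as np

def kernel_density(X, samples, h=0.3):
    # p(X) = (1/N) * sum_n k(X - X_n) with an isotropic Gaussian kernel
    # of bandwidth h (the kernel choice and h are invented for the example).
    samples = np.asarray(samples, dtype=float)
    N, d = samples.shape
    diff = (X - samples) / h
    k = np.exp(-0.5 * np.sum(diff ** 2, axis=1))      # one kernel per sample
    k /= (h * np.sqrt(2.0 * np.pi)) ** d              # normalize each kernel
    return k.sum() / N

rng = np.random.default_rng(1)
train = rng.normal(0.0, 1.0, size=(500, 1))           # learning sample
p_mode = kernel_density(np.array([0.0]), train)       # density near the mode
p_tail = kernel_density(np.array([3.5]), train)       # density in the tail
```

Because every training vector radiates probability, the estimate is strictly positive everywhere, but far from the learning sample it decays at the rate set by the kernel bandwidth.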

A combined technique that uses the generalization property of kernel estimation and the summarizing behaviour of cluster analysis was proposed. The technique requires clustering of the training data and fitting separable Gaussians to each of the resulting regions:

p(X) = \sum_{m=1}^{N} w_m \prod_{i=1}^{d} k_{m,i}(x_i)    (9)

where w_m is a measure of the number of training points in cluster m, N the number of centres of action (clusters), and d the dimension of the data vector.

Cluster-based estimation first finds the centres of action by clustering, and then uses a single kernel per cell. This method is successful for treating high-dimensional data, is able to capture high-order non-linear relationships, can be applied in a multiscale algorithm, and gives a good representation of the tails of the distribution (Popat and Picard, 1997).
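The two steps — clustering to find the centres of action, then fitting a separable (diagonal-covariance) Gaussian per cluster weighted by the cluster population, as in the mixture formula above — can be sketched as follows. The toy data, the number of clusters and the deterministic initialization are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# invented toy training data: two well-separated 2-D blobs
data = np.vstack([rng.normal(-2.0, 0.5, size=(200, 2)),
                  rng.normal(+2.0, 0.5, size=(200, 2))])

def fit_kmeans(data, centres, iters=20):
    # minimal k-means: finds the "centres of action" of the clustering step
    for _ in range(iters):
        labels = np.argmin(((data[:, None, :] - centres[None]) ** 2).sum(-1), axis=1)
        centres = np.array([data[labels == m].mean(axis=0)
                            for m in range(len(centres))])
    labels = np.argmin(((data[:, None, :] - centres[None]) ** 2).sum(-1), axis=1)
    return centres, labels

# deterministic initialization: one point from each blob (for reproducibility)
centres, labels = fit_kmeans(data, data[[0, 200]].copy())

# mixture weight w_m = fraction of training points in cluster m;
# each cluster gets a separable (diagonal-covariance) Gaussian
weights = np.array([(labels == m).mean() for m in range(2)])
stds = np.array([data[labels == m].std(axis=0) for m in range(2)])

def density(x):
    # p(x) = sum_m w_m * prod_i N(x_i; mu_mi, sigma_mi)
    p = 0.0
    for m in range(2):
        comp = np.prod(np.exp(-0.5 * ((x - centres[m]) / stds[m]) ** 2)
                       / (stds[m] * np.sqrt(2.0 * np.pi)))
        p += weights[m] * comp
    return p
```

The diagonal covariance makes each component a product of one-dimensional kernels, which is what keeps the model tractable in high dimensions.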

3.3. Stochastic pyramids

The wavelet transformation of images in its operator formalism suggests the decomposition of the signal into two components: