anbul 2004
1991). The
likelihood
fication by
nes. These
ted to the
ntire scene
ost alike. In
on of the
be normal
y, Is drawn
robability to
Lo
class xs
assign one
en given as
(2)
by a "salt
on of some
then leads
f remotely
ated, both
of energy
ally occur
f a pixel.
efficiency
Contextual
ie. When a
incomplete
vever, the
complete
kinds of
| temporal
hy in non
s of two
le, if (i. 7)
belongs to
n, n) also
r a pixel is
also on all
4, j). Non
MAP and
on MAP
H
to a site
f, but also
| is to find
d on the
) assigned
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B7. Istanbul 2004
into one of M classes; that is, x, = Uu. 2... M where M is the
number of classes assumed to be known in supervised
classification process. This optimisation is executed from the view
point of the maximum a posterior (MAP) estimation as follows:
Am = argmax P(X]Y)| ©
NeQ
Where Q is labelled configurations set. Following Bayes theorem,
equation (3) becomes:
JP(Y/X )P(X) (4)
= argmax | P(Y)
XeQ
X
MAP
The modelling of both class conditional distribution P(Y/X) and
prior distribution P(X) becomes an essential task (Khedam ef al.,
2001). P(Y) is the probability distribution of the observed data and
doesn't depend on the labelling X. Note that the estimate (4)
becomes the pixel-wise non—contextual (punctual) classifier if the
prior probability doesn't have any consequence in formulating (4).
P(Y/X) is the conditional probability distribution of the observation
Y given the labelling X. A commonly used model for P(Y/X) is that
the feature vector observed Y, is drawn from a "gaussian
distribution". For a Markov Random Field (MRF) of field
labelled X and so, according to the Hammerslay-Clifford theory,
P(X) can be expressed as a Gibbs distribution with “Potts model”
as energy function model. The global MAP estimate given by
equation (3) is equivalent to the minimisation of the followed a
posterior global energy function (khedam ef al., 2002):
Xuw=argmin{ U (X/Y) } (5)
Xe U
Where:
UV) Si E. dd =
se.
I
+38 3 (1-8 (x30) ©
q€Q reVs
Where x and S are statistic parameters of class xs
ph.
estimated during training step process and B is parameter
regularisation and is frequently user specified. @ is Kroeneker
function calculated on the neighbourhood Vs of site s on all clique
q from set clique Q (figure 1). Neighbourhood system Vs can be
4-connexivity given by equation 7 or 8-connexivity given by
equation 8.
Viel «pe s. o«i-Xy«G-Ysi j à
2
t D e S, o«i- by G-Iys2 ] €
4-Connexivi Order ! 9 i)
Oo
Pe: Order 2 1
2e
o (k, 1)
8-Connexivity !
o © 0 Order 3 c ...
O @ o
Order 4 LL) ...
OQ Oo o
Figure 1. Neighbourhood and cliques
Once MAP classification problem is formulated as an energy
minimisation problem, it can be solved by an optimisation
algorithm. Among the most effective algorithms for optimisation in
the framework of image MRF modelling are Simulated Annealing
(SA) (Geman et al., 1984) whose the computational demands are
well known and Iterated Conditional Modes (ICM) (Besag, 1986)
which is a computationally feasible alternative of the SA with a
local minimum convergence of the energy function. To use ICM
algorithm, global minimisation energy function of equation (6)
must be transformed on the followed local minimisation energy
function under the assumption of the independence of conditional
probabilities:
U (x.J/v» is do -i Xs y. S ; (y.-i Xs ) nd Is
ij
+3 3 (7-3 (x«,x))
re Vs
The first term on the right hand of (9) called data attach term, can
be consider as the potential function of one-order clique and is
often used to provide an initial configuration for the contextual
classification process. The second term called regularisation term,
defines pair-cliques potential function which explicitly describes
local spatial interactions in neighbourhood Vs . ICM flow chart is
presented on figure 2.It can be resumed on five steps as follows:
Step 1: Estimate statistic parameters set ( 4« , 22s ) from the
training samples of each class from classes set A.
Step 2: Based on ( xs , Zs ), estimate an initial classification using
the non-contextual pixe-wise maximum likelihood decision rule.
We use first term of equation (9).
Step 3: Choose an appropriate value of D, an appropriate shape
and size of neighbourhood system Vs and an appropriate
convergence criterion.
Step 4: Perform the local minimisation defined by equation (9) at
each pixel in specified order: update y, by the class x, that
minimises equation (9).
Step 5: Repeat step (3) until convergence. Iterative algorithms
often pose convergence problem. Convergence criterion which
we have adopted in this study is a zero number of pixels changing
classes between two consecutive iterations. This number of pixels
is calculated on the whole image and thus for all classes. We