proper energy function. In the following, we will first introduce the method adopted for modelling both the class-conditional densities and the joint prior probability; then, we will present the iterative strategy for identifying the sets of labels $C^1$, $C^2$ maximizing the likelihood $p(X^1, X^2 \mid C^1, C^2)$.
3.1 Conditional Density Modelling
Computing $p(X^t \mid C^t)$, $t = 1, 2$, requires at the two considered dates the estimation of the class-conditional densities $p(x_{ij}^t \mid \omega_{int})$ and $p(x_{ij}^t \mid \omega_{unk})$, $\forall x_{ij}^t \in X^t$. The proposed approach is based on the observation that, since $\omega_{int}$ and $\omega_{unk}$ partition the set of possible labels, by total probability the PDF of each pixel can be modelled as a mixture of the conditional PDFs of both the interest and unknown classes:

$p(x_{ij}^t) = P(\omega_{int}) \cdot p(x_{ij}^t \mid \omega_{int}) + P(\omega_{unk}) \cdot p(x_{ij}^t \mid \omega_{unk})$    (3)
According to Parzen density estimation (Duda et al., 2000), we aim at obtaining a reliable nonparametric estimate of $p(x_{ij}^t)$ as a mixture of a suitable set of kernel functions $\Phi^t = \{\phi_k^t(\cdot)\}_{k=1}^{K}$:

$p(x_{ij}^t) = \sum_{k=1}^{K} w_k^t \cdot \phi_k^t(x_{ij}^t)$    (4)

where $K$ denotes the number of kernels (a free parameter to be set by the user), and $W^t = \{w_k^t\}_{k=1}^{K}$ represents the weights regulating the contribution of each kernel.
However, since the density $p(x_{ij}^t)$ is given by a linear combination of $p(x_{ij}^t \mid \omega_{int})$ and $p(x_{ij}^t \mid \omega_{unk})$, it is worth noting that, if the set of kernels $\Phi^t$ allows obtaining a reliable estimate of $p(x_{ij}^t)$, then both the class-conditional densities can also be reliably approximated as linear combinations of $\Phi^t$. Hence, they can be estimated as:

$p(x_{ij}^t \mid \omega_{int}) = \sum_{k=1}^{K} w_{int,k}^t \cdot \phi_k^t(x_{ij}^t)$    (5)

$p(x_{ij}^t \mid \omega_{unk}) = \sum_{k=1}^{K} w_{unk,k}^t \cdot \phi_k^t(x_{ij}^t)$    (6)

where $W_{int}^t = \{w_{int,k}^t\}_{k=1}^{K}$ and $W_{unk}^t = \{w_{unk,k}^t\}_{k=1}^{K}$ are the weights regulating the contribution of each kernel for the interest and unknown classes, respectively.
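A consistency relation follows from this construction (our derivation, not stated in the text: substitute (5) and (6) into (3) and match kernel coefficients, assuming the kernels are linearly independent):

```latex
% Relation between the global and class-specific kernel weights,
% obtained by substituting (5)-(6) into (3):
w_k^t = P(\omega_{int}) \, w_{int,k}^t + P(\omega_{unk}) \, w_{unk,k}^t ,
\qquad k = 1, \dots, K
```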
As commonly done in the literature, we consider normalized isotropic Gaussian kernels, i.e.

$\phi_k^t(x_{ij}^t) = \dfrac{1}{\sqrt{2\pi \sigma_k^{2t}}} \exp\!\left( -\dfrac{\left\| x_{ij}^t - \mu_k^t \right\|^2}{2\sigma_k^{2t}} \right)$    (7)

where $\mu_k^t$ is the centre and $\sigma_k^{2t}$ the variance of the $k$-th kernel (the latter tunes the smoothness of the estimate).
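To make (4)-(7) concrete, the following is a minimal sketch (ours, not the authors' implementation) of how the shared kernels and the resulting densities can be evaluated; the helper names `gaussian_kernels` and `mixture_density`, and the restriction to scalar pixel features, are illustrative assumptions.

```python
import numpy as np

def gaussian_kernels(x, centres, variances):
    """Eq. (7): evaluate the K normalized isotropic Gaussian kernels
    at every sample in x; returns an (N, K) matrix."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)            # (N, 1)
    mu = np.asarray(centres, dtype=float).reshape(1, -1)     # (1, K)
    var = np.asarray(variances, dtype=float).reshape(1, -1)  # (1, K)
    return np.exp(-((x - mu) ** 2) / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def mixture_density(x, weights, centres, variances):
    """Eq. (4): p(x) as a weighted sum of the shared kernels. Called with
    the class-specific weights instead, it yields Eqs. (5) and (6)."""
    return gaussian_kernels(x, centres, variances) @ np.asarray(weights, dtype=float)
```

Since (5) and (6) reuse the kernels of (4) and differ only in their weights, a single evaluation of the kernel matrix serves all three densities.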
In the following, we describe in detail the procedures adopted for estimating $p(x_{ij}^t)$, $p(x_{ij}^t \mid \omega_{int})$ and $p(x_{ij}^t \mid \omega_{unk})$, respectively.

3.1.1 Estimation of $p(x_{ij}^t)$
For computing both the centres $M^t = \{\mu_k^t\}_{k=1}^{K}$ and variances $\Sigma^t = \{\sigma_k^{2t}\}_{k=1}^{K}$ of all the kernels, as well as the set of weights $W^t$ defining $p(x_{ij}^t)$, we employ the EM algorithm over all the pixels of $X^t$. EM allows determining the maximum likelihood (ML) estimator of the parameters characterizing a certain distribution in the presence of incomplete observations (Dempster et al., 1977). Indeed, our objective is to identify the ML estimate of the set of parameters $\theta^t = \{M^t, \Sigma^t, W^t\} = \{\mu_k^t, \sigma_k^{2t}, w_k^t\}_{k=1}^{K}$ that maximizes the log-likelihood of $X^t$, i.e.
$\ln L(\theta^t) = \ln\!\left[ p(X^t \mid \theta^t) \right] = \sum_{i,j} \ln\!\left[ p(x_{ij}^t) \right]$    (8)

At each iteration $l$, the set of estimated parameters $[\theta^t]^{(l)}$ provides an increase in the log-likelihood until a local maximum is reached, i.e. $\ln L([\theta^t]^{(l+1)}) \geq \ln L([\theta^t]^{(l)})$.
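Continuing the sketch above (same hypothetical helpers), the quantity that EM monotonically increases is simply the sum of log-densities of (8):

```python
def log_likelihood(x, weights, centres, variances):
    """Eq. (8): ln L(theta) = sum over all pixels of ln p(x_ij).
    A small floor avoids log(0) for samples far from every kernel."""
    p = mixture_density(x, weights, centres, variances)
    return float(np.sum(np.log(np.maximum(p, 1e-300))))
```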
For simplicity, weights are initially set to $1/K$, whereas kernel parameters are initialized using the k-means clustering algorithm (Bruzzone and Fernandez-Prieto, 1999), fixing the number of clusters equal to $K$. In particular, centres and variances of the Gaussians are initialized to the centres and variances of the resulting clusters. Then, according to Dempster et al., 1977, the updated estimates of the unknown parameters are given by:
$[w_k^t]^{(l+1)} = \dfrac{1}{N} \sum_{i,j} \dfrac{[w_k^t]^{(l)} \, [\phi_k^t(x_{ij}^t)]^{(l)}}{[p(x_{ij}^t)]^{(l)}}$    (9)

$[\mu_k^t]^{(l+1)} = \dfrac{ \sum_{i,j} \dfrac{[w_k^t]^{(l)} \, [\phi_k^t(x_{ij}^t)]^{(l)}}{[p(x_{ij}^t)]^{(l)}} \, x_{ij}^t }{ \sum_{i,j} \dfrac{[w_k^t]^{(l)} \, [\phi_k^t(x_{ij}^t)]^{(l)}}{[p(x_{ij}^t)]^{(l)}} }$    (10)
$[\sigma_k^{2t}]^{(l+1)} = \dfrac{ \sum_{i,j} \dfrac{[w_k^t]^{(l)} \, [\phi_k^t(x_{ij}^t)]^{(l)}}{[p(x_{ij}^t)]^{(l)}} \, \left\| x_{ij}^t - [\mu_k^t]^{(l+1)} \right\|^2 }{ \sum_{i,j} \dfrac{[w_k^t]^{(l)} \, [\phi_k^t(x_{ij}^t)]^{(l)}}{[p(x_{ij}^t)]^{(l)}} }$    (11)

where $N$ is the total number of pixels in $X^t$.
Reasonably, we assume that convergence is reached when the relative increase in the log-likelihood is lower than a prefixed threshold $\varepsilon$.
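Assembling the steps of this subsection, the following is a hedged end-to-end sketch (ours, not the authors' code): weights start at $1/K$, centres and variances come from a plain Lloyd-style k-means standing in for the cited algorithm, the M-step applies (9)-(11), and iteration stops when the relative log-likelihood increase drops below $\varepsilon$. Scalar features are assumed, and the helpers are those of the earlier sketches.

```python
def fit_parzen_em(x, K, eps=1e-4, max_iter=200, seed=0):
    """EM estimation of the weights, centres and variances of Eq. (4)."""
    x = np.asarray(x, dtype=float).ravel()
    N = x.size
    rng = np.random.default_rng(seed)

    # Initialization: a few Lloyd iterations of k-means on the samples.
    centres = rng.choice(x, size=K, replace=False)
    for _ in range(20):
        labels = np.argmin((x[:, None] - centres[None, :]) ** 2, axis=1)
        for k in range(K):
            if np.any(labels == k):
                centres[k] = x[labels == k].mean()
    variances = np.array([x[labels == k].var() if np.any(labels == k)
                          else x.var() for k in range(K)])
    variances = np.maximum(variances, 1e-6)   # guard against degenerate kernels
    weights = np.full(K, 1.0 / K)             # weights initially set to 1/K

    ll_old = log_likelihood(x, weights, centres, variances)
    for _ in range(max_iter):
        # E-step: responsibilities w_k * phi_k(x) / p(x), shared by (9)-(11).
        resp = gaussian_kernels(x, centres, variances) * weights        # (N, K)
        resp /= np.maximum(resp.sum(axis=1, keepdims=True), 1e-300)

        # M-step: Eqs. (9), (10) and (11).
        Nk = resp.sum(axis=0)
        weights = Nk / N                                                # (9)
        centres = (resp * x[:, None]).sum(axis=0) / Nk                  # (10)
        variances = (resp * (x[:, None] - centres) ** 2).sum(axis=0) / Nk  # (11)
        variances = np.maximum(variances, 1e-6)

        # Stop on a relative log-likelihood increase below eps.
        ll_new = log_likelihood(x, weights, centres, variances)
        if abs(ll_new - ll_old) <= eps * abs(ll_old):
            break
        ll_old = ll_new
    return weights, centres, variances
```

A call such as `w, mu, var = fit_parzen_em(band.ravel(), K=10)` (with `band` a hypothetical image array) yields the parameters that also fix $\Phi^t$ for the next subsection.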
3.1.2 Estimation of $p(x_{ij}^t \mid \omega_{int})$
Once $M^t$ and $\Sigma^t$ have been determined (and hence the set of kernels $\Phi^t$ properly defined), we exploit the available ground truth for the class of interest $\omega_{int}$ at each date to derive the estimate of the corresponding conditional density $p(x_{ij}^t \mid \omega_{int})$. In particular, the set of weights $W_{int}^t$ associated with $\omega_{int}$ is determined using again the EM algorithm, but solely on the available training samples $T^t = \{x_{ij}^t \in X^t \mid y_{ij}^t = \omega_{int}\}$, $|T^t| = N^t$, where $y_{ij}^t$ denotes the true label of the pixel at position $(i, j)$. Weights are initialized to $1/K$, and then updated (according to Dempster et al., 1977) using the following equation:
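Assuming the referenced update mirrors (9) restricted to the training set $T^t$, with the kernels $\Phi^t$ held fixed (an assumption on our part, not the authors' stated formula), a weight-only sketch reusing the earlier hypothetical helpers reads:

```python
def fit_class_weights(x_train, centres, variances, eps=1e-4, max_iter=200):
    """Weight-only EM on the training samples T^t: centres and variances
    stay fixed; only the class-specific weights of Eq. (5) are updated.
    Assumes the update is Eq. (9) restricted to T^t (our assumption)."""
    x_train = np.asarray(x_train, dtype=float).ravel()
    Nt = x_train.size
    phi = gaussian_kernels(x_train, centres, variances)  # fixed throughout
    K = phi.shape[1]
    w = np.full(K, 1.0 / K)                              # initialized to 1/K
    ll_old = float(np.sum(np.log(np.maximum(phi @ w, 1e-300))))
    for _ in range(max_iter):
        resp = phi * w
        resp /= np.maximum(resp.sum(axis=1, keepdims=True), 1e-300)
        w = resp.sum(axis=0) / Nt                        # weight-only M-step
        ll_new = float(np.sum(np.log(np.maximum(phi @ w, 1e-300))))
        if abs(ll_new - ll_old) <= eps * abs(ll_old):
            break
        ll_old = ll_new
    return w
```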