gorithm [Hart, 1968]. As adaptive learning algorithms we have chosen DSM [Geva & Sitte, 1991] (Decision Surface Mapping) and LVQ-1 [Kohonen, 1990] (Learning Vector Quantization, version 1). The values of the parameters involved in LVQ-1 learning have been estimated by using two algorithms proposed by the authors [Cortijo & Pérez de la Blanca, 1996b].
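As an illustration, the standard LVQ-1 update rule [Kohonen, 1990] can be sketched as follows; the linearly decaying learning-rate schedule and the function names are illustrative assumptions, not the parameter estimates of [Cortijo & Pérez de la Blanca, 1996b]:

    import numpy as np

    def lvq1(codebook, codebook_labels, samples, sample_labels,
             alpha0=0.03, epochs=5):
        # LVQ-1 [Kohonen, 1990]: move the winning codebook vector toward
        # a sample of the same class, away from it otherwise.
        m = codebook.copy()
        n_steps = epochs * len(samples)
        t = 0
        for _ in range(epochs):
            for x, y in zip(samples, sample_labels):
                alpha = alpha0 * (1.0 - t / n_steps)  # decaying rate (assumed schedule)
                c = np.argmin(np.linalg.norm(m - x, axis=1))  # nearest prototype
                sign = 1.0 if codebook_labels[c] == y else -1.0
                m[c] += sign * alpha * (x - m[c])
                t += 1
        return m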
Now we can apply the 1-NN classifier using the reference set learned by these algorithms. We will denote by 1-NN(T_M) the 1-NN classifier that uses T_M, that is, the multiedited training set, as reference set. Following this notation, if T_MC is the multiedited-condensed training set, then 1-NN(T_MC) is the 1-NN classifier that uses T_MC as reference set. DSM learning requires the training set to be previously edited [Geva & Sitte, 1991]; we have used T_M as the initial set for DSM learning. Now, if T_DSM is the reference set after DSM learning, 1-NN(T_DSM) is the 1-NN classifier that uses T_DSM as reference set. Finally, if T_LVQ-1 is the reference set after LVQ-1 learning, 1-NN(T_LVQ-1) is the 1-NN classifier that uses T_LVQ-1 as reference set. More details about these algorithms can be found in [Cortijo & Pérez de la Blanca, 1996a].
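For concreteness, a minimal sketch of the 1-NN rule parameterized by its reference set; any of the sets above (T_M, T_MC, T_DSM, T_LVQ-1) can be plugged in. The function and variable names here are our own:

    import numpy as np

    def nn1_classify(ref_X, ref_y, queries):
        # 1-NN rule: each query vector receives the label of its nearest
        # reference vector (Euclidean distance in spectral space).
        queries = np.atleast_2d(queries)
        d = np.linalg.norm(ref_X[None, :, :] - queries[:, None, :], axis=2)
        return ref_y[np.argmin(d, axis=1)]

    # e.g. labels = nn1_classify(T_mc_X, T_mc_y, pixels)   # 1-NN(T_MC)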
2.2 Contextual Classifiers
The contextual classifiers we have tested are based on the assumption of a Markov random field modeling the prior distribution of the labels in the image. Stochastic models, and random fields (RF) in particular, accurately represent a priori information on the map. This information can be used in such a way that Bayes decision theory can be applied. A random field is a joint probability distribution imposed on a set of M random variables L = {L_1, ..., L_M} representing objects of interest, which imposes statistical dependence in a spatially meaningful way. In contextual classification each L_i ∈ Ω. The spatial dependence can be specified by a global model such as the Gibbs random field (GRF). A GRF describes the global properties of an image in terms of the joint distribution of labels for all pixels [Dubes & Jain, 1989]. A Markov random field (MRF) is defined in terms of local properties: a neighborhood system in which the spatial dependence is relevant must be fixed. Two neighborhood systems are mainly used: the first-order neighborhood, which includes the four nearest spatial neighbors, and the second-order neighborhood, which includes the eight nearest spatial neighbors.
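For concreteness, the GRF and its equivalent local (Markov) characterization can be written as follows; the Potts pair potential shown is a common textbook choice, not necessarily the prior adopted in this work:

    P(L = l) = \frac{1}{Z} \exp\{-U(l)\}, \qquad U(l) = \sum_{c \in \mathcal{C}} V_c(l)

    P(L_i = l_i \mid L_j = l_j,\; j \neq i) = P(L_i = l_i \mid l_{\partial i})

    V_{\{i,j\}}(l) = -\beta\, \delta(l_i, l_j) \quad \text{(Potts potential on neighboring pairs)}

where C is the set of cliques of the neighborhood system and ∂i denotes the neighbors of pixel i.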
Given a set of observations, X = x, and the contextual information modeled as a MRF, P(L = l), in a Bayesian context the objective is to find the estimator \hat{l} which maximizes equation 1, that is, the a posteriori probability of L = l given X = x:

    P(L = l \mid X = x) = \frac{P(X = x \mid L = l)\, P(L = l)}{P(X = x)}    (1)
This is known as the MAP (maximum a posteriori) method. The model relating the observation x to the labeling l is chosen to ensure that the posterior distribution of L, given X = x, is also a MRF. If we require conditional independence of the observed random variables given the true labels, the posterior distribution is guaranteed to be a MRF. Thus we assume that

    P(X = x \mid L = l) = \prod_{i=1}^{M} P(X_i = x_i \mid L_i = l_i)    (2)
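Combining equations 1 and 2 (the denominator of equation 1 does not depend on l), the posterior to be maximized factorizes as

    P(L = l \mid X = x) \;\propto\; \Bigl[\prod_{i=1}^{M} P(X_i = x_i \mid L_i = l_i)\Bigr]\, P(L = l),

a pixelwise likelihood term times the MRF prior.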
If both P(X = x | L = l) and P(L = l) are known, we can compute the labeling \hat{l} which maximizes equation 1. In practice, even if the number of pixels M and the number of classes are small, it is not possible to calculate the MAP directly as given in equation 1. To circumvent this problem, some alternatives are available to estimate the MAP [Dubes & Jain, 1989]. The first approximation is the simulated annealing algorithm [Geman & Geman, 1984], which finds MAP estimates for all pixels simultaneously. As the computational demands of this algorithm are considerable, there are two computationally feasible approximations to the MAP estimate: a) the ICM algorithm (iterated conditional modes) and b) the MPM algorithm (maximizer of posterior marginals). A detailed discussion of these methods can be found in [Dubes & Jain, 1989] and references therein. We will center our interest on the ICM algorithm [Besag, 1986], which has been demonstrated to offer an excellent trade-off between the accuracy of the contextual correction and the required computational effort [Cortijo, 1995].
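A minimal sketch of one ICM sweep, assuming a Potts prior on the first-order neighborhood and precomputed class-conditional log-likelihoods; the value of β, the raster scan order and the data structures are our assumptions, not the settings of [Cortijo, 1995]:

    import numpy as np

    def icm_sweep(labels, loglik, beta=1.5):
        # One ICM sweep [Besag, 1986]: at each pixel choose the label
        # maximizing  log P(x_i | l_i) + beta * (# agreeing 4-neighbors).
        # labels : (H, W) int array, current labeling
        # loglik : (H, W, K) array, log P(X_i = x_i | L_i = k)
        H, W, K = loglik.shape
        new = labels.copy()
        for r in range(H):
            for c in range(W):
                votes = np.zeros(K)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < H and 0 <= cc < W:
                        votes[new[rr, cc]] += 1.0
                new[r, c] = np.argmax(loglik[r, c] + beta * votes)
        return new

Starting from the spectral classification, a few sweeps are typically enough for ICM to converge.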
Another approach to contextual correction using a MRF consists of point-to-point contextual correction methods. They are based on complex conditional-probability models which extend the MAP expression given in equation 1 by adding an additional term, the contextual correction factor, into the denominator of the MAP expression [Sæbø et al., 1985]. Assuming conditional independence of the feature vectors (observations) in a spatial neighborhood, two models can be adopted [Sæbø et al., 1985]: a) the Welch and Salter / Haslett model and b) the Owen and Switzer model. We have tested both models in this work.
Contextual classifiers accept as input the classifications obtained by the 8 spectral classifiers described in section 2.1, so we have performed 24 additional classifications for each problem.
3 DATA
The data used to test the performance of the classifiers are two LANDSAT images showing landscapes from Greenland, Denmark¹. The first image is a LANDSAT-2 MSS image of the Igaliko region. The second is a LANDSAT-5 TM image of the Ymer Ø region. Both images are 512 x 512 pixels in size. The training sets have been selected by expert geologists [Conradsen et al., 1987] and their spectral distributions pose problems of differing difficulty.
In Igaliko we have five classes to discriminate, the training set size is 42796 samples and there is a slight overlap in the spectral distribution of the training samples. In Ymer Ø we have twenty classes to discriminate, the training set size is 12574 samples and there is a high overlap in the spectral distribution of the training samples. See [Conradsen et al., 1987] for more details.
In this work we have adopted test sample estimation to measure the accuracy of the classifications. The training set T is split into two disjoint sets: T^l (learning set) and T^t (test set). T^l has been built by randomly selecting 2/3 of the available training samples; the remainder are placed into T^t. We use the learning set to construct the classifier and the test set for testing. In tables 1 and 2 we show the learning and test set sizes for each dataset.
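The 2/3 - 1/3 split described above can be reproduced with a sketch like the following; the random seed and the function name are our own:

    import numpy as np

    def split_training_set(X, y, frac=2/3, seed=0):
        # Test sample estimation: place `frac` of the samples in the
        # learning set T^l, the remainder in the test set T^t.
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(X))
        n_learn = int(frac * len(X))
        return (X[idx[:n_learn]], y[idx[:n_learn]]), \
               (X[idx[n_learn:]], y[idx[n_learn:]])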
¹We must thank the IMM (Technical University of Denmark, Lyngby, Denmark) for providing the LANDSAT images used in this work.
[Tables 1 and 2: learning and test set sizes for each dataset.]
4 RESULTS
In table 3 we show the classifications performed on each dataset and the accuracy of the classifications. We show the spectral classifier used to generate each initial classification, the contextual correction applied and the accuracies obtained over the whole map by using the ground truth.
5 DISCUSSION
From table 3 we can see that the contextual correction of the spectral classifications improves the accuracy drastically, independently of the spectral classifier used; this is also true for the two datasets. We can conclude that the ICM algorithm gives the best trade-off between the accuracy of the contextual correction and the computational effort: its computational demands are moderate while it improves the global accuracy of the classifications. We must also note that the combination ...