XXII ISPRS Congress 2012: Technical Commission VII

a factor graph with three variables z1, 2,23 and two function 
nodes fi and f» with factorization: g(z1, x2, x3) = fi(z1, x2) * 
f2(x2, x3). 
  
Figure 1: An illustrative example of simple factor graph with 
three variables x1, x2, x3 and two function nodes fi (x1, x2) and 
fa(x2, x3). 
The task of classification consists of determining the probability 
of a particular hypothesis given some observed evidence. This is 
solved by calculation of marginal probability of a latent variable, 
or by calculating of the maximum likelihood probability (max- 
imum likelihood on the configured factor graph given the evi- 
dence). The likelihood of the evidence (the features vector) can 
be written as follows: 
KÖN 
p(æ,c|sæ) = | ] ] | P(æn|cr)Pler|s), (1) 
k=1n=1 
where x is the input evidence; s, is the class; cy is the features 
mixture (a latent variable); K is the number of classes; NN is the 
number of features in the input evidence vector. Here, the feature 
is assumed to be a spectral band value. 
The following factor graph structure can be defined for formula 
(1): 
  
Figure 2: Factor graph model (independent model; a mixture la- 
tent variable is employed) for hyperspectral data classification 
The factor graph is described as: 
N N 
g(11,13,... TN, Ck, 8.) — Il £n tn) Il fara ck) Q) 
n=l 
fs Ck, 8512: (31), 
n=1 
where x, is the n-th input feature; ci is the features mixture (a 
latent variable) for the k-th class; s; is the k-th class number; 
Z1,-- - , ZN, Zs are normalizing factors in the graph model [Frey 
and Jojic, 2005] (define prior probabilities); f1,..., fw are the 
factors mapping the input features to the feature mixture; fc—s iS 
the factor bridging the latent variable to the semantic class sg. 
The structure of a factor graph defines a dependency of class vari- 
able node on input features. Use of training data allows to calcu- 
late a configuration (parameter set 0(k)) for a factor graph for 
each class k. 
A configured factor graph with the configuration 0(k) is expected 
to have a maximum likelihood probability (a low energy state) on 
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B7, 2012 
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia 
138 
   
the evidence which most likely (similar) to the employed training 
data. Expectation maximization method with gradient ascent op- 
timizer are employed for learning the graph configuration. Mean 
field inference method [Frey and Jojic, 2005] is employed for the 
inference. Comparison of the classes probability maps (maxi- 
mum principle) allows to produce label map. 
2.4 Employed data 
Salinas hyperspectral data benchmark (AVIRIS sensor over Sali- 
nas Valley, California; 3.7 m pixel size) was selected for classi- 
fication accuracy evaluation. The data cube size is 512 lines by 
217 samples, 224 bands. 19 water absorption bands were dis- 
carded (bands [108-112] and [154-167]) This image is available 
as at-sensor radiance data. Ground truth classes and the number 
of samples are given in Table 1. 
Minimum noise fraction (MNF) [Green et al., 1988, Boardman 
and Kruse, 194] was employed to reduce the number of input 
features, reduce computational time, and separate noise from the 
data. The MNF consists of two Principal Components (PC) trans- 
formations. The first PC transformation decorrelates and rescales 
noise in the data, the second PC transformation performed on the 
noise-cleared data. 
Since factor graphs are discrete graphical models, the feature is to 
be represented on a predefined finite domain (alphabet). The fi- 
nite domain refers to the unique values (or a list of values) the fea- 
ture can have. Here we use a finite domain consisting of natural 
numbers. To represent an input feature on the finite domain, the 
feature is proposed to be processed separately by an unsupervised 
clustering. A cluster's number is assumed as the value from the 
defined domain. k-means procedure is employed and the number 
of clusters for feature representation on finite set was equal to 100 
(used for representation of all features). In an additional experi- 
ment, the features before representation are processed by median 
filtering. 
20 points were randomly selected for each class in order to con- 
figure the factor graph. Expectation maximization with a gradient 
ascent method were employed for the factor graph configuration. 
Table 1: Salinas hyperspectral benchmark classes (available 
ground truth samples) 
  
| Number | Class 
| Samples | 
  
  
3 RESULTS AND DISCUSSION 
Table 2 presents the overall accuracy and Kappa coefficient for 
the classification. The additional experiment with feature median 
filtering allowed to obtain a better classification accuracy results 
(overall accuracy=85.3217 and Kappa=0.8358 versus 81.3692 and 
0.7921, respectively; compare confusion Tables 3 and 4). Most 
of the classes were labeled with an accuracy more than 90%,
1
2
...
145
146
147
148
149
...
560
561
Full text: Technical Commission VII (B7)

Access restriction

Copyright

Note to user