î_k(x) − m(x) = Σ_α λ_α(x) [ i_k(x_α) − m(x_α) ] + Σ_β λ_β(x) [ p_k(x_β) − m(x_β) ]        (6)
where   i_k(x_α) = indicator transform of the class pertaining to the training sample at x_α
        p_k(x_β) = posterior probabilities or memberships pertaining to all the supports x_β, i.e., all the pixels
        m(x_α), m(x_β) = mean values of i_k(x) and p_k(x) at x_α and x_β, respectively
        λ_α(x), λ_β(x) = weights of the primary variable and the secondary variable, respectively
Obviously, it is the combination of the primary and the secondary information that helps improve the precision of the direct prediction.
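As a rough illustration of Eq. 6, the following Python sketch computes the fused estimate at a single pixel once the weights are known. The function name fused_estimate and its arguments are hypothetical; in the actual method the weights λ_α(x) and λ_β(x) would be obtained by solving the corresponding cokriging system rather than supplied directly.

import numpy as np

def fused_estimate(m_x, i_alpha, m_alpha, p_beta, m_beta, lam_alpha, lam_beta):
    # Eq. 6 at one estimation location x:
    #   i_hat_k(x) - m(x) = sum_a lam_a(x) [i_k(x_a) - m(x_a)]
    #                     + sum_b lam_b(x) [p_k(x_b) - m(x_b)]
    # i_alpha : indicator values i_k(x_alpha) at the training pixels (primary)
    # p_beta  : posterior probabilities p_k(x_beta) at the supporting pixels (secondary)
    # m_alpha, m_beta, m_x : the corresponding mean values
    # lam_alpha, lam_beta  : the weights, assumed to have been solved for already
    primary = np.dot(lam_alpha, np.asarray(i_alpha) - m_alpha)
    secondary = np.dot(lam_beta, np.asarray(p_beta) - m_beta)
    return m_x + primary + secondary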
2.3 The Upper Bound of Accuracy: Arif Index
Eq.5 and Eq.6 are the mathematical bases for the fusion of input
information (e.g. spectral information) and spatial information.
The common ground of the two equations is that both the input
information and spatial information depend only on the training
samples. In other words, the differences between the indicator
vectors and the posterior probability vectors of training samples
are the premise on which the method proposed in this paper will
function. The information lost when a classifier consumes the input features results in these residuals. This circumstance is common in remotely sensed image classification, so the premise for applying the methods proposed in this paper is easily satisfied.
Even if an ideal classifier existed, the predicted land cover types would probably still differ from the ground truth because the input vectors carry insufficient information. In other words, a maximum achievable accuracy
exists in pattern classification using a particular set of features.
Hence, if an upper bound of the discrimination power of input
vectors can be assessed, the difference between the upper bound
and the predicted accuracy of a certain classifier may be
regarded as a quantitative measurement of the information
wasted by the classifier. Furthermore, this quantitative measurement demonstrates that differences exist between the true land cover types and the posterior probabilities, which confirms the premise for applying Eq.5 and Eq.6.
The Arif index, adopted in this paper, can be used to directly assess the maximum achievable classification accuracy of a set of input features by any classifier (Arif, 2009). This index varies from 0 to 1, with 0 representing completely separable classes and 1 representing completely overlapping classes. In other
words, as overlapping among classes increases, the value of
Arif index also increases and the classification accuracy
decreases.
In Figure 1, the parameter N denotes the number of training samples, and the parameter θ (θ ≥ 1) denotes a user-defined threshold which controls the strength of clustering of data points of the same class near a particular data point y. The Arif index is defined as
Arif Index = ( N − Σ_{k=1}^{N} S(k) ) / N        (7)
[Figure 1 flow chart: input the parameters N and θ; normalize the feature vectors (means = 0, variances = 1); initialize an N × 1 status vector S with zeroes and set a count variable C to zero; for a point y in class i, find the nearest neighbour nd (nd ∈ i) of point y and record the shortest distance as δ, incrementing C by one; find all the points in the set x ∈ i whose distance from point y is less than δ and record them as nn(k), where k indexes the status vector S; the data set {y, nn} is clustered into the same class i, with the corresponding elements of the status vector S, i.e. S(C) and S(k), set to 1; repeat until all N points have been processed.]
Figure 1. Flow Chart for Calculating Arif Index
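A minimal Python sketch of this procedure follows. It is one plausible reading of the flow chart: for each point y the radius δ is taken as the distance to its nearest neighbour belonging to a different class, the threshold θ is assumed to shrink that radius to δ/θ, and a point counts as surrounded only if at least one same-class point falls inside the radius; the exact definitions should be checked against Arif (2009).

import numpy as np

def arif_index(X, labels, theta=1.0):
    # X      : (N, d) array of feature vectors, already normalized
    #          (means = 0, variances = 1) as in Figure 1
    # labels : (N,) array of class labels
    # theta  : user-defined threshold (theta >= 1); its use as a radius
    #          divisor is an assumption of this sketch
    N = X.shape[0]
    status = np.zeros(N, dtype=int)                  # the status vector S
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    for c in range(N):                               # current point y = X[c]
        same = labels == labels[c]
        # radius: shortest distance from y to a point of a different class
        delta = dists[c, ~same].min()
        # same-class points closer to y than delta / theta
        nn = np.where(same & (dists[c] < delta / theta))[0]
        nn = nn[nn != c]
        if nn.size > 0:                              # y is surrounded by its own class
            status[c] = 1
            status[nn] = 1

    # Eq. 7: fraction of points not surrounded by points of their own class
    return (N - status.sum()) / N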
Obviously, the Arif index gives the ratio of data points which are not surrounded by data points of their own class to the total number of
data points (Arif, 2009). The relationship between Arif index
and the maximum accuracy a feature set may achieve is
computed as
Maximum Accuracy = 100 − ( 100 − Accuracy_lower_bound ) × Arif Index        (8)

where Accuracy_lower_bound = lower bound of accuracy, which is the percentage of the majority class.
Hence, a linear trend can be interpolated between 100%
classification accuracy and the lower bound of the classification
accuracy.
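Eq. 8 is this linear interpolation made explicit, illustrated below with hypothetical numbers (the function name maximum_accuracy and the example values are for illustration only):

def maximum_accuracy(arif_index, accuracy_lower_bound):
    # Eq. 8: interpolate linearly between 100 % accuracy (Arif index = 0)
    # and the lower bound, i.e. the percentage of the majority class
    # (Arif index = 1).
    return 100.0 - (100.0 - accuracy_lower_bound) * arif_index

# Hypothetical example: the majority class accounts for 40 % of the samples
# and the Arif index of the feature set is 0.25, so at most 85 % accuracy
# can be expected from any classifier using these features.
print(maximum_accuracy(0.25, 40.0))    # 85.0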