further reduction of the classification error can be achieved by choosing a proper threshold t, and from (9) we see that the metric Q and the kernel K also affect the final bias.
The decision threshold can be optimized analytically if the density functions of the underlying problem are normal (Fukunaga 1990, p. 329). In more general cases an optimal t cannot be obtained analytically. However, the following experimental procedure based on empirical error estimation can be carried out: for each value of k, find the threshold that produces the smallest experimental upper bound for the error. The value of t corresponding to the smallest upper bound can then be chosen, together with the corresponding k, for the final classification.
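As an illustration, a minimal sketch of this grid search is given below; the function name select_k_and_threshold, the estimator passed in as empirical_upper_bound, and the candidate grids are hypothetical placeholders, since the text does not specify how the empirical upper bound is computed.

```python
import numpy as np

def select_k_and_threshold(empirical_upper_bound, k_values, t_values):
    """Grid-search sketch: for each kernel size k, find the threshold t
    that gives the smallest empirical upper bound of the error, then
    keep the (k, t) pair with the overall smallest bound."""
    best_k, best_t, best_bound = None, None, np.inf
    for k in k_values:
        for t in t_values:
            bound = empirical_upper_bound(k, t)   # user-supplied error-bound estimate
            if bound < best_bound:
                best_k, best_t, best_bound = k, t, bound
    return best_k, best_t, best_bound

# Hypothetical usage with a dummy bound estimator:
dummy_bound = lambda k, t: 0.2 + 0.01 * (k - 3) ** 2 + (t - 0.1) ** 2
print(select_k_and_threshold(dummy_bound, range(1, 8), np.linspace(-0.5, 0.5, 21)))
```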
Another way to compensate for the effect of the bias terms is to consider the metric Q and the kernel shape K. From (9) and (10) we can see that the terms independent of $N_j$ can be compensated if $\alpha_1(x) = \alpha_2(x)$.
Hence, for each x, one should find a solution to
$$\frac{\mathrm{tr}\{Q_1 \nabla^2 p_1(x)\}}{p_1(x)} = \frac{\mathrm{tr}\{Q_2 \nabla^2 p_2(x)\}}{p_2(x)}. \qquad (11)$$
This is extremely hard to obtain. Only in the case of normal distributions can a solution be achieved, and if additionally the covariance structures are similar, the choice $Q_i = \Sigma_i$ is reasonable. In the general case, if Q is of second order, an approximate solution can be obtained by choosing (Fukunaga 1990, p. 337)
$$Q_i = \Sigma_i + \gamma_i (x - \mu_i)(x - \mu_i)^T, \qquad (12)$$

where $\gamma_i$ should be chosen to be a little larger than

$$\frac{1}{\mathrm{tr}\{\Sigma_i^{-1}(x - \mu_i)(x - \mu_i)^T\}}. \qquad (13)$$
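A rough sketch of this metric choice, assuming the class mean and covariance are estimated from the design samples and assuming the reconstructed forms of (12) and (13) above, could look as follows; the margin factor gamma_margin is an arbitrary placeholder for "a little larger than".

```python
import numpy as np

def local_metric(x, X_class, gamma_margin=1.1):
    """Per-class, per-point metric of (12): Q_i = Sigma_i + gamma_i (x - mu_i)(x - mu_i)^T,
    with gamma_i set a little above the lower limit (13)."""
    mu = X_class.mean(axis=0)                 # class mean estimated from design samples
    sigma = np.cov(X_class, rowvar=False)     # class covariance (assumed non-singular here)
    d = x - mu
    # Lower limit (13): 1 / tr{Sigma^{-1}(x-mu)(x-mu)^T} = 1 / ((x-mu)^T Sigma^{-1} (x-mu))
    gamma_min = 1.0 / float(d @ np.linalg.solve(sigma, d))
    gamma = gamma_margin * gamma_min          # "a little larger than" the limit
    return sigma + gamma * np.outer(d, d)
```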
The effect of the metric is demonstrated in figure 1 (adapted from Fukunaga 1990), which shows how dramatic the effect can be for larger kernel sizes.
[Figure 1: estimated error rate as a function of the kernel size k, comparing the identity metric Q = I with the metric of (12).]
Figure 1. The effect of the metric on the estimated error rate; the data are Gaussian with unequal covariance matrices.
The kernel shape K is again a parameter to be chosen.
It was experimentally shown in (Fukunaga 1990) that the more uniform the kernel is, the more it affects the lower bound of the error, while having only little effect on the upper bound. If a Gaussian kernel is used, too much emphasis may be put on the centre of the kernel. A uniform kernel, such as a hyperball or hypercube, is another choice; it puts equal emphasis throughout the neighbourhood. It is quite evident that this causes many of the estimated a posteriori probabilities in the border regions to be equal.
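The following sketch contrasts the two kernel shapes in an unnormalised Parzen value; the function names and the unit-radius convention are assumptions made for illustration, not part of the original analysis.

```python
import numpy as np

def gaussian_kernel(u):
    """Smooth kernel: weight decays away from the centre."""
    return np.exp(-0.5 * np.sum(u ** 2, axis=-1))

def hyperball_kernel(u):
    """Uniform kernel: equal weight everywhere inside the unit radius."""
    return (np.sum(u ** 2, axis=-1) <= 1.0).astype(float)

def parzen_value(x, X_design, r, kernel):
    """Unnormalised Parzen value at x with kernel size r; the normalising
    constant cancels when the class-wise values are compared."""
    return kernel((x - X_design) / r).sum() / len(X_design)
```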
A computationally intensive estimate of the Bayes error can be obtained with the Parzen method from (10),
$$E\{\hat{\varepsilon}\} \approx \varepsilon^* + a_1 r^2 + a_2 r^4 + a_3 N^{-1} r^{-d}. \qquad (14)$$
After varying the kernel size r, we can finally perform a least-squares fit using model (14) and from it obtain an estimate, $\hat{\varepsilon}^*$, of the Bayes error $\varepsilon^*$. Because it is reasonable to believe that all the constants in (14) are positive, this can be used as a constraint in the estimation procedure. The fit should be repeated for each value of t, because for (14) to be valid, a Bayesian decision should be made in each case. The value of t which gives the lowest estimate of the error should finally be chosen.
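Assuming the model form shown in (14), the constrained fit could be sketched as below with a non-negative least-squares solver; the helper name fit_bayes_error and the use of scipy.optimize.lsq_linear are choices made for this sketch. In practice the fit would be repeated for each threshold t, as described above.

```python
import numpy as np
from scipy.optimize import lsq_linear

def fit_bayes_error(r_values, error_estimates, d, N):
    """Fit model (14), eps* + a1 r^2 + a2 r^4 + a3 N^{-1} r^{-d}, to the
    empirical error estimates with all coefficients constrained to be
    non-negative; the fitted intercept is the Bayes-error estimate."""
    r = np.asarray(r_values, dtype=float)
    A = np.column_stack([np.ones_like(r), r ** 2, r ** 4, r ** (-d) / N])
    fit = lsq_linear(A, np.asarray(error_estimates, dtype=float), bounds=(0.0, np.inf))
    eps_star, a1, a2, a3 = fit.x
    return eps_star
```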
2.2.3 Voting kNN classifiers In the well-known voting kNN procedure, the distances to the k nearest design samples are computed and the pattern vector is assigned to the class which gets the majority of the votes. If ties occur, a reject option can be used, as in the following analysis. This simple, though computationally intensive, method has become appealing because of the following result, obtained via asymptotic analysis (as $N \to \infty$, $P(\omega_i \mid X_{kNN}) \to P(\omega_i \mid X)$; see Devijver & Kittler 1982)
$$\varepsilon^* \le \cdots \le \varepsilon_{kNN} \le \cdots \le \varepsilon_{3NN} \le \varepsilon_{NN} \le 2\varepsilon^*(1 - \varepsilon^*), \qquad (15)$$
where $\varepsilon_{kNN}$ stands for the asymptotic error based on k nearest neighbours and $\varepsilon^*$ stands for the Bayes error. According to this asymptotic result, very tight bounds for the Bayes error can be achieved by applying a kNN classifier if k is large enough.
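A minimal sketch of the voting rule with a reject option on ties might look as follows; the reject label and the Euclidean distance are illustrative assumptions.

```python
import numpy as np

def voting_knn(x, X_design, y_design, k, reject_label=-1):
    """Voting kNN: the k nearest design samples vote and x is assigned to
    the majority class; ties are rejected, mirroring the reject option."""
    dist = np.linalg.norm(X_design - x, axis=1)   # Euclidean distance, assumed here
    nearest = y_design[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    winners = labels[counts == counts.max()]
    return winners[0] if len(winners) == 1 else reject_label
```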
However, in practice we have to cope with finite-sized design sets, and (15) will be heavily affected by the corresponding bias. The simplifying assumption $P(\omega_i \mid X_{kNN}) = P(\omega_i \mid X)$ does not hold, and the corresponding risks will be biased. A detailed study of this bias in the NN and 2NN cases can be found in (Fukunaga & Hummels 1987a). They showed that a second-order approximation of the bias of the voting NN procedure is
$$E\{\Delta\hat{\varepsilon}_{NN}\} \approx \beta_1 \, E_x\!\left\{ |Q|^{-1/d} \, \mathrm{tr}\!\left( Q \, B_1(x) \right) \right\}.$$