XVIIth ISPRS Congress (Part B3)

3.2.1 Direct ("resubstitution") estimate A direct estimate is formed by evaluating formula (22) directly for each test sample. Because the same samples were used in the design phase, the bias and variance of the estimate ε̂ must be taken into account. This causes the final estimate to be optimistically biased; the direct estimate thus gives a lower bound for the error.
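As an illustration, here is a minimal sketch of such a direct estimate, under the assumption that (22) is the sample average of the estimated conditional risk 1 − max_c P̂(c|x), with the posteriors estimated by a voting kNN on the design set; the function names and this particular posterior model are illustrative assumptions, not the paper's exact formulation of (22).

```python
import numpy as np

def knn_posteriors(X_design, y_design, X_eval, k=5):
    """Estimate P(class | x) as the fraction of the k nearest design-set
    neighbours belonging to each class (two-class case, labels 0/1)."""
    d = np.linalg.norm(X_eval[:, None, :] - X_design[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest neighbours
    p1 = y_design[nn].mean(axis=1)           # fraction of neighbour votes for class 1
    return np.stack([1.0 - p1, p1], axis=1)

def direct_risk_estimate(X_design, y_design, k=5):
    """Direct ('resubstitution') risk-averaging estimate: the estimated
    conditional risk 1 - max_c P(c|x), averaged over the design set itself.
    Optimistically biased, hence a lower bound for the true error."""
    post = knn_posteriors(X_design, y_design, X_design, k)
    return float(np.mean(1.0 - post.max(axis=1)))
```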
3.2.2 Reference set ("holdout") estimate To obtain an upper bound (and thus an estimate as the average of the two bounds), another labelled set is needed, here called the reference set. When this estimate is computed, the classification is taken from the design set, but the risk is computed with the help of the reference set. It is shown in (Devijver & Kittler 1982) that this estimate is asymptotically unbiased and has a variance of the order given in (23).
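Continuing the sketch above (and reusing knn_posteriors from it), one plausible reading of "the classification is taken from the design set, but the risk is computed with the help of the reference set" is the following; the paper's exact construction may differ.

```python
def holdout_risk_estimate(X_design, y_design, X_ref, y_ref, X_test, k=5):
    """Reference-set ('holdout') risk estimate: each test point is
    classified with the design set, but the risk of that decision is
    estimated from the labelled reference set (the reference-set
    posterior of the class NOT chosen). Note that the test points
    themselves need no labels."""
    decisions = knn_posteriors(X_design, y_design, X_test, k).argmax(axis=1)
    post_ref = knn_posteriors(X_ref, y_ref, X_test, k)
    risks = 1.0 - post_ref[np.arange(len(X_test)), decisions]
    return float(risks.mean())

# The combined estimate is the average of the lower and upper bounds:
# eps = 0.5 * (direct_risk_estimate(Xd, yd)
#              + holdout_risk_estimate(Xd, yd, Xr, yr, Xt))
```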
3.3 Methods based on error-reject tradeoff 
If a reject option is included in the classification system, the error rate of the classifier decreases, because the reject option discards those pattern vectors whose classification carries a high risk. The dependence between the reject and error rates is (see e.g. Devijver & Kittler 1982)
$\varepsilon(\lambda_t) = -\int_0^{\lambda_t} \lambda \, dR(\lambda)$,    (24)
where R is the reject rate and λ_t is the rejection threshold (if the risk is greater than λ_t, a reject decision is made). Thus, by observing the rejection rate while varying λ_t, an estimate of ε can be computed.
These methods have all the advantages of the methods of the previous section, especially that an unlabelled test set can be used.
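As a sketch of how (24) can be exploited in practice (assuming only that the classifier exposes an estimated conditional risk r̂(x) for each sample, e.g. from the kNN posteriors above): sweep the rejection threshold λ_t, record the empirical reject rate R(λ_t) on the unlabelled test set, and evaluate the Stieltjes integral as a finite sum.

```python
import numpy as np

def error_from_reject_curve(risks, thresholds):
    """Estimate the error rate via the error-reject tradeoff (24):
    eps = -integral_0^lambda_t of lambda dR(lambda), where R(lambda)
    is the empirical reject rate P(r_hat(x) > lambda) on unlabelled data."""
    thresholds = np.sort(np.asarray(thresholds))
    R = np.array([(risks > t).mean() for t in thresholds])  # reject rate per threshold
    dR = np.diff(R)                                         # R decreases, so dR <= 0
    lam = 0.5 * (thresholds[1:] + thresholds[:-1])          # midpoint rule for lambda
    return float(-np.sum(lam * dR))

# Example: two-class risks lie in [0, 0.5], so sweep that interval:
# eps = error_from_reject_curve(1.0 - post.max(axis=1), np.linspace(0.0, 0.5, 51))
```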
3.3.1 A method utilizing ordered sets We have used a special version of the error-reject tradeoff, specially designed for voting kNN procedures. It is shown in (Devijver & Kittler 1982, chapter 3) by asymptotic analysis that the following bounds can be formed for the error rate
$\sum_{i=1}^{m} \frac{A_{2i-2} - A_{2i}}{2i} \;\le\; \varepsilon \;\le\; \sum_{i=1}^{m} \frac{A_{2i-2} - A_{2i}}{2i-1}, \qquad A_0 = 1,$    (25)
where A_{2i} denotes the acceptance rate of the kNN classifier with exactly s = i votes (i.e. the 2i-NN rule that rejects ties). (25) inherently measures the asymptotic probability of ties, which occur in 2iNN classifications, and takes a series expansion over all of them.
Unfortunately, the bias produced by finite design sets (see (20)) heavily deteriorates the asymptotically tight bounds of (25). However, it is hoped that the average of the upper and lower bounds will produce good results.
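Under the reading of (25) given above (A_0 = 1, and A_{2i} taken as the empirical acceptance rate of the 2i-NN voting rule that rejects ties), the averaged-bounds estimator might be sketched as follows; since the original formula is only partly legible in this copy, treat this as an assumption-laden illustration rather than the paper's exact procedure.

```python
import numpy as np

def ordered_sets_estimate(X_design, y_design, X_test, m=3):
    """Ordered-sets estimate: for i = 1..m the 2i-NN voting rule rejects
    on ties (exactly i votes per class); A_2i is its empirical acceptance
    rate. The truncated lower/upper bounds of (25) are then averaged.
    Only unlabelled test samples are needed."""
    d = np.linalg.norm(X_test[:, None, :] - X_design[None, :, :], axis=2)
    order = np.argsort(d, axis=1)                # design samples ordered by distance
    A = [1.0]                                    # A_0 = 1 by convention
    for i in range(1, m + 1):
        votes = y_design[order[:, :2 * i]].sum(axis=1)  # votes for class 1
        A.append(float(np.mean(votes != i)))     # accepted iff not a tie
    A = np.array(A)
    ties = A[:-1] - A[1:]                        # tie probability added at each step
    i = np.arange(1, m + 1)
    lower = np.sum(ties / (2.0 * i))
    upper = np.sum(ties / (2.0 * i - 1.0))
    return float(0.5 * (lower + upper))          # average of the two bounds
```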
4. SIMULATION RESULTS 
An extensive simulation study has been carried out. The primary goals of these simulations were: a) to compare risk averaging methods with error counting methods, b) to compare nonparametric and parametric classifiers, c) to test the robustness of the different error estimation methods, and d) to get at least an idea of a classification system which is general and gives as reliable error estimates as possible in all circumstances. Although the simulation study is extensive, it is still limited in many respects. This is unavoidable because of the many parameters to be tested. The generated datasets are listed in Table 1.
  
  
Data ID   Optimal classifier   Bayes error rate
II        Linear               0.251
I4        Quadratic            0.324
NN        Nonparametric        0.128

Table 1. The three types of data used in the simulations.
The datasets II and I4 are generated from normal distributions and the dataset NN from a mixture of normal distributions. Unless otherwise stated, the Bayes error rates are the ones listed in Table 1. The Bayes error rate of 0.128 for dataset NN represents the case where the generated dataset differs maximally from a normal distribution. All simulations carried out are two-class cases.
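For concreteness, dataset II might be generated along the following lines (the paper does not list the exact distribution parameters, so the identity covariances and the mean placement below are assumptions): two equal-covariance normals whose Mahalanobis distance Δ satisfies Φ(−Δ/2) = 0.251, i.e. Δ ≈ 1.34.

```python
import numpy as np
from scipy.stats import norm

def make_data_II(n_per_class, dim=2, bayes_error=0.251, seed=0):
    """Two equal-covariance (identity) normal classes: the optimal
    classifier is linear and the Bayes error is Phi(-Delta/2), so the
    class-mean separation Delta is solved from the target error rate.
    Assumed parametrisation; the paper does not give the exact means."""
    delta = -2.0 * norm.ppf(bayes_error)          # Mahalanobis distance, ~1.34
    rng = np.random.default_rng(seed)
    X = np.vstack([
        rng.normal(0.0, 1.0, (n_per_class, dim)),             # class 0 at the origin
        rng.normal(0.0, 1.0, (n_per_class, dim))
        + np.r_[delta, np.zeros(dim - 1)],                     # class 1 shifted by Delta
    ])
    y = np.r_[np.zeros(n_per_class, int), np.ones(n_per_class, int)]
    return X, y
```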
Risk averaging methods vs. error counting 
As an example, the estimated error rates and standard deviations of the different error estimators for datasets II and NN in the case of a two-dimensional feature space are presented in Figures 3-6.
[Figure 3: estimated error rate ε (y-axis, 0.20 to 0.30) plotted against design set size. Curves: Resubstitution, Leave-one-out, Bootstrap-0.632, Holdout, Ordered sets, Risk averaging R, Risk averaging H.]
Figure 3. The estimated error rates of different estimators. Data II; the Bayes error equals the dashed line.
 
	        