XVIIth ISPRS Congress: XVIIth ISPRS Congress

fritz, lawrence w.; lucas, james r.
  
  
  
  
> 
RS an 
D 
  
where M is the set of misclassified samples. The 
methods differ from each other from the way they use 
the available sample set. 
3.1.1 Resubstitution In the resubstitution method 
the whole sample set is used for designing and again 
for testing the classifier. This method is known to be 
optimistically biased (e.g. Devivjer & Kittler, chapter 
10), thus giving a lower bound for the Bayes error (or 
for the asymptotic error). However the bias reduces 
then the sample size increases. Thus, if enough 
samples is in use, resubstitution method can be used. 
3.1.2 Hold-out The holdout procedure is a special 
case of cross-validation, where the available sample set 
is divided into exactly two sets. The other set is used 
for designing the classifier and the other set as a test 
set. The method is known to give an upper bound for 
the Bayes error. Its variance is a little bigger than that 
of resubstitution. What makes it rather complicated 
from the practical point of view, is the demand for 
independent test and design sets. The division is far 
from a non trivial problem. The neglectance of some 
part of the available data from designing the classifier 
is not very pleasing, too. An arithmetic average of the 
resubstitution and holdout methods should be used as 
the final estimate of the error rate. 
3.1.3 Leave-one-out The other extreme of cross- 
validation is given then each of the samples in turn is 
used for testing the classifier and the N-1 remaining 
samples are used as a design set. The available 
samples are thus more effectively utilized. Also the 
design and test sets are statistically independent. In 
case of independent sample vectors, the method is 
known to be practically unbiased (Efron 1983). For 
long time the method has been recommended to be 
used in the context of small sample sizes. Unfortu- 
nately, especially for small sample sets, the variance of 
the leave-one-out method is high. Because the vari- 
ance component dominates in small sample sets (Efron 
1983), a low variance estimate is recommendable in 
such cases. Another disadvantage of leave-one-out 
method is the high computational cost for some types 
of classification algorithms, because Nj different 
classifiers must be designed. Fortunately for many 
cases, recursive methods can be used to get the leave- 
one-out estimate practically with the same computation 
time as the resubstitution estimate (e.g. Fukunaga, 
1990, chapters 5 and 7). 
3.1.4 Bootsrapping A bootstrap design sample of 
exactly size Ny is generated by sampling with replace- 
ment. Two different estimates can be computed. In 
the bootstrap resubstitution estimate, the samples of 
the design set are used also for testing. The bootstrap 
hold-out estimate is achieved by using those samples 
330 
for testing, which do not belong to the generated 
design set. The procedure is repeated 100-200 times 
(100 in the simulations of chapter 4) and the final 
estimate is the arithmetic mean of the individual esti- 
mates. Many modifications have been developed from 
this basic algorithm. The one, which in various empi- 
rical studies (e.g. Jain et al. 1987) has shown good 
performance, is called the 0.632 bootstrap estimate. In 
this heuristic method a weighted average of the two 
bootstrap estimates is computed given weight 0.632 to 
the holdout estimate. Theoretically (Fukunaga 1990, 
chapter 5) the bias of the original bootstrap method 
should coincide with the bias of the leave-one-out 
method. However, the variance should be smaller, and 
close to the resubstitution estimate. 
Some hints of the weakness of the bootstrap method 
have been reported. In (Chernick et al. 1985) the 
0.632 bootstrap estimate showed weaker results when 
used with linear classifiers in high error rate situations. 
In (Jain et al. 1987) it was reported that the 0.632 
estimator should not be used together with the NN 
classifier. 
3.2 Methods based on risk averaging 
The risk averaging methods are designed especially to 
have a low variance. A big advantage comes from the 
fact that an unlabelled test set can be used for testing 
the classifier. That is why the test set can be large 
(e.g. the whole satellite scene) and thus the method 
should be suitable for low-error rate cases. Because 
the variance of the error comes primarily from the test 
set, this already produces a low variance. In risk aver- 
aging the risk for each decision is estimated and the 
final error estimate is the average of all of the esti- 
mates. Thus 
Ny 
3.6 (22) 
  
where £, is the estimated risk for decision i, which 
equals to 1-max[P(oIx;)]. Estimate (22) is sometimes 
called the grouped estimate. The performance of this 
estimate has not been fully studied in comparison with 
the error counting. Note that the method is especially 
suitable for nonparametric classifiers, where the risks 
are always computed. No modifications for the algo- 
rithms are needed. 
It is shown in (Devivjer & Kittler 1982, chapter 10) 
that the estimate is unbiased and the variance of the 
method is (in case of equal prior probabilities) 
«S010. LS 
; 23 
N, 2N, on 
Var( é, } 
which is less than half of the variance of error counting 
(com 
3.2 
estin 
(22) 
phas 
taker 
to be 
give: 
3.2 
gettii 
aver: 
now 
comp 
set, I 
ence 
this 
varia 
33 
If a 
tem, 
rejec 
class 
the ri 
1982 
where 
old (i 
Thus 
estim 
These 
of th 
unlab 
3.3. 
used : 
speci: 
show: 
asym) 
forme 
where 
with 
asym] 
classi: 
them. 
Unfor
1
2
...
339
340
341
342
343
...
1000
1001
Full text: XVIIth ISPRS Congress (Part B3)

Access restriction

Copyright

Note to user