[Figure 4. Standard deviations of the different error estimators, data II, N_i = m*d. Vertical axis: σ (0.025-0.150); horizontal axis: m (5-50). Curves: Resubstitution, Leave-one-out, Ordered sets, Holdout, Bootstrap-b632, Risk averaging R, Risk averaging H.]
The results mostly agree with what was expected. The variances of the risk averaging methods are smaller than those of the error counting methods, especially in small sample size situations. This result clearly favours risk averaging in small sample situations, where the variance term mostly dominates. The upper bounds in both cases have, as expected, a larger variance than the lower bounds. The use of the error reject tradeoff is comparable with risk averaging, but the difference between the upper and lower bounds (not shown in the figures) is very large in small sample size situations. For example, in dataset NN the lower and upper bounds for the two smallest sample sizes are 0.17 vs. 0.35 and 0.17 vs. 0.28, respectively. The convergence to the asymptotic case is slower than might be expected, which may stem from the constant bias term of (20). However, the mean of these bounds predicts the error rate quite well, and the variance is nearly comparable to that of risk averaging.
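To make the variance argument concrete, here is a minimal Python sketch, not taken from the paper: the 1-D two-Gaussian problem, the fixed decision boundary at 0, and all names (MU, one_trial, ...) are illustrative assumptions. Error counting averages 0/1 losses over the test points, while risk averaging averages the conditional error probability r(x), a value in [0, 0.5], over the same points:

```python
import numpy as np

rng = np.random.default_rng(0)
MU = 1.5  # class means at +/- MU, unit variance -> Bayes error ~ 6.7%

def conditional_risk(x):
    """r(x) = min of the two class posteriors (equal priors)."""
    p0 = np.exp(-0.5 * (x + MU) ** 2)  # class 0 likelihood (up to a constant)
    p1 = np.exp(-0.5 * (x - MU) ** 2)  # class 1 likelihood
    return np.minimum(p0, p1) / (p0 + p1)

def one_trial(n):
    y = rng.integers(0, 2, n)                   # true labels
    x = rng.normal(np.where(y == 1, MU, -MU))   # test samples
    counting = np.mean((x > 0) != (y == 1))     # 0/1 error counting
    averaging = np.mean(conditional_risk(x))    # risk averaging
    return counting, averaging

for n in (10, 20, 50):
    est = np.array([one_trial(n) for _ in range(2000)])
    print(f"n={n:2d}  counting std={est[:, 0].std():.3f}  "
          f"averaging std={est[:, 1].std():.3f}")
```

With the true posteriors, both estimates are unbiased for the error of the boundary at 0; the printed standard deviations shrink roughly as 1/sqrt(n), with risk averaging consistently lower, in line with the behaviour seen in Figures 4 and 6.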
[Figure 5. Estimated error rates of the different estimators, dataset NN; dashed line = Bayes error, N_i = m*d. Curves: Resubstitution, Leave-one-out, Ordered sets, Holdout, Bootstrap-b632, Risk averaging R, Risk averaging H.]
One comment concerning the nonparametric results: the kernel size of the nonparametric classifier was tuned on too coarse a quantization grid for the small sample sizes. This is the probable reason why some of the curves (e.g. the resubstitution curve) do not behave smoothly. The bias term still dominates in some cases.
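A small sketch of this quantization effect, assuming a 1-D Gaussian Parzen-window classifier as a stand-in for the paper's nonparametric classifier (the widths, sample sizes, and data below are invented): a coarse grid of kernel sizes can step over the good bandwidth region that a finer grid finds.

```python
import numpy as np

rng = np.random.default_rng(1)

def parzen(train, x, h):
    """Gaussian Parzen density estimate at the points x, kernel width h."""
    u = (x[:, None] - train[None, :]) / h
    return np.exp(-0.5 * u ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

def error_rate(tr0, tr1, te0, te1, h):
    """Test error of the Parzen classifier (equal priors) with width h."""
    e0 = np.mean(parzen(tr0, te0, h) < parzen(tr1, te0, h))
    e1 = np.mean(parzen(tr1, te1, h) < parzen(tr0, te1, h))
    return 0.5 * (e0 + e1)

tr0, tr1 = rng.normal(-1, 1, 15), rng.normal(1, 1, 15)      # small sample
te0, te1 = rng.normal(-1, 1, 2000), rng.normal(1, 1, 2000)  # large test set

for grid in (np.array([0.1, 1.0, 10.0]),      # coarse quantization
             np.geomspace(0.05, 10.0, 40)):   # finer grid
    errs = [error_rate(tr0, tr1, te0, te1, h) for h in grid]
    best = grid[int(np.argmin(errs))]
    print(f"grid of {len(grid):2d} widths -> best h={best:.2f}, "
          f"test error={min(errs):.3f}")
```

With only a few widths a decade apart, the selected kernel size can jump erratically from one sample size to the next, so the error curves built on it need not vary smoothly.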
[Figure 6. Standard deviations of the different estimators, N_i = m*d. Vertical axis: σ (0.05-0.15); horizontal axis: m (5-50). Curves as in Figure 5.]
In Table 2 the different types of error estimation methods are compared with each other as a function of the Bayes error. All results are averages of the lower and upper bounds of the estimated Bayes error (e.g. error counting = ½(leave-one-out + resubstitution)). The following can be observed from the table. a) The risk averaging methods have predominantly smaller variance than the traditional methods; the difference grows as the Bayes error and the sample size decrease. b) As has been illustrated in many simulations, the bootstrapping method does not perform well in low error rate situations. c) The risk averaging methods and the method using the error reject tradeoff are pessimistically biased in low error rate situations. These methods (also bootstrapping) work better under such circumstances if only the lower bound is used (e.g. the lower bound for risk averaging in the 0.01 error rate case with 5*d samples equals the correct value 0.01, but the upper bound claims 0.02). The effect could be corrected by using a leave-one-out type procedure for the upper bound. d) The traditional [...]
Method            Bayes error    m=5            m=10
                                 ε̂      σ       ε̂      σ
Error counting    25.1           25.6   10.6    25.4   6.6
Bootstrapping                    27.3    9.9    26.2   6.5
Risk averaging                   23.8    6.0    23.1   4.3
Error reject                     26.1    7.8    24.7   6.6
Error counting    10.0            9.7    6.3    10.4   4.8
Bootstrapping                    11.8    6.7    10.6   5.2
Risk averaging                    9.9    3.0     9.8   2.2
Error reject                     14.5    5.9    11.8   4.2
Error counting     5.0            4.9    5.0     5.0   3.3
Bootstrapping                     6.4    4.8     5.5   2.9
Risk averaging                    5.5    2.2     4.9   1.5
Error reject                      7.5    3.3     5.8   2.6
Error counting     1.0            0.9    2.3     1.0   1.6
Bootstrapping                     1.6    2.3     1.0   1.4
Risk averaging                    1.5    1.4     1.2   0.6
Error reject                      1.7    1.2     1.4   0.8
Table 2. Comparison of the error estimation methods as a function of the separability between classes; all numbers in percentages, m stands for sample size (N_i = m*d), d = 8, linear classifier.
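As a sketch of how the table's bound averages can be formed (illustrative only: the nearest-class-mean classifier, the Gaussian data, and all names are assumptions, not the paper's exact linear classifier or datasets), the error-counting entry averages resubstitution and leave-one-out, and the bootstrap entry uses Efron's .632 mix of resubstitution and the out-of-bootstrap error:

```python
import numpy as np

rng = np.random.default_rng(2)

def nearest_mean_error(train_x, train_y, test_x, test_y):
    """Error rate of a minimum-distance-to-class-mean (linear) classifier."""
    m0 = train_x[train_y == 0].mean(axis=0)
    m1 = train_x[train_y == 1].mean(axis=0)
    pred = (np.linalg.norm(test_x - m1, axis=1)
            < np.linalg.norm(test_x - m0, axis=1)).astype(int)
    return np.mean(pred != test_y)

def bound_averages(x, y, n_boot=100):
    n = len(y)
    resub = nearest_mean_error(x, y, x, y)   # optimistic (lower) bound
    loo = np.mean([nearest_mean_error(np.delete(x, i, 0), np.delete(y, i),
                                      x[i:i + 1], y[i:i + 1])
                   for i in range(n)])       # pessimistic (upper) bound
    e0 = []                                  # out-of-bootstrap error
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # bootstrap sample (with repeats)
        out = np.setdiff1d(np.arange(n), idx)
        if out.size:
            e0.append(nearest_mean_error(x[idx], y[idx], x[out], y[out]))
    b632 = 0.368 * resub + 0.632 * np.mean(e0)   # Efron's .632 estimator
    return 0.5 * (resub + loo), b632

# d=8, m=5 -> N_i = 40 samples per class, as in the table's m=5 column
x = np.vstack([rng.normal(0.0, 1.0, (40, 8)), rng.normal(0.5, 1.0, (40, 8))])
y = np.repeat([0, 1], 40)
print(bound_averages(x, y))
```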