General classification system
According to the above results, it seems possible to combine the nonparametric classification method with the risk averaging methods into a general classification system.
CONCLUSIONS
Empirical error estimation has been studied by simula-
tion. The comparison was made between traditional
error counting, risk averaging and a method utilizing
the error-reject tradeoff. The error counting methods
included resubstitution, holdout, leave-one-out and the
bootstrapping method.
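To make the compared error counting schemes concrete, the following Python sketch implements resubstitution, holdout and leave-one-out around a deliberately simple nearest-mean classifier. The classifier choice and all function names are ours, not the paper's, and the bootstrap variant is omitted for brevity.

```python
import numpy as np

def train(X, y):
    """Fit a deliberately simple classifier: the two class means."""
    return [X[y == c].mean(axis=0) for c in (0, 1)]

def classify(x, means):
    """Assign x to the class with the nearest mean."""
    return int(np.argmin([np.linalg.norm(x - m) for m in means]))

def error_count(X, y, means):
    """Fraction of samples in (X, y) misclassified by the rule."""
    return float(np.mean([classify(x, means) != t for x, t in zip(X, y)]))

def resubstitution(X, y):
    """Test on the design samples themselves: optimistically biased."""
    return error_count(X, y, train(X, y))

def holdout(X, y, frac=0.5, seed=0):
    """Design on one random part, count errors on the rest:
    unbiased, but those samples are lost from the design."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n = int(len(X) * frac)
    design, test = idx[:n], idx[n:]
    return error_count(X[test], y[test], train(X[design], y[design]))

def leave_one_out(X, y):
    """Design on all samples but one, test on the one left out."""
    errs = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        errs += classify(X[i], train(X[keep], y[keep])) != y[i]
    return errs / len(X)
```

All three estimators count discrete errors, which is what gives them the higher variance discussed below.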
There are three reasons why the risk averaging methods are recommended. First, it was confirmed that the risk averaging methods have a lower variance than error counting. Secondly, because an unlabelled test set can be used, the method is economical and can always utilize a large number of test samples. Thirdly, and most importantly, the method is extremely robust against outliers, especially in the context of nonparametric classification. The use of the error-reject tradeoff is appealing for the same reasons, but more research is needed to test it. On the other hand, the Maclaurin expansion used in the derivation of the elegant voting kNN modification unfortunately does not extend easily to the multiclass case, which favours risk averaging. The bias of the direct risk averaging method causes it to act as a lower bound, and this bias must be compensated somehow. The upper bound used in this project is based on a holdout type estimate via a reference set. From the practical point of view this method is not feasible, because it excludes half of the learning samples from the design. In this respect the use of the error-reject tradeoff is more viable. Unfortunately, its sample-based upper and lower bounds are not at all tight, and they do not converge to the asymptotic case unless the kernel size k → ∞. A leave-one-out type estimator for the upper bound of the risk averaging could be established economically in the context of nonparametric estimation, because the design set uses only a local neighbourhood. It is hoped that the upper bound then behaves more nicely.
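The direct risk averaging estimate can be sketched as follows, assuming a two-class problem and a k-NN posterior estimate; the function names are ours. Note that the test labels never enter the computation, which is why an unlabelled test set suffices, and that the downward bias (lower-bound behaviour) is inherent in averaging min(p, 1 − p) over noisy posterior estimates.

```python
import numpy as np

def knn_posterior(x, design_X, design_y, k):
    """k-NN estimate of P(class 1 | x) from the labelled design set
    (two classes, labels 0 and 1)."""
    order = np.argsort(np.linalg.norm(design_X - x, axis=1))
    return float(np.mean(design_y[order[:k]]))

def direct_risk_average(test_X, design_X, design_y, k=5):
    """Direct risk averaging: average the estimated conditional risk
    min(p, 1 - p) over the test samples.  Test labels are never used,
    so the test set may be unlabelled; the estimate is biased downward
    and therefore behaves as a lower bound."""
    posteriors = (knn_posterior(x, design_X, design_y, k) for x in test_X)
    return float(np.mean([min(p, 1.0 - p) for p in posteriors]))
```

Because each test sample contributes a continuous risk value rather than a 0/1 error count, the average fluctuates less from sample to sample, which is the source of the variance advantage.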
The simulations confirmed that a properly tuned nonparametric classifier can perform as well as a parametric one, even when the prior information favours a simple linear classifier.
The primary goal of this simulation study was to test whether a general classification system (one performing well in most cases) can be found, so that a designer does not have to choose from so many different possibilities. The recommended system consists of: a) A nonparametric classifier, preferably a Parzen classifier, because it is easier to tune. It is of utmost importance to optimize all the parameters of a Parzen classifier; this concerns the kernel shape and size as well as the decision threshold, and the optimization can be done via empirical error estimation. b) Error estimation via the risk averaging methods, which have a low variance and are robust against outlying observations, especially when nonparametric methods are used.
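As an illustration of the classifier component, the sketch below implements a Gaussian-kernel Parzen classifier and tunes its kernel size. For simplicity the tuning criterion here is plain leave-one-out error counting rather than bias-compensated risk averaging bounds, so this is a simplified stand-in for the recommended system, and all names are ours.

```python
import numpy as np

def parzen_posterior(x, X0, X1, h):
    """Gaussian-kernel Parzen estimate of P(class 1 | x), equal priors."""
    k0 = np.exp(-np.sum((X0 - x) ** 2, axis=1) / (2 * h * h)).mean()
    k1 = np.exp(-np.sum((X1 - x) ** 2, axis=1) / (2 * h * h)).mean()
    return k1 / (k0 + k1 + 1e-300)   # tiny constant guards against 0/0

def loo_error(X0, X1, h, threshold=0.5):
    """Leave-one-out error count of the Parzen rule for one kernel size."""
    errs, n = 0, len(X0) + len(X1)
    for i in range(len(X0)):   # class-0 sample misclassified as class 1?
        errs += parzen_posterior(X0[i], np.delete(X0, i, axis=0), X1, h) > threshold
    for i in range(len(X1)):   # class-1 sample misclassified as class 0?
        errs += parzen_posterior(X1[i], X0, np.delete(X1, i, axis=0), h) <= threshold
    return errs / n

def tune_kernel_size(X0, X1, candidates):
    """Pick the kernel size with the lowest leave-one-out error estimate."""
    return min(candidates, key=lambda h: loo_error(X0, X1, h))
```

The same search could be run over the decision threshold as well, since both parameters enter the error estimate in the same way.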
The extension of these results to multiclass cases remains a topic for future research.