tion from the viewpoint of architecture design and learning paradigm based on AIC.
Concerning the architecture design, the size of the network (the number of layers and nodes) and the type of activation functions are important factors. We proposed an LNN architecture design based on the minimization of AIC. Concretely, LNNs of different sizes were trained with pruning, and the number of hidden nodes and the connection weights between nodes were determined based on the minimization of AIC.
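To make this selection step concrete, the following is a minimal sketch rather than our actual implementation: it assumes a single-hidden-layer network trained with scikit-learn's MLPClassifier as a stand-in for the LNN, omits the pruning stage, and uses illustrative candidate sizes. The AIC is computed as 2k - 2 ln L, with k the number of free weights and biases.

```python
# Illustrative sketch only: AIC-guided choice of the number of hidden nodes
# for a single-hidden-layer network. The pruning stage described in the text
# is omitted; MLPClassifier and the candidate sizes are assumptions made for
# the example, not the implementation used in the paper.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import log_loss

def aic(model, X, y):
    # AIC = 2k - 2 ln L, with k = number of free weights and biases
    n = len(y)
    ln_likelihood = -n * log_loss(y, model.predict_proba(X))
    k = sum(w.size for w in model.coefs_) + sum(b.size for b in model.intercepts_)
    return 2 * k - 2 * ln_likelihood

def select_by_aic(X, y, hidden_sizes=(2, 4, 8, 16, 32)):
    # Train one network per candidate size and keep the one with minimum AIC.
    best_net, best_aic = None, np.inf
    for h in hidden_sizes:
        net = MLPClassifier(hidden_layer_sizes=(h,), max_iter=2000,
                            random_state=0).fit(X, y)
        score = aic(net, X, y)
        if score < best_aic:
            best_net, best_aic = net, score
    return best_net, best_aic
```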
Once the architecture is fixed, the behavior of the trained model depends on the values of the connection weights. It is known, however, that AIC has a large variance, so that with a limited number of training data and in the presence of noise, over-training often becomes a problem. We introduced Tikhonov's regularization to overcome the problem of over-training.
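The role of the regularization term can be seen in a deliberately simplified setting. The sketch below uses a linear least-squares model rather than an LNN; the quadratic penalty lambda * ||w||^2 added to the data-fit term is the same kind of Tikhonov (ridge-type) term added to the training error of the network, shrinking the weights and damping over-training. The function names and the closed-form solution are assumptions made only for illustration.

```python
# Simplified illustration of Tikhonov (ridge-type) regularization on a linear
# least-squares model; for an LNN the same quadratic penalty on the connection
# weights is added to the training error. Names are illustrative only.
import numpy as np

def regularized_loss(w, X, y, lam):
    residual = X @ w - y
    # data-fit term plus Tikhonov penalty lam * ||w||^2
    return residual @ residual + lam * (w @ w)

def ridge_weights(X, y, lam):
    # Closed-form minimizer of the regularized loss: (X'X + lam*I)^(-1) X'y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```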
Finally, we designed an LNN classifier based on the proposed procedure and applied it to a land cover classification problem. Our experimental results illustrate the potential of the proposed design techniques. We believe that the insight gained from this study is complementary to a more general analysis of the generalization of feed-forward layered neural networks based on information statistics.
REFERENCES 
[Akaike, 1974] Akaike, H., 1974. A new look at the statistical 
model identification, IEEE Trans. Automat. Contr., 19(6), 
pp.716-723. 
[Amirikian and Nishimura, 1995] Amirikian, B. and Nishimura, 
H., 1995. What size network is good for generalization of a 
specific task of interest, Neural Networks, 7(2), pp.321-329. 
[Bose and Liang, 1996] Bose, N. K. and Liang, P., 1996. Neural Network Fundamentals with Graphs, Algorithms, and Applications, McGraw-Hill.
[Curran and Hay, 1986] Curran, P. J. and Hay, A. M., 1986. The importance of measurement error for certain procedures in remote sensing at optical wavelengths, Photogramm. Eng. Remote Sensing, 52, pp.229-241.
[Fogel, 1991] Fogel, D. B., 1991. An information criterion for optimal neural network selection, IEEE Transactions on Neural Networks, 2(5), pp.490-497.
[Funahashi, 1989] Funahashi, K., 1989. On the approximate 
realization of continuous mapping by neural networks, Neural 
Networks, 2, pp.183-192. 
[Gallant and White, 1988] Gallant, A. R. and White, H., 1988. There exists a neural network that does not make avoidable mistakes, Proc. Int. Conf. Neural Networks, 1, pp.657-666.
[Hill et al., 1994] Hill, T., Marquez, L., O'Connor, M. and Remus, 
W., 1994. Artificial neural network models for forecasting and 
decision making, Int. Jour. Forecasting, 10, pp.5-15. 
[Hoerl and Kennard, 1970] Hoerl, A. E. and Kennard, R. W., 1970. Ridge regression: Biased estimation for nonorthogonal problems, Technometrics, 12(1), pp.55-67.
[Krogh and Hertz, 1992] Krogh, A. and Hertz, J. A., 1992. A simple weight decay can improve generalization, Advances in Neural Information Processing Systems 4, Moody, J. E., Hanson, S. J. and Lippmann, R. P. eds., Morgan Kaufmann Publishers.
[Mehrotra et al., 1991] Mehrotra, K.G., Mohan, C.K. and Ranka, 
S., 1991. Bounds on the number of samples needed for neural 
learning, IEEE Transactions on Neural Networks, 6, pp.548-558. 
[Mehrotra et al., 1997] Mehrotra, K., Mohan, C. K. and Ranka, S., 
1997. Elements of Artificial Neural Networks, MIT Press. 
[Ruck et al., 1990] Ruck, D. W., Rogers, S. K., Kabrisky, M., 
Oxley, M. E. and Suter, B. W., 1990. The multilayer perceptron 
as an approximation to a Bayes optimal discriminant function, 
IEEE Transactions on Neural Networks, 1(4), pp.296-298. 
[Shimizu, 1996] Shimizu, E., 1996. A theoretical interpretation 
for layered neural network classifier, Jour. JSPRS, 35(4), pp.4-8. 
[Shimohira, 1993] Shimohira, H., 1993. A model selection procedure based on the information criterion with its variance, METR 93-16, University of Tokyo.
[Sietsma and Dow, 1990] Sietsma, J. and Dow, R. J. F., 1990. Creating artificial neural networks that generalize, IEEE Transactions on Neural Networks, 4, pp.67-79.
[Tikhonov et al., 1990] Tikhonov, A. N., Goncharsky, A. V., Stepanov, V. V. and Yagola, A. G., 1990. Numerical Methods for the Solution of Ill-posed Problems, Mathematics and Its Applications, Kluwer Academic Publishers.
[Wan, 1990] Wan, E. A., 1990. Neural network classification: a 
Bayesian interpretation. IEEE Transactions on Neural Networks, 
1(4), pp.303-305. 
[Weigend and Rumelhart, 1991] Weigend, A.S. and Rumelhart, D. 
E., 1991. The effective dimension of the space of hidden units. In 
Proc. IEEE Int. Joint Conf. Neural Network, Singapore, 3, pp. 
2069-2074. 
[Yool et al., 1986] Yool, S. R., Star, J. L., Estes, J. E., Botkin, D. B., Eckhardt, D. W. and Davis, F. W., 1986. Performance analysis of image processing algorithms for classification of natural vegetation in the mountains of Southern California, Int. J. Remote Sensing, 7, pp.683-702.