
[The first column of this page is illegible in the scanned source. It introduces dimensionality reduction of feature vectors by feature selection and feature extraction; the transformed feature vector x is obtained from the original feature vector y by a linear transformation

x = A y \qquad (1)

and the quality of a feature set is measured by a criterion function J, often based on probabilistic distance measures.]

In feature selection the best feature subset could in principle be found by
evaluating the value of the criterion function for all possible combinations, but this is usually too time consuming. For example, if the dimension of the original feature space is 20 and the dimension of the transformed feature space is 10, the number of subsets to evaluate is the binomial coefficient C(20, 10) = 184756. The only optimal search algorithm which implicitly evaluates all possible subsets is the branch-and-bound algorithm. Other, non-optimal algorithms are, for example, sequential forward selection and sequential backward selection.
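To make the search procedures concrete, the following sketch implements sequential forward selection for an arbitrary criterion function J; sequential backward selection works analogously, starting from the full feature set and removing one feature at a time. The code is an illustration only: the use of Python/NumPy, the function names and the toy variance criterion are assumptions, not part of the paper.

```python
import numpy as np

def sequential_forward_selection(J, d, m):
    """Greedy forward selection: start from the empty set and repeatedly
    add the feature whose inclusion gives the largest value of the
    user-supplied criterion function J."""
    selected, remaining = [], list(range(d))
    while len(selected) < m:
        best = max(remaining, key=lambda f: J(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage with a stand-in criterion: total variance of the chosen
# features in random data (a real application would use a class
# separability measure instead).
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 20)) * rng.uniform(0.5, 3.0, size=20)
J = lambda subset: data[:, subset].var(axis=0).sum()
print(sequential_forward_selection(J, d=20, m=10))
```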
The problem in feature extraction is to choose a good transformation matrix A. Usually the transformation is limited to a linear form. Matrix A can be determined by using the same criterion functions as in feature selection: the optimal matrix A is chosen so that the criterion function J(Ay) is maximized. Usually the maximization is only possible by numerical optimization, which is quite time consuming. Another alternative in feature extraction is to use the Karhunen-Loève transformation, which is discussed in the next chapter (Devijver, 1982).
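A minimal sketch of such a criterion-based feature extraction is given below. The Fisher-type separability criterion, the use of scipy.optimize and all names are assumptions made for illustration; the paper does not prescribe a particular criterion or optimizer.

```python
import numpy as np
from scipy.optimize import minimize

def extract_features(X, labels, m):
    """Search for an m x d matrix A that maximizes a Fisher-type criterion
    J(A) = trace((A St A^T)^-1 (A Sb A^T)), where St is the total scatter
    and Sb the between-class scatter of the training data X (one sample
    per row). This is an assumed criterion, chosen only as an example."""
    labels = np.asarray(labels)
    d = X.shape[1]
    mean = X.mean(axis=0)
    St = np.cov(X, rowvar=False)
    Sb = np.zeros((d, d))
    for c in np.unique(labels):
        diff = X[labels == c].mean(axis=0) - mean
        Sb += np.mean(labels == c) * np.outer(diff, diff)

    def neg_J(a):                                   # minimize the negative criterion
        A = a.reshape(m, d)
        within = A @ St @ A.T + 1e-9 * np.eye(m)    # small ridge for numerical stability
        return -np.trace(np.linalg.solve(within, A @ Sb @ A.T))

    a0 = np.random.default_rng(0).normal(size=m * d)
    result = minimize(neg_J, a0, method="BFGS")     # numerical gradients by default
    return result.x.reshape(m, d)
```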
3. KARHUNEN-LOÈVE TRANSFORMATION
The Karhunen-Loève transformation is based on the discrete Karhunen-Loève expansion. In this transformation the original information is preserved as well as possible by approximating the original feature vector with a truncated expansion.
3.1 Karhunen-Loève expansion
We have d-dimensional random vectors y, and we can represent these vectors without error as a sum of d linearly independent basis vectors:

y = \sum_{i=1}^{d} x_i \phi_i \qquad (2)

where the x_i are the coefficients of the basis vectors \phi_i. The basis vectors are orthonormal:

\phi_i^T \phi_j = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \neq j \end{cases} \qquad (3)
In this case, the components of vector x can be computed by

x_i = \phi_i^T y, \quad i = 1, \ldots, d \qquad (4)
So x is an orthonormal transformation of y. If we want to decrease the dimensionality of the feature space from d to m (m < d), we simply do not use all of the basis vectors \phi_i but select the best ones. The best basis vectors minimize the mean squared error between the original vector y and its approximation \hat{y}. The mean squared error can be written as
\bar{\varepsilon}^2 = \sum_{i=m+1}^{d} \phi_i^T X \phi_i \qquad (5)
We notice that only the discarded basis vectors affect the error. Matrix X is the autocorrelation matrix of y; the covariance matrix can also be used. The optimal basis vectors are those which satisfy

X \phi_i = \lambda_i \phi_i \qquad (6)

i.e. the eigenvectors of X. Combining equations (5) and (6), the mean squared error becomes
\bar{\varepsilon}^2 = \sum_{i=m+1}^{d} \lambda_i \qquad (7)
The mean squared error is minimized when the discarded eigenvectors correspond to the smallest eigenvalues (Devijver, 1982; Fukunaga, 1990).
3.2 Summary of the Karhunen-Loève transformation
The presented results can be put into algorithmic form:
1. Compute the correlation or covariance matrix X of y.
2. Compute the eigenvalues and corresponding eigenvectors of X. Normalize the eigenvectors.
3. Form the transformation matrix A from the m eigenvectors corresponding to the largest eigenvalues of X.
4. Compute the transformed feature vectors using equation (1) (Tou, 1974).
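As an illustration, these four steps can be written out in a few lines of Python/NumPy (a sketch; the routine names and the numerical check are not from the paper). The check at the end confirms equation (7): the mean squared reconstruction error equals the sum of the discarded eigenvalues.

```python
import numpy as np

def karhunen_loeve(Y, m):
    """Karhunen-Loeve transformation of the feature vectors in Y
    (one d-dimensional vector per row) down to m dimensions."""
    # 1. Covariance matrix X of y (the correlation matrix could be used instead).
    X = np.cov(Y, rowvar=False)
    # 2. Eigenvalues and eigenvectors of X; eigh returns normalized eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(X)
    # 3. Transformation matrix A from the m eigenvectors with largest eigenvalues.
    order = np.argsort(eigvals)[::-1]
    A = eigvecs[:, order[:m]].T                  # shape (m, d)
    # 4. Transformed feature vectors x = A y, one per row.
    return Y @ A.T, A, eigvals[order]

# Numerical check of equation (7) on zero-mean synthetic data.
rng = np.random.default_rng(1)
Y = rng.normal(size=(5000, 5)) @ rng.normal(size=(5, 5))
Y -= Y.mean(axis=0)
Xt, A, lam = karhunen_loeve(Y, m=3)
mse = ((Y - Xt @ A) ** 2).sum(axis=1).mean()     # reconstruction error
print(mse, lam[3:].sum())                        # approximately equal
```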
4. SELF-ORGANIZING NEURAL NETWORKS 
An artificial neural network (referred as neural network 
after this) is a parallel, distributed signal or information 
processing system, consisting of simple processing 
elements, also called nodes. Processing elements can 
possess a local memory and carry out localized 
information processing operations. In the simplest case 
processing element sums weighted inputs and passes the 
result through nonlinear transfer function. Processing 
elements are connected via unidirectional signal 
channels called connections. The connections are usually 
weighted and those weights are adapted during training 
of the network. Learning of the network is based on the 
adaptation of the weights. 
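For the simplest case described above, a single processing element could be sketched as follows; the logistic transfer function and the example values are assumptions chosen for illustration.

```python
import numpy as np

def processing_element(inputs, weights, bias=0.0):
    """Simplest processing element: a weighted sum of the inputs passed
    through a nonlinear transfer function (here the logistic sigmoid)."""
    activation = np.dot(weights, inputs) + bias      # weighted sum of inputs
    return 1.0 / (1.0 + np.exp(-activation))         # nonlinear transfer function

# Example: three inputs; in a network the weights would be adapted during training.
print(processing_element(np.array([0.2, -1.0, 0.5]),
                         np.array([0.8, 0.1, -0.4])))
```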
Neural network models can be characterized by properties such as their connection topologies, processing element capabilities, learning algorithms and problem solving capabilities, and the models can differ greatly. Neural networks try to imitate a biological nervous system and its properties, such as memory and learning (Lippmann, 1987).