2. FEATURE EXTRACTION

2.1 Description of Data

We can get training data from an image to derive the characteristics
of most classes included in the image. 
We denote hyper-dimensional data (of dimension N) by a vector y = (y_1, ..., y_N)' (': transpose), and suppose that they are classified into one of, say, n classes. Then y can be decomposed into the class mean y_a and the within-class dispersion y_e; that is, y is written as

    y_ij = y_a_i + y_e_ij    (i = 1 ~ n, j = 1 ~ m_i)    (1)

(see Fig. 1), where y_ij is the j-th datum of class i. We write the covariance matrices of y, y_a and y_e as C_yy, C_a and C_e, respectively, and call C_a and C_e the between-class and within-class covariance matrices. Here we assume that the covariance matrix of each class is identical. This assumption is rather reasonable from the viewpoint of the generality of training data (Fujimura, 1981).
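As a concrete illustration, the quantities in Eq. (1) can be estimated from labeled training samples as in the following minimal numpy sketch; the array layout and the normalization by the total sample count are our own assumptions, not taken from the paper.

```python
import numpy as np

def class_statistics(samples):
    """Estimate the quantities in Eq. (1) from labeled training data.
    samples: list of (m_i, N) arrays, one per class.
    Returns the class means y_a_i, the pooled within-class covariance
    C_e, and the between-class covariance C_a."""
    all_data = np.vstack(samples)                        # every y_ij stacked
    grand_mean = all_data.mean(axis=0)
    means = np.array([s.mean(axis=0) for s in samples])  # y_a_i

    n_total, N = all_data.shape
    C_e = np.zeros((N, N))
    C_a = np.zeros((N, N))
    for s, mu in zip(samples, means):
        resid = s - mu                                   # y_e_ij = y_ij - y_a_i
        C_e += resid.T @ resid                           # within-class scatter
        d = (mu - grand_mean)[:, None]
        C_a += s.shape[0] * (d @ d.T)                    # between-class scatter
    return means, C_e / n_total, C_a / n_total
```

Pooling the within-class scatter across classes is justified by the assumption above that every class shares the same covariance matrix.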
Fig. 1 Description of data
2.2 Feature Extraction 
Here, for simplicity, we consider the two cases in which one or two most important classes should be discriminated from all the other classes.
In general, classification accuracy increases as the separability* of classes increases, so we use separability to evaluate the performance of the extracted features. We extract the features which maximize the separability of the particular pair of classes that we wish to discriminate.
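For reference, the divergence adopted as the distance measure (see the footnote below) can be written down directly for two Gaussian classes. The sketch below is our reading, not code from the paper: under the equal-covariance assumption made above, the divergence reduces to the squared Mahalanobis distance between the class means.

```python
import numpy as np

def divergence(mu1, C1, mu2, C2):
    """Symmetric (Kullback) divergence between two Gaussian classes."""
    C1i, C2i = np.linalg.inv(C1), np.linalg.inv(C2)
    d = (mu1 - mu2)[:, None]
    return (0.5 * np.trace((C1 - C2) @ (C2i - C1i))
            + 0.5 * np.trace((C1i + C2i) @ d @ d.T))

def distance_equal_cov(mu1, mu2, C):
    """With identical class covariances (the assumption made here),
    the divergence reduces to the squared Mahalanobis distance."""
    d = mu1 - mu2
    return float(d @ np.linalg.solve(C, d))
```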
The method proposed here consists of two processing steps: pre-processing and feature extraction.
In the pre-processing, the hyper-dimensional data y = (y_1, ..., y_N)' are reduced and normalized to m (m << N) components z = (z_1, ..., z_m)' by a linear transformation z = A'y. From the assumption on C_e, the within-class dispersion of each class in the original space has the same ellipsoidal shape, as shown in Fig. 1. After the transformation, they are normalized into an m-dimensional sphere. This makes the space uniform: the distance measured in terms of variance has no directionality in the space.
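The paper does not spell out how A is constructed. A natural choice, assumed here, is a whitening transformation built from the eigendecomposition of the within-class covariance C_e:

```python
import numpy as np

def whitening_matrix(C_e, m):
    """Build A so that z = A'y keeps m components and gives every class
    a spherical (identity) within-class covariance (a sketch; the exact
    construction of A is not given in the paper)."""
    eigval, eigvec = np.linalg.eigh(C_e)        # eigenvalues in ascending order
    keep = np.argsort(eigval)[::-1][:m]         # retain the m largest
    V, L = eigvec[:, keep], eigval[keep]
    return V / np.sqrt(L)                       # A = V L^(-1/2), shape (N, m)

# usage: A = whitening_matrix(C_e, m); z = A.T @ y
```

Since A'C_eA = I on the retained components, the within-class dispersion of every class becomes a unit sphere, which is exactly the uniformity described above.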
In the second step, features are successively extracted (Kiyasu, 1993; Fujimura, 1994) until there remains no class whose distance from the particular classes is less than the minimum distance obtained so far.
  
*We used the divergence (Kullback, 1959) as a measure of separability. We call it distance in the rest of this paper.
   
Feature extraction is done by determining a sub-space of the feature space: that is, by forming a linear combination of z as a'z, where a is an m-dimensional weight vector which we call a feature vector here. Thus, feature extraction is nothing other than the determination of a feature vector. As the space is now uniform, the direction of the optimal feature vector discriminating between two classes is obtained simply by connecting the centers of those classes. The feature vectors obtained are orthogonalized to make them independent.

The procedure for determining successive feature vectors is as follows (a code sketch is given after the list):
(1) First, we set an optimal feature vector a_1 between the two nearest classes among the prescribed classes.

(2) Next, we evaluate the separability on a_1 for all combinations of the prescribed classes.

(3) If any pair of prescribed classes does not have enough separability, we set an additional feature vector a_2 between them. We ortho-normalize the new vector a_2 with a_1 as shown in Fig. 2, so that this feature is independent of the first one.

(4) Features are successively extracted in the same way until all the distances among the prescribed classes are larger than the minimum distance obtained so far.

(5) Then we apply procedures (2)~(4) to the distances between the prescribed classes and the other classes.
When only one class is prescribed, the procedure starts by setting a feature vector between that class and its nearest class in the feature space.
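The procedure might be rendered as follows. This is a simplified sketch: the separability is taken as the squared distance measured on the extracted features in the whitened space, step (5) is only indicated in a comment, and none of the function or variable names come from the paper.

```python
import numpy as np

def extract_features(means, prescribed, min_dist):
    """Successive feature extraction in the whitened space, a simplified
    rendering of steps (1)-(5). means: (n, m) array of class means;
    prescribed: indices of the classes to be discriminated (two or more;
    with one class, seed with its nearest neighbor instead, as described
    above); min_dist: required separability. Returns ortho-normalized
    feature vectors a_1, a_2, ..."""
    vectors = []

    def add_vector(i, j):
        a = means[i] - means[j]            # optimal direction: connect centers
        for b in vectors:                  # Gram-Schmidt against earlier
            a = a - (b @ a) * b            # vectors (cf. Fig. 2)
        vectors.append(a / np.linalg.norm(a))

    def separability(i, j):
        # squared distance between classes i and j measured on the
        # subspace spanned by the current feature vectors
        d = means[i] - means[j]
        return sum(float(a @ d) ** 2 for a in vectors)

    # (1) feature vector between the two nearest prescribed classes
    pairs = [(i, j) for i in prescribed for j in prescribed if i < j]
    add_vector(*min(pairs,
                    key=lambda p: np.linalg.norm(means[p[0]] - means[p[1]])))

    # (2)-(4) add vectors until all prescribed pairs are separable enough
    for i, j in pairs:
        if separability(i, j) < min_dist:
            add_vector(i, j)

    # (5) would repeat the loop above over pairs formed from one
    # prescribed class and one of the remaining classes.
    return vectors
```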
A feature a_i'z is equivalent to (A a_i)'y in terms of the original data y, because z = A'y; here (A a_i) gives the weighting factors applied to the spectral data.
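In code this back-projection is a one-liner (continuing the sketches above, where A is the assumed pre-processing matrix and a_i one of the feature vectors):

```python
w_i = A @ a_i   # weights on the original N channels: (a_i' z) == (w_i' y)
```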
Fig. 2 Feature vectors discriminating between two classes
3. EXPERIMENTAL RESULTS OF FEATURE EXTRACTION
We acquired data for five growth states of tree leaves (A~E: from young to fallen), soil, stone and concrete by using an imaging spectrometer which we developed. The sensor yielded 411-dimensional data, which we used for the experiments. For estimating the mean and the variance of each class, 45 training data were used per class.