nonlinear methods, and kernel methods provide a new approach to feature extraction. Several researchers have studied kernel-based feature extraction methods for hyperspectral data, such as kernel principal component analysis (KPCA) and kernel Bhattacharyya feature extraction (KBFE) (Lu, 2005).
In the feature space H, the Fisher discriminant function can be defined as

J_\phi(w) = \frac{w^T S_b^\phi w}{w^T S_w^\phi w}    (4)

where w is a nonzero vector.
In 2000, Generalized Discriminant Analysis (GDA) was put forward by Baudat (Baudat et al., 2000). It is the nonlinear extension of Linear Discriminant Analysis and has been successfully applied to face recognition (Gao et al., 2004) and mechanical fault classification (Li, 2003). In this paper, we first introduce the mathematical model and the solution of GDA and apply this method to extract features from hyperspectral images. Then we carry out experiments with two groups of hyperspectral images acquired by different kinds of hyperspectral imaging systems. Finally, the results are analyzed. The main contents are described in detail as follows.
2. GENERALIZED DISCRIMINANT ANALYSIS
By mapping samples from the input space to a high-dimensional feature space, we can carry out linear feature extraction methods in that feature space. Because the dimension of the feature space is very large, and may even be infinite, we use kernel functions to compute the inner products in the feature space, so that the mapped samples never need to be handled explicitly.
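As a minimal sketch of this kernel trick (assuming a Gaussian radial basis function kernel and numpy, both illustrative choices rather than part of the method described here), the following snippet evaluates the matrix of pairwise feature-space inner products directly from the input samples:

import numpy as np

def rbf_kernel_matrix(X, sigma=1.0):
    """Gram matrix K[i, j] = k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2*sigma^2)).

    X is an (N, n) array of N samples with n spectral bands. The inner
    products <phi(x_i), phi(x_j)> in the feature space H are obtained from
    the kernel alone; the mapping phi is never computed explicitly.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))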
2.1 Theory of Feature Extraction Based on GDA
Suppose there are C classes of samples, belonging to ω_1, ω_2, ..., ω_C, and each original sample x has n dimensions, so x ∈ R^n. We map the sample x into a higher-dimensional feature space H by the mapping φ, so that in the feature space x becomes φ(x) ∈ H. If all the samples are mapped into the feature space H, the intraclass scatter matrix S_w^φ, the interclass scatter matrix S_b^φ and the total scatter matrix S_t^φ of the training samples are described as follows:
S_w^\phi = \frac{1}{N} \sum_{i=1}^{C} \sum_{j=1}^{N_i} \left( \phi(x_j^i) - m_i^\phi \right) \left( \phi(x_j^i) - m_i^\phi \right)^T    (1)

S_b^\phi = \sum_{i=1}^{C} P(\omega_i) \left( m_i^\phi - m_0^\phi \right) \left( m_i^\phi - m_0^\phi \right)^T    (2)

S_t^\phi = \frac{1}{N} \sum_{j=1}^{N} \left( \phi(x_j) - m_0^\phi \right) \left( \phi(x_j) - m_0^\phi \right)^T    (3)
where N_i is the number of training samples belonging to class ω_i and N is the number of all training samples. In the feature space H, φ(x_j^i) is sample j (j = 1, ..., N_i) of class i (i = 1, ..., C), φ(x_j) is sample j (j = 1, ..., N) of all the samples, m_i^φ = E{φ(x) | ω_i} is the mean of the samples in class i, and m_0^φ = Σ_{i=1}^{C} P(ω_i) m_i^φ is the mean of all the samples. S_w^φ, S_b^φ and S_t^φ are all nonnegative matrices.
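To make Equations (1)-(3) concrete, the sketch below computes the three scatter matrices in the input space R^n (an illustration only; in the feature space H they are never formed explicitly, and the class priors P(ω_i) are assumed here to be estimated as N_i/N):

import numpy as np

def scatter_matrices(X, y):
    """Intraclass, interclass and total scatter matrices of Eqs. (1)-(3),
    evaluated in the input space for illustration.

    X: (N, n) training samples; y: (N,) integer class labels.
    """
    N, n = X.shape
    m0 = X.mean(axis=0)                                   # overall mean
    Sw = np.zeros((n, n))
    Sb = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                              # class mean m_i
        D = Xc - mc
        Sw += D.T @ D / N                                 # Eq. (1) contribution
        Sb += (len(Xc) / N) * np.outer(mc - m0, mc - m0)  # Eq. (2), P(omega_i) = N_i/N
    D0 = X - m0
    St = D0.T @ D0 / N                                    # Eq. (3); equals Sw + Sb here
    return Sw, Sb, St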
In the feature space H, Generalized Discriminant Analysis (GDA) seeks a group of discriminant vectors (w_1, ..., w_d) that maximize the Fisher discriminant function (4) and are mutually orthogonal:

w_i^T w_j = 0, \quad \forall i \neq j; \; i, j = 1, \ldots, d

The first discriminant vector w_1 of GDA is also the Fisher discriminant vector, i.e. the eigenvector corresponding to the maximal eigenvalue of the eigen-equation S_b^φ w = λ S_w^φ w. If the first r discriminant vectors w_1, ..., w_r are known, the (r+1)-th discriminant vector w_{r+1} can be obtained by solving the following optimization problem:

Model I:    \max_{w \in H} J_\phi(w), \quad \text{s.t. } w_j^T w = 0, \; j = 1, \ldots, r    (5)
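As a rough sketch of this step (not the paper's algorithm), the leading discriminant vectors can be obtained from the eigen-equation S_b^φ w = λ S_w^φ w with a generalized symmetric eigensolver; the ridge term is a numerical assumption, and the extra orthogonality constraints of Model I (which would require a further deflation step) are not enforced here:

import numpy as np
from scipy.linalg import eigh

def fisher_directions(Sb, Sw, d, reg=1e-6):
    """Eigenvectors of Sb w = lambda Sw w for the d largest eigenvalues.

    `reg` adds a small ridge so that Sw is positive definite (an assumption
    made only for numerical stability, not part of the model).
    """
    n = Sw.shape[0]
    vals, vecs = eigh(Sb, Sw + reg * np.eye(n))   # generalized eigenproblem
    order = np.argsort(vals)[::-1]
    return vecs[:, order[:d]]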
According to the theory of the reproducing kernel Hilbert space, the eigenvectors are linear combinations of the mapped training samples in H, so w can be expressed as

w = \sum_{i=1}^{N} \alpha^i \phi(x_i) = \phi \alpha    (6)

where φ = (φ(x_1), ..., φ(x_N)), α = (α^1, ..., α^N)^T, and α is the optimal kernel discriminant vector. The projection of a sample φ(x) in the feature space onto the direction w is

w^T \phi(x) = \alpha^T \phi^T \phi(x) = \alpha^T \xi_x    (7)

where ξ_x = (k(x_1, x), k(x_2, x), ..., k(x_N, x))^T. For a sample x ∈ R^n, ξ_x is the kernel sample vector relating x to x_1, x_2, ..., x_N, so the kernel matrix is

K = (\xi_{x_1}, \xi_{x_2}, \ldots, \xi_{x_N})
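For instance, Equation (7) can be evaluated as below once α is known; the kernel function passed in is an illustrative assumption, not fixed by the method:

import numpy as np

def project_sample(x, X_train, alpha, kernel):
    """Projection of phi(x) onto the direction w via Eq. (7):
    w^T phi(x) = alpha^T xi_x, with xi_x = (k(x_1, x), ..., k(x_N, x))^T.
    """
    xi_x = np.array([kernel(x_i, x) for x_i in X_train])  # kernel sample vector
    return float(alpha @ xi_x)

Here `kernel` can be any symmetric two-argument kernel function, for example a Gaussian kernel consistent with the Gram matrix sketch above.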
In the feature space H, the mean of each class and the mean of all the samples can also be projected onto the direction w:

w^T m_i^\phi = \alpha^T \phi^T \left( \frac{1}{N_i} \sum_{k=1}^{N_i} \phi(x_k^i) \right) = \alpha^T p_i    (8)

w^T m_0^\phi = \alpha^T \phi^T \left( \frac{1}{N} \sum_{k=1}^{N} \phi(x_k) \right) = \alpha^T p_0    (9)

where

p_i = \frac{1}{N_i} \sum_{k=1}^{N_i} \xi_{x_k^i}    (10)

p_0 = \frac{1}{N} \sum_{k=1}^{N} \xi_{x_k}    (11)
According to Equations (8), (10) and (11), we have

w^T S_b^\phi w = \alpha^T K_b \alpha    (12)

w^T S_w^\phi w = \alpha^T K_w \alpha    (13)