Generally speaking, $n$ is not larger than $m$; when it is smaller, principal component analysis is used to reduce the dimension from $m$ to $n$. Then, if we arrange the observed variables $v_i$ and the component variables $s_i$ into vectors, a matrix form of (1) can be given by
$v = A s$  (2)
where $v = (v_1, v_2, v_3, \dots, v_m)^T$, $s = (s_1, s_2, s_3, \dots, s_n)^T$, and $A$ is an unknown constant matrix, which is called the mixing matrix. Then we can define a demixing matrix $W$, which gives:
$y = W v$  (3)
Our target is to estimate the independent components $s$ from the observed signals alone, which is equivalent to finding the optimum matrix $W$ in (3), the one that makes $y$, the estimate of the variables $s$, as statistically independent as possible. Because a linear mixture of Gaussian variables is still Gaussian, at most one source in the mixture model is allowed to be of Gaussian type.
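As a concrete illustration of the mixing model (2) and the demixing step (3), the following NumPy sketch is our own example (the sizes, source choices, and the use of the pseudo-inverse are illustrative assumptions, not part of the method):

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, T = 2, 3, 1000                              # n sources, m observed channels, T samples
s = np.vstack([rng.uniform(-1, 1, T),             # non-Gaussian (uniform) source
               np.sign(rng.standard_normal(T))])  # non-Gaussian (binary) source
A = rng.standard_normal((m, n))                   # unknown mixing matrix
v = A @ s                                         # observed mixtures, v = A s  (2)

# Given a demixing matrix W, the source estimate is y = W v  (3).
# Here W is simply the pseudo-inverse of A, for illustration only;
# ICA must estimate W without knowing A.
W = np.linalg.pinv(A)
y = W @ v
print(v.shape, y.shape)                           # (3, 1000) (2, 1000)
```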
In detail, the algorithm falls into three steps (2006). Firstly, a preprocessing step is employed to whiten the mixed data in order to remove the correlation between variables and, if necessary, reduce the data dimension. Secondly, an objective function $L(W)$, taking the demixing matrix $W$ as its argument, is defined to measure the independence of the output variables $y$. Finally, an optimization algorithm is used to find the optimal estimate $\hat{W}$ at which $L(W)$ attains its extremum, and $W$ is set equal to $\hat{W}$. The implementation of ICA can therefore be seen as the combination of an objective function and an optimization algorithm (Hyvärinen, 1999).
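The whitening step can be sketched as follows. This is an illustrative PCA-based whitening written by us, not necessarily the exact implementation used in the experiments: the data are centered, projected onto the principal components, and rescaled so the covariance becomes the identity; dropping the smallest eigenvalues reduces the dimension at the same time.

```python
import numpy as np

def whiten(v, n_keep=None):
    """PCA whitening: return z with identity covariance, optionally reduced to n_keep rows."""
    v = v - v.mean(axis=1, keepdims=True)    # center each channel
    cov = v @ v.T / v.shape[1]               # sample covariance (m x m)
    eigval, eigvec = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]         # re-sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    if n_keep is not None:                   # optional dimension reduction
        eigval, eigvec = eigval[:n_keep], eigvec[:, :n_keep]
    return np.diag(eigval ** -0.5) @ eigvec.T @ v
```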
definition of statistical impendence. In general, there are three
kinds of objective functions: the maximum of non-gaussianity,
the minimum of mutual information and maximum likelihood.
It has been improved that three criterions are equal to each
other in the term of information’s(Hyvarinen, 1997),the
difference between them is laying in the optimization algorithm,
which means different calculating complexity. For non-
gaussianity, kurtosis and negentropy are often used as
performance criterion. For a random variable y with zero
mean, its kurtosis can be defined as follows:
$\mathrm{kurt}(y) = E\{y^4\} - 3\,(E\{y^2\})^2$  (4)
Meanwhile, the negentropy of $y$ is presented in (6): it is the difference between the entropy of a Gaussian random variable with the same standard deviation as $y$ and the entropy of $y$, where the entropy $H$ is defined in (5).
$H(x) = -\sum_i P(x = a_i) \log P(x = a_i)$  (5)

$\mathrm{ne}(y) = H(y_{\mathrm{gauss}}) - H(y)$  (6)
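To make the kurtosis criterion (4) concrete, the short check below (our own illustration, with arbitrary sample sizes) estimates the kurtosis of a Gaussian sample and of a uniform sample; the Gaussian value is near zero, while the sub-Gaussian uniform sample gives a clearly negative value:

```python
import numpy as np

def kurt(y):
    """Sample version of (4): kurt(y) = E{y^4} - 3 (E{y^2})^2 for zero-mean y."""
    y = y - y.mean()
    return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

rng = np.random.default_rng(0)
print(kurt(rng.standard_normal(100_000)))  # ~ 0: Gaussian
print(kurt(rng.uniform(-1, 1, 100_000)))   # ~ -0.13: negative, sub-Gaussian
```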
A large collection of optimization algorithms has been proposed in the last ten years. Among them, the fast and robust ICA algorithm (FastICA) proposed by Hyvärinen (Hyvärinen, 1997) is widely used; it is a fixed-point algorithm based on the maximization of negentropy. Initially, this algorithm was introduced using (6) as the criterion function, and it was subsequently extended to the approximation (7) in order to reduce the computational complexity.
$\mathrm{ne}(y) \approx \left[E\{G(y)\} - E\{G(\nu)\}\right]^2$  (7)

$y = w^T x$  (8)
where $\nu$ is a Gaussian variable of zero mean and unit variance, and $y$ is likewise a variable of zero mean and unit variance. The function $G(\cdot)$ can be practically any non-quadratic function. Two commonly used functions are listed below:
$G_1(y) = \frac{1}{a_1} \log \cosh(a_1 y), \quad a_1 \in [1, 2]$

$G_2(y) = -\exp(-y^2/2)$
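The approximation (7) can be evaluated directly from samples. The sketch below is our own illustration using $G_2$: the negentropy estimate is near zero for a Gaussian sample and clearly positive for a strongly non-Gaussian binary sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def G2(y):
    return -np.exp(-y ** 2 / 2)

def negentropy_approx(y):
    """Approximation (7): ne(y) ~ [E{G(y)} - E{G(nu)}]^2 for unit-variance y."""
    y = (y - y.mean()) / y.std()
    nu = rng.standard_normal(y.size)   # Gaussian reference, zero mean, unit variance
    return (np.mean(G2(y)) - np.mean(G2(nu))) ** 2

print(negentropy_approx(rng.standard_normal(100_000)))           # ~ 0 for a Gaussian
print(negentropy_approx(np.sign(rng.standard_normal(100_000))))  # ~ 0.01, clearly positive
```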
In our experiment, $G_2(y) = -\exp(-y^2/2)$ is used for the iteration. Our target is to find a vector $w$ such that $E\{G(w^T x)\}$ reaches a maximum, which is equivalent to requiring the derivative of $E\{G(w^T x)\}$ with respect to $w$ to be zero. In formula (9), $g(\cdot)$ denotes the derivative of $G(\cdot)$:

$\frac{\partial}{\partial w} E\{G(w^T x)\} = E\{x\, g(w^T x)\} = 0$  (9)
According to the Newton iteration method, the update simplifies to:

$w^{+} = w - \frac{E\{x\, g(w^T x)\}}{E\{g'(w^T x)\}}$
Then we get the following fixed-point algorithm for ICA:
1. Take a random initial vector $w$ of norm 1.
2. Let $w^{+} = E\{x\, g(w^T x)\} - E\{g'(w^T x)\}\, w$.
3. Divide $w^{+}$ by its norm to obtain the new $w$.
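Putting the three steps together, a minimal one-unit FastICA loop might look as follows. This is an illustrative sketch under our own assumptions: the data x are already whitened, the function names are ours, and convergence is tested by the angle between successive w vectors, a detail not spelled out in the text above.

```python
import numpy as np

def g2(y):
    """g(y) = G2'(y) = y * exp(-y^2 / 2)."""
    return y * np.exp(-y ** 2 / 2)

def g2_prime(y):
    """g'(y) = (1 - y^2) * exp(-y^2 / 2)."""
    return (1 - y ** 2) * np.exp(-y ** 2 / 2)

def fastica_one_unit(x, max_iter=200, tol=1e-8, seed=0):
    """One-unit fixed-point FastICA on whitened data x of shape (n, T)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(x.shape[0])
    w /= np.linalg.norm(w)                         # step 1: random w of norm 1
    for _ in range(max_iter):
        y = w @ x
        # step 2: w+ = E{x g(w^T x)} - E{g'(w^T x)} w
        w_new = (x * g2(y)).mean(axis=1) - g2_prime(y).mean() * w
        w_new /= np.linalg.norm(w_new)             # step 3: renormalize
        if 1 - abs(w_new @ w) < tol:               # stop when the direction is stable
            return w_new
        w = w_new
    return w
```

Applied to the whitened output of the earlier sketch, each extracted $w$ gives one estimated independent component $y = w^T x$; extracting several components additionally requires decorrelating the successive $w$ vectors, a step outside this excerpt.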