Figure 7. Classification accuracy vs. dimensionality [8]
This is termed the Hughes phenomenon (named after G. F. Hughes) or the peaking phenomenon [8]. The explanation for this behaviour is that, for a fixed sample size, as the number of features is increased, with a corresponding increase in the number of unknown parameters to be estimated, the separability of the classes may increase, yet the resulting classification accuracy degrades, as shown in Figure 7.
For a linear classifier, the number of training samples should be proportional to the number of features for reasonable parameter estimation; for a quadratic classifier, it should be proportional to the square of the number of features.
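To make the trade-off concrete, the following Python sketch simulates the peaking behaviour under stated assumptions: two Gaussian classes, a quadratic Gaussian ML classifier, a fixed training set of 50 samples per class, and an illustrative per-band mean shift of 0.25. The sample sizes and function names are hypothetical, not taken from [8].

import numpy as np

rng = np.random.default_rng(0)
n_train, n_test = 50, 2000          # fixed training size, large test set

def qda_accuracy(d):
    """Accuracy of a quadratic (Gaussian ML) classifier trained on
    n_train samples per class using d features."""
    m0, m1 = np.zeros(d), np.full(d, 0.25)   # small shift in every band
    Xtr = [rng.normal(m, 1.0, (n_train, d)) for m in (m0, m1)]
    mus = [x.mean(axis=0) for x in Xtr]
    covs = [np.cov(x, rowvar=False) + 1e-6 * np.eye(d) for x in Xtr]
    Xte = np.vstack([rng.normal(m, 1.0, (n_test, d)) for m in (m0, m1)])
    yte = np.repeat([0, 1], n_test)

    def loglik(mu, cov):
        diff = Xte - mu
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (logdet + np.einsum('ij,ij->i', diff,
                                          np.linalg.solve(cov, diff.T).T))

    scores = np.stack([loglik(m, c) for m, c in zip(mus, covs)], axis=1)
    return (scores.argmax(axis=1) == yte).mean()

for d in (2, 5, 10, 20, 40):
    print(f"bands = {d:3d}   test accuracy = {qda_accuracy(d):.3f}")

With the training set held fixed, accuracy typically rises with the first few bands (separability grows) and then falls as the estimated covariance matrices become unreliable.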
The methods and techniques that address the issue of high data dimensionality can be broadly categorized as dealing with
» Data reduction, or
» Classification.
Some of the methods/techniques addressing these issues are described in the following sections.
12. DATA REDUCTION
Commonly used data reduction techniques are principal component analysis (PCA), the Fisher discriminant, and the Maximum Noise Fraction (MNF), also known as Noise-Adjusted PCA (NAPC). The difficulties associated with these are:
» PCA concentrates variance without reference to class separability (illustrated in the sketch after this list).
» In the Fisher discriminant, if the difference in the class means is small, the selected features may not be reliable; if one mean vector is very different from the other mean vectors, the between-class covariance matrix may not be representative of all classes.
» In the case of MNF, the noise covariance matrix must be available or approximated.
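The first two points can be illustrated with a small scikit-learn sketch; the two-band geometry below is contrived for the purpose and is not from the cited sources. The classes are separated along one axis but share a much larger common variance along the other, so the direction of maximum variance found by PCA carries no class information, while the Fisher discriminant, which uses the labels, recovers the separating direction.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
n = 500
# Two classes separated along band 1, with much larger *common*
# variance along band 2: variance alone favours band 2,
# separability favours band 1.
X0 = rng.normal([0.0, 0.0], [1.0, 10.0], size=(n, 2))
X1 = rng.normal([3.0, 0.0], [1.0, 10.0], size=(n, 2))
X = np.vstack([X0, X1])
y = np.repeat([0, 1], n)

pca = PCA(n_components=1).fit(X)
lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)

print("PCA direction   :", pca.components_[0])   # ~ [0, 1]: noise axis
print("Fisher direction:", lda.scalings_[:, 0])  # ~ [1, 0]: class axis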
The projection pursuit method overcomes the limitations
mentioned above in the process of data reduction.
12.1 Projection Pursuit [7]: The original idea of projection pursuit is to select potentially interesting projections by local optimization, over projection directions, of some index of
interestingness. Projection pursuit is the numerical optimization of a criterion in search of the most interesting low-dimensional linear projection of a high-dimensional data cloud: it selects an "interesting" lower-dimensional projection from high-dimensional data by maximizing or minimizing a function called the "projection index".
The idea behind the projection is to maximize the separability of the classes in the projected lower-dimensional space. The separability between classes can be defined in a Euclidean, weighted Euclidean, or Mahalanobis manner. An initially assumed value of A is changed iteratively such that the projection index I, which is a function of the projected samples

Y = A^T X                                    (3)

is optimized.
Projection pursuit thus computes the A that optimizes the projection index I(A^T X).
Figure 8. Schematic of the projection pursuit processor [7]: the data are projected as Y = A^T X using an initial guess for the matrix A; parameters are estimated in the projected subspace; A is recomputed so that I(A^T X) is optimized; the loop repeats until convergence, and the final projection Y = A^T X is the output.
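A minimal numerical sketch of this loop follows. The Fisher-ratio projection index, the Nelder-Mead optimizer, and the one-dimensional projection are illustrative choices made here, not the exact processor of [7].

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
d = 10                                    # original dimensionality
X0 = rng.normal(0.0, 1.0, (200, d))       # class 0
X1 = rng.normal(0.3, 1.0, (200, d))       # class 1, shifted in every band

def projection_index(a):
    """Fisher-style index of the 1-D projection y = a^T x:
    squared between-class distance over within-class spread."""
    a = a / np.linalg.norm(a)             # only the direction matters
    y0, y1 = X0 @ a, X1 @ a
    return (y0.mean() - y1.mean()) ** 2 / (y0.var() + y1.var())

a0 = rng.normal(size=d)                   # initial guess for A
res = minimize(lambda a: -projection_index(a), a0, method='Nelder-Mead',
               options={'maxiter': 20000, 'fatol': 1e-10})

print("index at initial guess:", round(projection_index(a0), 4))
print("index after pursuit   :", round(projection_index(res.x), 4))

The optimizer plays the role of the "recompute A" box in Figure 8: each evaluation projects the data with a candidate A and scores the result with the index.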
13. CLASSIFICATION
13.1 Hierarchical Multi-classifier System [9]: Application of the classification methods discussed previously is not straightforward due to the large number of bands. Two approaches can be used to mitigate the curse of dimensionality: one is to transform the input space into a manageably small feature space, for example by projection pursuit; the other is to design a decision tree. Manual construction of a decision tree is not difficult if the number of input features is not very large, but with high data dimensionality manual construction is extremely difficult. The alternative is to use a method that can uncover the domain knowledge automatically from the data.
A modular learning approach is therefore used: instead of a single classifier, a system of classifiers is employed. Through the learning process, the set of classes is automatically decomposed into two meta-classes, and the decomposition is repeated on the resulting meta-classes until each contains a single class; a feature extractor and a classifier are associated with each two-class problem.
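A sketch of the automatic decomposition step follows; the 2-means split of the class mean spectra used here is an illustrative stand-in for the meta-class partitioning of [9], not the cited method itself.

import numpy as np
from sklearn.cluster import KMeans

def build_hierarchy(class_means, classes):
    """Recursively split a set of classes into two meta-classes by
    clustering their mean spectra; returns a nested binary tree."""
    if len(classes) == 1:
        return classes[0]                       # leaf: a single class
    km = KMeans(n_clusters=2, n_init=10, random_state=0)
    labels = km.fit_predict(class_means[classes])
    left = [c for c, l in zip(classes, labels) if l == 0]
    right = [c for c, l in zip(classes, labels) if l == 1]
    return (build_hierarchy(class_means, left),
            build_hierarchy(class_means, right))

# Example: 6 land-cover classes with random 50-band mean spectra
rng = np.random.default_rng(3)
means = rng.normal(size=(6, 50))
print(build_hierarchy(means, list(range(6))))

Each internal node of the resulting tree corresponds to a two-class problem, to which its own feature extractor and classifier can then be attached.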
13.2 Adaptive Classifier. This approach overcomes the small-training-sample limitation of the MLC by adaptively improving the estimation of the class statistics. The idea is to improve the estimates using, in addition to the already available training samples, samples labeled by the classifier itself. Semi-labeled samples produced by a classifier can be used either on their own or together with the labeled samples.
Advantages: the use of semi-labeled samples improves the estimation of the statistics and reduces the number of labeled training samples required. As the estimates of the class distributions improve, the classifier improves, and hence results in better classification accuracy. The adaptive classifier therefore needs only a small labeling effort. A weighting is used while updating the statistics, which reduces the undesired influence of mislabeled samples.
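A sketch of one statistics-update step under stated assumptions follows: two Gaussian classes, a quadratic ML labeling pass, and a fixed down-weighting factor w for semi-labeled samples. The function names and the weighting scheme are illustrative, not the cited design.

import numpy as np

def enhance_statistics(X_lab, y_lab, X_unlab, mus, covs, w=0.3):
    """One update of class means/covariances: labeled samples get full
    weight, classifier-labeled (semi-labeled) samples get weight w."""
    def loglik(X, mu, cov):
        diff = X - mu
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (logdet + np.einsum('ij,ij->i', diff,
                                          np.linalg.solve(cov, diff.T).T))
    # Label the unlabeled samples with the current Gaussian ML classifier
    scores = np.stack([loglik(X_unlab, m, c) for m, c in zip(mus, covs)],
                      axis=1)
    y_semi = scores.argmax(axis=1)
    new_mus, new_covs = [], []
    for k in range(len(mus)):
        Xk = np.vstack([X_lab[y_lab == k], X_unlab[y_semi == k]])
        wk = np.concatenate([np.ones((y_lab == k).sum()),
                             np.full((y_semi == k).sum(), w)])
        mu = np.average(Xk, axis=0, weights=wk)
        diff = Xk - mu
        new_mus.append(mu)
        new_covs.append((wk[:, None] * diff).T @ diff / wk.sum())
    return new_mus, new_covs

# Tiny demo: 10 labeled samples per class plus 400 unlabeled samples
rng = np.random.default_rng(4)
X_lab = np.vstack([rng.normal(0, 1, (10, 5)), rng.normal(1, 1, (10, 5))])
y_lab = np.repeat([0, 1], 10)
X_unlab = np.vstack([rng.normal(0, 1, (200, 5)), rng.normal(1, 1, (200, 5))])
mus = [X_lab[y_lab == k].mean(axis=0) for k in (0, 1)]
covs = [np.cov(X_lab[y_lab == k], rowvar=False) + 1e-3 * np.eye(5)
        for k in (0, 1)]
mus, covs = enhance_statistics(X_lab, y_lab, X_unlab, mus, covs)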