a_{j+1}[p] = \sum_{n=-\infty}^{+\infty} h[n-2p]\, a_j[n] = a_j \star \bar{h}[2p] \quad (1)
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B7. Istanbul 2004
d_{j+1}[p] = \sum_{n=-\infty}^{+\infty} g[n-2p]\, a_j[n] = a_j \star \bar{g}[2p] \quad (2)

where \bar{h}[n] = h[-n] and \bar{g}[n] = g[-n]. a_j is the vector of approximation coefficients at scale 2^j, and a_{j+1} and d_{j+1} are respectively the approximation and detail components at scale 2^{j+1}. There are
some necessary and sufficient conditions associated with the
conjugate mirror filters h and g, so that the perfect
reconstruction of signal x can be achieved without losing
information. Figure 2 shows the diagram of a fast wavelet
decomposition calculated with a cascade of filterings with \bar{h} and \bar{g} followed by a factor-2 sub-sampling. Assuming that the length of a_j is N, one may notice that the sub-sampling procedure in the wavelet decomposition shown in Figure 2, which reduces the length of a_{j+1} to N/2, achieves the dimensionality reduction of a_j. In practice, the original signal x in Figure 2 is always expressed as a sequence of coefficients a_0. A multilevel orthogonal wavelet decomposition of a_0 is composed of the wavelet coefficients of signal x at scales 2^0 < 2^j \le 2^J plus the remaining approximation at the largest scale 2^J:

[\{d_j\}_{0 < j \le J},\; a_J] \quad (3)

It is calculated from a_0 by iterating formulas (1) and (2).
Figure 2. Fast orthogonal wavelet decomposition
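The cascade of Figure 2 can be sketched directly from formulas (1) and (2): filter with \bar{h} and \bar{g}, then keep every second sample. The Haar filter pair below is our illustrative assumption; the paper does not fix a particular wavelet.

```python
import math

# Haar conjugate mirror filters (an assumption for illustration;
# any orthogonal filter pair h, g satisfying the perfect
# reconstruction conditions would do)
H = (1 / math.sqrt(2), 1 / math.sqrt(2))    # low-pass h
G = (1 / math.sqrt(2), -1 / math.sqrt(2))   # high-pass g

def dwt_step(a):
    """One cascade stage of Figure 2: formulas (1) and (2) followed by
    factor-2 sub-sampling, halving the length of the input vector."""
    approx = [H[0] * a[2 * p] + H[1] * a[2 * p + 1] for p in range(len(a) // 2)]
    detail = [G[0] * a[2 * p] + G[1] * a[2 * p + 1] for p in range(len(a) // 2)]
    return approx, detail

a0 = [4.0, 4.0, 2.0, 0.0]
a1, d1 = dwt_step(a0)   # a1 and d1 each have length N/2 = 2
```

Because h and g are conjugate mirror filters, a_0 can be recovered exactly from (a_1, d_1); for Haar, a_0[2p] = (a_1[p] + d_1[p]) / sqrt(2).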
2.2 Linear Wavelet Feature Extraction
The sub-sampling shown in Figure 2 motivates us to reduce the
dimensionality of hyperspectral data by wavelet decomposition.
Firstly, the wavelet decomposition of (1) and (2) is implemented on the hyperspectral data, and then only the first M scaling and wavelet coefficients at the larger scales 2^j \ge 2^l are selected as features. One may prove that the selected features [\{d_j\}_{j \ge l},\; a_J] are useful for data representation. Because the linear approximation of x computed from the large-scale wavelet coefficients is equivalent to a finite element approximation over uniform grids, we call this method Linear Wavelet Feature Extraction (Linear WFE).
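As a minimal sketch, Linear WFE amounts to iterating the decomposition and keeping only the coarsest coefficients, in a fixed coarse-to-fine order that does not depend on the spectrum itself. The Haar filter and the toy values of levels and M are our assumptions.

```python
import math

def haar_step(a):
    """One level of formulas (1)-(2) with the Haar filter pair (an assumption)."""
    s = math.sqrt(2)
    return ([(a[2 * p] + a[2 * p + 1]) / s for p in range(len(a) // 2)],
            [(a[2 * p] - a[2 * p + 1]) / s for p in range(len(a) // 2)])

def linear_wfe(x, levels, M):
    """Keep the first M coefficients of the multilevel decomposition,
    ordered coarse-to-fine: [a_J, d_J, d_{J-1}, ...]. The selection
    depends only on a coefficient's position, never on its value."""
    a, details = list(x), []
    for _ in range(levels):
        a, d = haar_step(a)
        details.append(d)
    coarse_to_fine = a + [c for d in reversed(details) for c in d]
    return coarse_to_fine[:M]

# 8-band toy "spectrum" reduced to 4 features
features = linear_wfe([1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 8.0, 8.0], levels=2, M=4)
```

The truncation to the first M entries is exactly why large-amplitude fine-scale coefficients are discarded, which the next paragraph identifies as the weakness of the linear method.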
In this method, the large amplitude wavelet coefficients at small
scales would not be selected as features. However, the wavelet
coefficients with large amplitudes are generated by the
singularities of the spectral curve, which may carry important information for representation or classification. Hsu (2003) suggested that the approximation and detail components at each scale of the linear WFE should be combined to extract
better features of hyperspectral images for classification. This
can be done by non-linear wavelet feature extraction.
2.3 Non-Linear Wavelet Feature Extraction
The linear WFE method, which selects the M wavelet coefficients at the larger scales independently of the original spectrum x, can be improved by choosing the M wavelet coefficients depending on x. This can be done by sorting the coefficients [\{d_j\},\; a_J] calculated by the multilevel orthogonal wavelet decomposition in decreasing order. Then the M largest
amplitude wavelet coefficients are selected as the important
features of x for classification. The non-linear approximation
calculated from the M largest amplitude wavelet coefficients
including the approximation and detail information can be
interpreted as an adaptive grid approximation, where the
approximation scale is refined in the neighborhood of
singularities (Mallat, 1999). Thus this feature extraction method
based on the non-linear approximation is called Non-Linear
Wavelet Feature Extraction (Non-linear WFE).
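A minimal sketch of the non-linear selection, again under our Haar-filter assumption: all coefficients of the decomposition are sorted by amplitude and the M largest are kept, so the surviving set now varies from spectrum to spectrum and concentrates near singularities.

```python
import math

def haar_step(a):
    """One level of formulas (1)-(2) with the Haar filter pair (an assumption)."""
    s = math.sqrt(2)
    return ([(a[2 * p] + a[2 * p + 1]) / s for p in range(len(a) // 2)],
            [(a[2 * p] - a[2 * p + 1]) / s for p in range(len(a) // 2)])

def nonlinear_wfe(x, levels, M):
    """Sort all coefficients [{d_j}, a_J] by amplitude and keep the M
    largest. Positions are kept alongside values, since which
    coefficients survive depends on the spectrum x itself."""
    a, coeffs = list(x), []
    for j in range(levels):
        a, d = haar_step(a)
        coeffs += [((j + 1, p), v) for p, v in enumerate(d)]   # detail d_{j+1}
    coeffs += [(("approx", p), v) for p, v in enumerate(a)]    # a_J
    coeffs.sort(key=lambda kv: abs(kv[1]), reverse=True)
    return coeffs[:M]

# a step-shaped toy spectrum: the retained coefficients cluster
# around the two discontinuities (the adaptive-grid behaviour)
top = nonlinear_wfe([0.0, 0.0, 0.0, 9.0, 9.0, 9.0, 0.0, 0.0], levels=2, M=3)
```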
2.4 Best Basis Feature Extraction
2.4.1 Wavelet Packets: Wavelet packets were introduced
by Coifman et al. (1992) by generalizing the link between
multiresolution approximations and wavelets. In the orthogonal
wavelet decomposition algorithm described in Section 2.1, only
the approximation coefficients are split iteratively into a vector
of approximation coefficients and a vector of detail coefficients
at a coarser scale. In the wavelet packet case, each detail coefficient vector is also decomposed into two parts using the same approach as for the approximation vector. This
recursive splitting shown in Figure 3 defines a complete binary
tree of wavelet packet spaces where each parent node is divided
in two orthogonal subspaces. The nodes of the binary tree
represent the subspaces of a signal with different time-
frequency localization characteristics. Any node in the binary
tree can be labelled by (j, p), where 2^j is the scale and p is the
number of nodes that are on its left at the same scale. Suppose that we have already constructed a wavelet packet space W_j^p and its orthogonal basis B_j^p = \{\psi_j^p(t - 2^j n)\}_{n \in \mathbb{Z}} at node (j, p).
The two successor wavelet packet orthogonal bases at the
children nodes are defined by the splitting relations (Coifman et
al., 1992; Mallat, 1999):
\psi_{j+1}^{2p}(t) = \sum_{n=-\infty}^{+\infty} h[n]\, \psi_j^p(t - 2^j n) \quad (4)

\psi_{j+1}^{2p+1}(t) = \sum_{n=-\infty}^{+\infty} g[n]\, \psi_j^p(t - 2^j n) \quad (5)

One may prove that B_{j+1}^{2p} = \{\psi_{j+1}^{2p}(t - 2^{j+1} n)\}_{n \in \mathbb{Z}} and B_{j+1}^{2p+1} = \{\psi_{j+1}^{2p+1}(t - 2^{j+1} n)\}_{n \in \mathbb{Z}} are orthonormal bases of two orthogonal spaces W_{j+1}^{2p} and W_{j+1}^{2p+1} such that

W_{j+1}^{2p} \oplus W_{j+1}^{2p+1} = W_j^p \quad (6)
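The recursive splitting of Figure 3 can be sketched as a dictionary keyed by the node labels (j, p), again under our Haar-filter assumption. Unlike the transform of Figure 2, the detail vector at every node is split as well, producing the full binary tree.

```python
import math

def haar_step(a):
    """One orthogonal split with the Haar filter pair (an assumption)."""
    s = math.sqrt(2)
    return ([(a[2 * p] + a[2 * p + 1]) / s for p in range(len(a) // 2)],
            [(a[2 * p] - a[2 * p + 1]) / s for p in range(len(a) // 2)])

def packet_tree(x, depth):
    """Full binary tree of wavelet packet coefficients. Node (j, p)
    holds the coefficients of subspace W_j^p; its children (j+1, 2p)
    and (j+1, 2p+1) come from the splitting relations (4) and (5)."""
    tree = {(0, 0): list(x)}
    for j in range(depth):
        for p in range(2 ** j):
            low, high = haar_step(tree[(j, p)])
            tree[(j + 1, 2 * p)] = low        # W_{j+1}^{2p}
            tree[(j + 1, 2 * p + 1)] = high   # W_{j+1}^{2p+1}
    return tree

tree = packet_tree([1.0, 2.0, 3.0, 4.0], depth=2)
```

Because each split is orthogonal, as stated by equation (6), the energy of x is preserved across every level of the tree, which is what allows a best basis to be chosen among the nodes afterwards.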