After this procedure has finished, the final classification is performed using the resulting data set and the 1-NN classifier. With the so-called multiediting approach, the number of samples can be further reduced by repeating the above editing procedure until no editing occurs. In /DevKit80/ both theoretical and experimental evidence is given for the most favorable result that the above procedure converges asymptotically (in the number of iterations) to the Bayes decision rule.
After the multiediting algorithm the data is nicely clustered (because all the misclassified samples have been rejected). The 1-NN classifier creates a piecewise linear decision boundary, which is an approximation of the Bayes boundary. This boundary is actually defined by a small subset of samples lying on the outer borders of the clusters. It is clear that only these sample points are needed by the 1-NN classifier. The aim of condensing is to find such representative points.
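The paper does not name the condensing algorithm itself; the sketch below uses Hart-style condensing as one plausible realization and reuses nn1_predict from the sketch above.

    def condense(X, y, seed=0):
        """Keep only the samples a 1-NN classifier needs to reproduce the
        labels of the whole (edited) set (Hart-style condensing, assumed)."""
        rng = np.random.default_rng(seed)
        order = list(rng.permutation(len(X)))
        store = [order.pop(0)]                     # start from a single prototype
        changed = True
        while changed:                             # sweep until no sample is absorbed
            changed = False
            for i in list(order):
                pred = nn1_predict(X[store], y[store], X[i:i + 1])[0]
                if pred != y[i]:                   # misclassified -> add to the store
                    store.append(i)
                    order.remove(i)
                    changed = True
        return X[store], y[store]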
So, after multiediting and condensing, a very powerful 1-NN classifier can be used: it approximates the Bayes decision boundary and is very fast to compute. Our k-NN classifier has been implemented using this technique.
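Put together, and with X_train, y_train and X_test as placeholder arrays rather than data from this project, the classifier could be used roughly as follows.

    # Illustrative pipeline: edit, condense, then classify with plain 1-NN.
    Xe, ye = multiedit(X_train, y_train, n_folds=3)
    Xc, yc = condense(Xe, ye)
    y_pred = nn1_predict(Xc, yc, X_test)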
3.2 The Average Learning Subspace Method
The subspace methods are reported in /Oja83/. A subspace of the n-dimensional pattern space is spanned by p linearly independent, orthonormal basis vectors u_i; the dimension of the subspace is then p. From the point of view of classification, a subspace restricts the possible directions in it, while the lengths remain undetermined. Suppose we have M classes in the classification problem, each represented by a subspace with dimensionality p_i. Usually p_i is much lower than the original dimension. The simple classification rule then states that if the distance between the pattern vector x and subspace i is smaller than that between the pattern vector and subspace j, then x is classified into class i. The distance from a subspace can be computed as follows:
d(x, L) = |x|^2 - \sum_{j=1}^{p} (u_j^T x)^2 ,

where u_j are the orthonormal basis vectors.
Since |x|^2 is the same for each class, it can be dropped, and the classification rule then consists only of inner products. So, if the subspace dimensions are small, the classifier is very fast.
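A minimal sketch of this rule, assuming each class subspace is given as a matrix of orthonormal column vectors (the function name subspace_classify is ours): maximizing the sum of squared inner products is equivalent to minimizing d(x, L_i).

    import numpy as np

    def subspace_classify(x, bases):
        """bases[i] is an (n, p_i) matrix whose orthonormal columns span
        the subspace of class i; return the index of the closest subspace."""
        scores = [np.sum((U.T @ x) ** 2) for U in bases]   # sum_j (u_j^T x)^2
        return int(np.argmax(scores))                      # largest projection wins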
The essential question in the subspace method is how to actually construct the class subspaces to obtain optimal performance. The CLAFIC method /Watana69/ forms each subspace by minimizing the mean square error of the distances of a training set. This can be shown to be equivalent to maximizing
\sum_{j=1}^{p} u_j^T C u_j ,

where C is the autocorrelation matrix.
The eigenvalue decomposition solves this problem, and usually the first few eigenvectors span the subspace.
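A sketch of this construction under the above definitions (function name and interface are our own):

    def clafic_subspace(X_class, p):
        """Span the class subspace with the p leading eigenvectors of the
        class autocorrelation matrix C = (1/N) * sum_k x_k x_k^T."""
        C = (X_class.T @ X_class) / len(X_class)
        w, V = np.linalg.eigh(C)                      # eigenvalues in ascending order
        return V[:, np.argsort(w)[::-1][:p]]          # p leading eigenvectors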
A serious drawback of the CLAFIC method is that each class subspace, although depending on the statistics of the class, is formed independently of the other classes. So, two classes overlapping each other may be very hard to discriminate. This leads to the ALSM method, which adapts itself better to this situation by learning. In the ALSM method, the autocorrelation matrices are updated according to misclassifications. This means that if either a sample vector of class i is misclassified into another class (j), or a sample vector of another class (k) is misclassified into class i, the learning phase of the classifier updates the autocorrelation matrices of each class by:
R_m = R_{m-1} + \sum \alpha S_m^{(i,j)} - \sum \beta S_m^{(k,i)} .
Here α and β are small constants, and should be small enough to avoid overshooting. In this learning phase, the subspaces are iterated sufficiently long that the classification of the training set becomes stable. The algorithm can be proven to converge /Oja83/. The constants α and β were both set to 0.005 in this project.
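The sketch below implements one reading of this update rule: for every misclassified sample, α times the outer product x xᵀ is added to the matrix of the true class and β times it is subtracted from the matrix of the wrongly winning class, and the subspaces are recomputed after each pass. The epoch structure and the function names are our assumptions, not details from the paper.

    def alsm_train(X, y, dims, alpha=0.005, beta=0.005, max_epochs=100):
        """ALSM sketch: start from the class autocorrelation matrices and
        update them on misclassifications until the training set is stable."""
        X, y = np.asarray(X, float), np.asarray(y)
        classes = sorted(set(y))
        R = {c: (X[y == c].T @ X[y == c]) / np.sum(y == c) for c in classes}
        for _ in range(max_epochs):
            bases = {c: leading_eigvecs(R[c], dims[c]) for c in classes}
            stable = True
            for x, true in zip(X, y):
                scores = {c: np.sum((bases[c].T @ x) ** 2) for c in classes}
                pred = max(scores, key=scores.get)
                if pred != true:
                    xxT = np.outer(x, x)
                    R[true] += alpha * xxT            # reinforce the true class
                    R[pred] -= beta * xxT             # penalize the wrong winner
                    stable = False
            if stable:                                # training set classified correctly
                break
        return {c: leading_eigvecs(R[c], dims[c]) for c in classes}

    def leading_eigvecs(R, p):
        """p leading eigenvectors of a (possibly updated) correlation matrix."""
        w, V = np.linalg.eigh(R)
        return V[:, np.argsort(w)[::-1][:p]]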
4. SUMMARY OF THE RESULTS
The test material consists of two 1:15 000 aerial images, both digitized in 4500*4500 format, one SPOT scene from a rural area, and one SPOT scene from an urban area. Two independent test sites were created. One was used as a training sample and consisted of approximately 1300 samples for each class. The other was used as an independent test set and consisted of approximately 5000 samples for each class. The classification tried to separate five classes: urban or residential areas, two types of forest areas, pasture land or parks, and fields. In the case of multichannel imagery, the textural descriptors were computed from channel 3 (infrared).
The practical classification was performed using a window of 31*31 pixels in the aerial images and a window of 13*13 pixels in the SPOT images.
Subspaces vs. extracted ad hoc features
Some preliminary runs were made to get an idea of whether there is any difference between using the traditional feature descriptors of the co-occurrence statistics and the Fourier power spectrum, and the strategy of using these descriptors as such. These tests were performed on the basis of the aerial images only.
The dimensional reduction achieved by an orthogonal transformation was surprisingly high. The dimension of the final subspace was usually 5-10, although the original dimension was much higher.