Proceedings of the Symposium on Global and Environmental Monitoring

2.2 Test Site Data 
In order to assess the classification
accuracy, land use test site data were
prepared. These data cover a 2 km x 10 km
area within the area covered by the
image, and each 25 m x 25 m pixel of the
test site data is assigned a specific
land use code. Fig.2 shows the test site
data and Table 1 shows the categories and
the number of pixels in each category.
As these data are based on land use
categories, the classification classes
do not necessarily coincide with these
categories. In order to assess the
classification accuracies using these
test site data, the 44 categories were
merged into 5 major categories, also
shown in Table 1. Classification
accuracies were evaluated on these 5
major categories.
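The merging step can be illustrated with
a small sketch. The detailed codes and
major category names below are
hypothetical placeholders, not the actual
entries of Table 1; only the idea of
mapping many detailed land use codes onto
a few major categories comes from the
text.

```python
from collections import Counter

# Hypothetical mapping from detailed land use codes to major categories.
# The study merges 44 codes into 5 categories (Table 1); the codes and
# names used here are invented for illustration only.
MERGE_TABLE = {
    11: "urban", 12: "urban",
    21: "paddy",
    31: "forest", 32: "forest",
    41: "water",
}


def merge_categories(codes, table=MERGE_TABLE):
    """Replace each detailed land use code by its major category."""
    return [table[code] for code in codes]


def pixels_per_category(labels):
    """Count the pixels of each category, as tabulated per category."""
    return Counter(labels)


# A tiny made-up strip of test site pixels (detailed codes).
test_site = [11, 12, 12, 21, 31, 32, 32, 41]
merged = merge_categories(test_site)
counts = pixels_per_category(merged)
```

With such a table, accuracies are
computed on the merged labels rather than
on the detailed codes.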
In the course of this study, some
problems with these test site data were
revealed. The largest problem was that
the land use in the test site data is
unbalanced. In order to avoid this
problem, several areas were added to the
test site data. The new test site data
are shown in Fig.3 and the number of
pixels in each category is shown in
Table 2. From now on, the original test
site data are called test site a, while
the modified data are called test site
b.
3 METHODS
3.1 Clustering 
The clustering method used in these
experiments is hierarchical clustering
using Ward's method. C-means clustering
was not used because of the difficulty of
obtaining optimal parameters.
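For readers unfamiliar with Ward's
method, the following is a minimal
sketch, not the implementation used in
the study. It is a naive agglomerative
loop that repeatedly merges the pair of
clusters whose union gives the smallest
increase in the within-cluster sum of
squares. The function name and the sample
values are illustrative assumptions.

```python
def ward_clustering(points, n_clusters):
    """Naive agglomerative clustering with Ward's merge criterion.

    points: list of equal-length tuples of floats (e.g. band values).
    n_clusters: number of clusters at which merging stops.
    Returns a list of clusters, each a sorted list of point indices.
    """
    # Every sample starts as its own cluster: (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in the within-cluster sum of squares if a and b merge:
        # n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2
        na, nb = len(a[0]), len(b[0])
        d2 = sum((x - y) ** 2 for x, y in zip(a[1], b[1]))
        return na * nb / (na + nb) * d2

    while len(clusters) > n_clusters:
        # Find the pair whose merge increases the error sum the least.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a[0]), len(b[0])
        # Centroid of the union is the size-weighted mean of the centroids.
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(a[1], b[1])]
        del clusters[j]  # j > i, so index i stays valid
        clusters[i] = (a[0] + b[0], centroid)

    return [sorted(members) for members, _ in clusters]


# Made-up two-band samples forming two tight pairs and one isolated point.
samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
result = ward_clustering(samples, 3)
```

The only parameter is the final number of
clusters. The loop above is cubic in the
number of samples and is meant only to
show the criterion.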
3.2 Classifier 
A maximum likelihood classifier was used
for classification.
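The paper does not state the form of the
classifier. Assuming the usual Gaussian
maximum likelihood rule with equal prior
probabilities (an assumption, not
confirmed by the text), a pixel vector x
is assigned to the class i that maximizes
the discriminant below, where mu_i and
Sigma_i are the mean vector and
covariance matrix estimated from the
pixels of class i:

```latex
g_i(\mathbf{x}) = -\ln\lvert\Sigma_i\rvert
  - (\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}\,
    \Sigma_i^{-1}\,(\mathbf{x}-\boldsymbol{\mu}_i),
\qquad
\hat{c}(\mathbf{x}) = \arg\max_i \, g_i(\mathbf{x})
```

Under this form, Sigma_i must be
invertible, which requires at least one
more pixel in the class than the number
of bands.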
3.3 Experimental Procedure 
Samples for clustering were taken at the
grid points of an orthogonal grid. The
number of samples in these experiments
varies from 900 (30 x 30) to 2500
(50 x 50).
The final number of clusters was
indicated for each clustering run. After
the clustering, clusters with fewer than
6 pixels were eliminated because of a
restriction of the maximum likelihood
classifier. Therefore, there are two
kinds of cluster number, i.e. the
indicated number and the resulted number.
As for the indicated number, 10 to 160
clusters were tried in the experiments.
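The sampling and elimination steps above
can be sketched as follows. The exact
grid spacing used in the study is not
given, so the evenly spaced indices below
are an assumption, as are the function
names and the toy data.

```python
from collections import Counter


def grid_sample(image, n_rows, n_cols):
    """Take one pixel at each point of a regular n_rows x n_cols grid.

    image: list of rows, each a list of pixel values.
    """
    height, width = len(image), len(image[0])
    # Evenly spaced indices; integer division keeps them inside the image.
    rows = [r * height // n_rows for r in range(n_rows)]
    cols = [c * width // n_cols for c in range(n_cols)]
    return [image[r][c] for r in rows for c in cols]


def surviving_clusters(labels, min_pixels=6):
    """Return the cluster labels that contain at least min_pixels samples."""
    counts = Counter(labels)
    return {label for label, n in counts.items() if n >= min_pixels}


# Toy 6 x 6 image whose pixel value encodes its position (row * 10 + col).
image = [[r * 10 + c for c in range(6)] for r in range(6)]
samples = grid_sample(image, 3, 3)

# Toy clustering result: 3 indicated clusters with 7, 5 and 6 samples.
labels = [0] * 7 + [1] * 5 + [2] * 6
kept = surviving_clusters(labels)
indicated_number = len(set(labels))
resulted_number = len(kept)
```

In this toy case the indicated number is
3 and the resulted number is 2, because
the cluster with 5 samples is dropped.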
The process of assigning each cluster to
a specific land use category is time
consuming and difficult. In order to
avoid variations in classification
accuracy caused by human factors in this
process, an automatic assignment process
was introduced. Two kinds of assignment
process were used in the experiments. One
assigns the target cluster to the
category with the largest number of
classified pixels of that cluster. The
other assigns the target cluster to the
category with the largest percentage of
classified pixels of that cluster. The
former is called area assignment and the
latter percentage assignment hereafter.
4 RESULTS AND DISCUSSIONS
4.1 Results for test site a
Table 3 shows the classification
accuracies evaluated with test site a. In
this table, the rows correspond to the
approximate number of clusters used in
the classification and the columns
correspond to the number of samples used
for clustering. In each column, the
left-hand figures in brackets correspond
to the indicated number of clusters,
while the right-hand figures correspond
to the resulted number of clusters.
Table 3(a) shows area weighted mean
classification accuracies, while (b)
shows arithmetic mean classification
accuracies. From Table 3(a), the
following conclusions were obtained:
(1) Variations in classification
accuracy are very small, i.e. about
3.6% at the maximum.
(2) Almost no definite dependence of
classification accuracy on sample size
or number of clusters can be observed.
Classified results were compared to
verify conclusion (1). Fig.4 shows some
examples of classified results. Fig.4(a)
shows the result of supervised learning
for comparison. Fig.4(b) shows the result
for a sample size of 30 x 30 and 10
clusters, while Fig.4(c) shows the result
for a sample size of 45 x 45 and 74
clusters. Compared to the supervised
result, Fig.4(c) seems far better than
Fig.4(b), and a 3.6% accuracy difference
seems too small. One reason for this can
be considered to be the fact that the
areas of the categories in the test site
data are not balanced. Instead of taking
the area weighted mean of classification
accuracies, an arithmetic mean of
classification accuracies was calculated.
The results are shown in Table 3(b).
The following conclusions can be derived
from this result compared with Table
3(a):
(1) The absolute accuracies have
decreased by about 7% from the area
weighted mean.
(2) If we look carefully at each column
of sample size, there seems to be a peak
in each column.
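To make the evaluation concrete, the
sketch below applies the two automatic
assignment rules of Section 3.3 and then
computes both the area weighted mean and
the arithmetic mean of the category
accuracies. It is an illustration under
stated assumptions, not the authors'
code: the counts are invented, and the
percentage rule is read here as "the
share of a category's test site pixels
that fall in the cluster", which is one
possible interpretation of the text.

```python
def assign_clusters(counts, rule="area"):
    """Map each cluster to a category index.

    counts[k][c] = number of test site pixels of category c that the
    classifier placed in cluster k.
    rule="area": category with the most pixels in the cluster.
    rule="percentage": category for which the cluster holds the largest
    share of that category's pixels (assumed reading of the text).
    """
    n_cat = len(counts[0])
    totals = [sum(row[c] for row in counts) for c in range(n_cat)]
    assignment = []
    for row in counts:
        if rule == "area":
            scores = list(row)
        else:
            scores = [row[c] / totals[c] if totals[c] else 0.0
                      for c in range(n_cat)]
        assignment.append(scores.index(max(scores)))
    return assignment


def category_accuracies(counts, assignment):
    """Per-category accuracy and pixel totals.

    Assumes every category has at least one test site pixel.
    """
    n_cat = len(counts[0])
    totals = [sum(row[c] for row in counts) for c in range(n_cat)]
    correct = [0] * n_cat
    for row, cat in zip(counts, assignment):
        correct[cat] += row[cat]
    return [correct[c] / totals[c] for c in range(n_cat)], totals


def mean_accuracies(acc, totals):
    """Return (area weighted mean, arithmetic mean) of the accuracies."""
    area_weighted = sum(a * n for a, n in zip(acc, totals)) / sum(totals)
    arithmetic = sum(acc) / len(acc)
    return area_weighted, arithmetic


# Invented counts: 3 clusters x 2 categories, category 0 ten times larger.
counts = [[900, 20], [80, 30], [20, 50]]
area_rule = assign_clusters(counts, "area")
pct_rule = assign_clusters(counts, "percentage")
acc, totals = category_accuracies(counts, area_rule)
area_weighted, arithmetic = mean_accuracies(acc, totals)
```

In this invented case the large category
dominates the area weighted mean (about
0.94), while the arithmetic mean (0.74)
exposes the poor accuracy of the small
category.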