In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009
The evaluation of these two algorithms for clustering the data
sets into three clusters (ground, tree, and building) is depicted in
figure 2. Figures 2c and 2d show the k-means clustering results
and figures 2e and 2f show the artificial bee colony algorithm
clustering results in the two evaluation areas. Building class regions
are highlighted in red and vegetation class regions in green
in figure 2. Visual inspection shows that the vegetation
class is directly associated with trees, bushes or forest and the
building class is mainly associated with building regions.
4.1 Accuracy Assessment
Comparative studies on clustering algorithms are difficult due
to the lack of universally agreed-upon quantitative performance
evaluation measures. Many similar works in clustering use the
classification error as the final quality measurement (Zhong and
Ghosh, 2003); so in this research, we adopt a similar approach.
In this paper, a confusion matrix is used to compare the true labels
with the labels returned by the clustering algorithms as the
quality assessment measure. If some ground truth is available,
the relation between the "true" classes and the classification
result can be quantified. The same principle can be applied to
clusters. Usually a much higher number of clusters is then
related to the given ground truth classes to examine the quality
of the clustering algorithm. From the confusion matrix we
calculate the Kappa coefficient (Cohen, 1960). Although the
accuracy measurements described above, namely the overall
accuracy, producer's accuracy, and user's accuracy, are quite
simple to use, they are based only on the principal diagonal,
columns, or rows of the confusion matrix, and so do not
use the complete information in the confusion matrix. A
multivariate index called the Kappa coefficient (Tso and
Mather, 2009) overcomes these limitations. The Kappa
coefficient uses all of the information in the confusion matrix so
that the chance allocation of labels is taken into
consideration. The Kappa coefficient is defined by:
\[
\kappa = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}{N^{2} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})} \qquad (4)
\]
In this equation, $\kappa$ is the Kappa coefficient, r is the number of
columns (and rows) in the confusion matrix, $x_{ii}$ is entry (i, i) of the
confusion matrix, $x_{i+}$ and $x_{+i}$ are the marginal totals of row i and
column i, respectively, and N is the total number of
observations (Tso and Mather, 2009).
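As a check on equation (4), the Kappa coefficient can be recomputed directly from a confusion matrix. The following is a minimal sketch, not the authors' code: the function name `kappa_coefficient` and the variable names are illustrative assumptions, and the matrix values are the k-means residential counts from Table 2 without the Total row and column.

```python
def kappa_coefficient(matrix):
    """Kappa coefficient of a square confusion matrix given as a list of rows."""
    r = len(matrix)                                   # number of classes
    n = sum(sum(row) for row in matrix)               # total observations N
    diagonal = sum(matrix[i][i] for i in range(r))    # sum of x_ii
    row_totals = [sum(row) for row in matrix]         # x_i+
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]  # x_+i
    # Chance-agreement term: sum over i of (x_i+ * x_+i)
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))
    return (n * diagonal - chance) / (n * n - chance)


# k-means, residential area (Table 2); class order: Building, Tree, Ground
kmeans_residential = [
    [64338, 1551, 338],
    [3561, 58692, 5930],
    [54341, 10509, 290740],
]

print(round(kappa_coefficient(kmeans_residential), 4))  # 0.6927
```

The result agrees with the Kappa coefficient of 0.6927 reported for k-means in Table 2. Because the row and column marginals enter the formula only through their products, transposing the matrix leaves the value unchanged.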
Table 2 shows the confusion matrix and Kappa coefficient of the
k-means and artificial swarm bee colony clustering algorithms for the
residential dataset. The confusion matrix and Kappa coefficient
of the k-means and artificial swarm bee colony clustering algorithms
for the industrial dataset are presented in Table 3.
Comparing the counts in each class reveals a striking difference
between the k-means result and the artificial swarm bee colony
algorithm result. For the two classes of major interest in this study, the
building class and tree class, the differences are quite
significant. Visual interpretation clearly indicates that the
building class of k-means includes not only building areas but
also regions related to roads, which supports the smaller number
of counts of the artificial swarm bee colony method being more
precise. Similarly, the higher number of counts for the tree class,
indicating (3D) vegetation regions (trees, bushes), obtained with
the artificial swarm bee colony algorithm is supported
by visual interpretation. Overall, the artificial bee
colony algorithm outperforms the k-means clustering
algorithm, as can be observed from the Kappa coefficient.
Table 2. Confusion matrix and Kappa coefficient of k-means
and artificial swarm bee colony algorithms in residential area.

k-means
                        Reference Data
              Building      Tree    Ground     Total
Building         64338      1551       338     66227
Tree              3561     58692      5930     68183
Ground           54341     10509    290740    355590
Total           122240     70752    297008    490000
Kappa coefficient = 0.6927

Bee algorithms
                        Reference Data
              Building      Tree    Ground     Total
Building        114602      3471      5686    123759
Tree              2124     61123      6144     69391
Ground            4214      7558    285078    296850
Total           120940     72152    296908    490000
Kappa coefficient = 0.8916
Table 3. Confusion matrix and Kappa coefficient of k-means
and artificial swarm bee colony algorithms in industrial area.

k-means
                        Reference Data
              Building      Tree    Ground     Total
Building         26878      2168      1108     30154
Tree               187      3707       105      3999
Ground           16443     12879    139025    168347
Total            43508     18754    140238    202500
Kappa coefficient = 0.584

Bee algorithms
                        Reference Data
              Building      Tree    Ground     Total
Building         39528      1158      2097     42783
Tree               839     15641      1290     17770
Ground            3842      3483    134622    141947
Total            44209     20282    138009    202500
Kappa coefficient = 0.866
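The overall, producer's, and user's accuracies mentioned in Section 4.1 can be derived from the same confusion matrices. The sketch below is illustrative only and is not taken from the paper: it assumes that columns hold the reference data and rows hold the cluster labels (the layout suggested by the "Reference Data" heading), and the names `accuracy_measures`, `kmeans_industrial`, and `bee_industrial` are hypothetical. The counts are those of Table 3 without the Total row and column.

```python
def accuracy_measures(matrix):
    """Return (overall, producer's, user's) accuracies of a square confusion matrix.

    Assumed layout: rows = clustering result, columns = reference data.
    """
    r = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Overall accuracy: share of observations on the principal diagonal.
    overall = sum(matrix[i][i] for i in range(r)) / n
    # Producer's accuracy: diagonal entry divided by its reference (column) total.
    producers = [matrix[i][i] / sum(matrix[k][i] for k in range(r)) for i in range(r)]
    # User's accuracy: diagonal entry divided by its classified (row) total.
    users = [matrix[i][i] / sum(matrix[i]) for i in range(r)]
    return overall, producers, users


classes = ["Building", "Tree", "Ground"]

# Industrial area (Table 3)
kmeans_industrial = [
    [26878, 2168, 1108],
    [187, 3707, 105],
    [16443, 12879, 139025],
]
bee_industrial = [
    [39528, 1158, 2097],
    [839, 15641, 1290],
    [3842, 3483, 134622],
]

for name, matrix in [("k-means", kmeans_industrial), ("Bee algorithms", bee_industrial)]:
    overall, producers, users = accuracy_measures(matrix)
    print(f"{name}: overall accuracy = {overall:.4f}")
    for label, p, u in zip(classes, producers, users):
        print(f"  {label}: producer's = {p:.4f}, user's = {u:.4f}")
```

Each of these measures uses only the diagonal together with one set of marginals, which is why the Kappa coefficient is reported alongside them. If the tables are in fact oriented the other way round, the producer's and user's values simply swap roles.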
5. CONCLUSION
This paper presented and tested a new clustering method
based on the artificial bee colony algorithm for extracting
buildings and trees from LIDAR data. The method employs
the artificial swarm bee colony algorithm to search for the set
of cluster centres that minimizes a given clustering metric.
One of the advantages of this method is that it does not
become trapped at locally optimal solutions. This is due to the
ability of the artificial swarm bee colony algorithm to perform
local and global search simultaneously. Experimental results
for different LIDAR data sets have demonstrated that the
artificial swarm bee colony algorithm performs better
than the k-means algorithm. One of the
drawbacks of the artificial swarm bee colony
algorithm, however, is the number of tunable parameters it
employs.
6. ACKNOWLEDGMENT
The authors would like to thank Dr. Michael Hahn from
Stuttgart University of Applied Sciences for providing the data
set used in the paper.