In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009
COMPLEX SCENE ANALYSIS IN URBAN AREAS BASED ON
AN ENSEMBLE CLUSTERING METHOD APPLIED ON LIDAR DATA
P. Ramzi*, F. Samadzadegan
Dept. of Geomatics Engineering, Faculty of Engineering, University of Tehran, Tehran, Iran -
(samadz, pramzi)@ut.ac.ir
Commission III, WG III/4
KEY WORDS: LIDAR, Feature, Object, Extraction, Training, Fusion, Urban, Building
ABSTRACT:
3D object extraction is one of the main research interests in photogrammetry and computer vision and has many applications. In recent
years, airborne laser scanning has been accepted as an effective 3D data collection technique for extracting spatial object models
such as digital terrain models (DTM) and building models. Data clustering, also known as unsupervised learning, is one of the key
techniques in object extraction and is used to understand the structure of unlabeled data. Classical clustering methods such as k-means
attempt to subdivide a data set into subsets or clusters. Many recent studies have attempted to improve the
performance of clustering. In this paper, the boost-clustering algorithm, a novel clustering methodology that exploits the
general principles of boosting, is implemented and evaluated on features extracted from LiDAR data. This method is a
multi-clustering technique in which, at each iteration, a new training set is created using weighted random sampling from the original
dataset and a simple clustering algorithm such as k-means is applied to provide a new data partitioning. The final clustering solution
is produced by aggregating the weighted multiple clustering results. This clustering methodology is used for the analysis of complex
scenes in urban areas by extracting three object classes (buildings, trees and ground) from LiDAR datasets. Experimental
results indicate that boost-clustering using k-means as its underlying training method provides improved performance and accuracy
compared to the simple k-means algorithm.
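The iteration summarized in the abstract (weighted resampling, a simple base clustering, weighted aggregation) can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in this paper: the per-point quality score, the round weight, the weight update rule and the center-matching step are illustrative assumptions, and the names `boost_cluster`, `kmeans_centers` and `align` are hypothetical. It assumes k >= 2 and points given as equal-length tuples.

```python
import itertools
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_centers(points, k, rng, max_iter=50):
    """Base clusterer: plain k-means, returns a list of k centers."""
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: sq_dist(p, centers[j]))].append(p)
        # An empty cluster keeps its previous center.
        new = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

def align(centers, reference):
    """Reorder centers so that position j best matches reference[j].
    Brute force over permutations, so only suitable for small k."""
    return list(min(itertools.permutations(centers),
                    key=lambda perm: sum(sq_dist(c, r)
                                         for c, r in zip(perm, reference))))

def boost_cluster(points, k, rounds=10, seed=0):
    rng = random.Random(seed)
    n = len(points)
    weights = [1.0 / n] * n                # sampling weights, uniform at start
    votes = [[0.0] * k for _ in range(n)]  # accumulated weighted votes
    reference = None
    for _ in range(rounds):
        # New training set: weighted random sampling with replacement.
        sample = rng.choices(points, weights=weights, k=n)
        centers = kmeans_centers(sample, k, rng)
        # Cluster indices are arbitrary in each round; match them to round one.
        reference = reference or centers
        centers = align(centers, reference)
        labels, quality = [], []
        for p in points:
            d = [math.sqrt(sq_dist(p, c)) for c in centers]
            j = min(range(k), key=d.__getitem__)
            second = min(d[m] for m in range(k) if m != j)
            labels.append(j)
            # Assumed quality score in [0, 1]: 1 means clearly inside one cluster.
            quality.append(1.0 - d[j] / second if second > 0 else 0.0)
        round_weight = sum(quality) / n    # assumed weight of this partition
        for i in range(n):
            votes[i][labels[i]] += round_weight
        # Assumed update: poorly clustered points become more likely to be sampled.
        weights = [w * math.exp(-q) for w, q in zip(weights, quality)]
        total = sum(weights)
        weights = [w / total for w in weights]
    # Final solution: weighted vote over all rounds.
    return [max(range(k), key=v.__getitem__) for v in votes]
```

Calling `boost_cluster(features, 3)` on a list of per-point feature tuples would return one label in {0, 1, 2} per point. Which label corresponds to buildings, trees or ground is not determined by the sketch and would have to be assigned afterwards.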
1. INTRODUCTION
Airborne laser scanning, also known as LiDAR, has proven to be
a suitable technique for collecting 3D information of the ground
surface. The high density and accuracy of these surface points
have encouraged research in processing and analyzing the data
to develop automated processes for feature extraction, DEM
generation, object recognition and object reconstruction. In
LiDAR systems, data is collected strip-wise and usually in four
bands: range and intensity for both the first and the last pulse
(Arefi et al., 2004). Clustering is a method of object extraction
whose goal is to reduce the amount of data by categorizing or
grouping similar data items together. It is an instance of
unsupervised learning (Dulyakarn and Rangsanseri, 2001).
Clustering groups the patterns by defining and quantifying
similarities between the individual data points or patterns.
Patterns that are most similar to one another are assigned to
the same cluster. Generally, clustering algorithms can be
categorized into iterative square-error partitional clustering,
hierarchical clustering, grid-based clustering and density-based
clustering (Pedrycz, 1997; Jain et al., 2000).
The best-known partitional clustering algorithm is k-means,
which partitions the data set into k subsets such that all points
in a given subset are closest to the same center. The algorithm
first randomly selects k of the instances to represent the clusters.
Based on the selected attributes, all remaining instances are
assigned to their closest center. K-means then computes the new
centers by taking the mean of all data points belonging to the
same cluster. The operation is iterated until there is no change
in the gravity centers. If k is not known ahead of time,
various values of k can be evaluated until the most suitable one
is found. The effectiveness of this method, as well as of others,
relies heavily on the objective function used in measuring the
distance between instances. The difficulty is in finding a
distance measure that works well with all types of data (Jane
and Dubes, 1995). Several attempts have been made to
improve the performance of the k-means algorithm, such as
using the Mahalanobis distance to detect hyper-ellipsoidal
clusters or using a fuzzy criterion function, which results in the
fuzzy c-means algorithm (Bezdek and Pal, 1992). A few authors
have proposed methods that apply the idea of boosting to
clustering (Frossyniotis et al., 2004; Saffari and Bischof, 2007;
Liu et al., 2008).
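The k-means procedure described above (random selection of k instances, assignment to the closest center, recomputation of the means, iteration until the centers stop changing) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: points are equal-length tuples, the distance is squared Euclidean, and the names `kmeans` and `sq_dist` as well as the `max_iter` and `seed` parameters are introduced here for illustration only.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means; returns (centers, labels)."""
    rng = random.Random(seed)
    # Randomly select k of the instances to represent the clusters.
    centers = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Assign every instance to its closest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # empty cluster keeps its center
        # Stop when the gravity centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

For example, `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2)` separates the three points near the origin from the three points near (10, 10). On less well-separated data the result depends on the random initial centers, which is one reason the ensemble approaches cited above combine several runs.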
1.1 Related Work
Boosting is a general and provably effective method which
attempts to boost the accuracy of any given learning algorithm
by combining rough and moderately inaccurate classifiers
(Freund and Schapire, 1999). The difficulty of using boosting in
clustering is that, in the classification case, it is straightforward
to determine whether a basic classifier performs well with
respect to a training point, while in the clustering case this task
is difficult because the label of the cluster to which a training
point actually belongs is unknown (Frossyniotis et al., 2004).
Frossyniotis et al. (2004) used the same concept, with two
different performance measures for assessing the clustering
quality. They incorporated an approach very similar to the one
used in the original Discrete AdaBoost
* Corresponding author.
