COMPARISON OF CLUSTERING TECHNIQUES APPLIED TO LASER DATA
H. Arefi'*, M. Hahn!, F. Samadzadegan”, J. Lindenberger?
‘Dept. of Geomatics, Computer Science and Mathematics, Stuttgart University of Applied Sciences, Stuttgart, Germany
(hossein.arefi, michael.hahn)@hft-stuttgart.de
"Dept. of Geomatics, Faculty of Engineering, University of Tehran, Tehran, Iran, samadz Gut.ac ir
?TopScan, Rheine, Germany, lindenbergerG topscan.de
Working Group IV/7
KEY WORDS: LIDAR, First and Last Pulse data, Clustering, Quality assessment, Unsupervised learning, Feature Extraction
ABSTRACT:
During the last decade airborne laser scanning has become a mature technology which is now widely accepted for 3D data collection.
Automated processes employ the scanned laser data and the platform orientation and other parameters of the scanning system to
generate 3D point coordinates. These 3D points represent the terrain surface as well as objects on top of the terrain surface. Modern
airborne LIDAR systems are able to record first pulse and last pulse range measurements together with the signal strength to provide
more information about the reflecting surface or object.
The main goal of this paper is to investigate and compare procedures for clustering of LIDAR data. Classical clustering methods
refer to a variety of methods that attempt to subdivide a data set into subsets or clusters. The study aims at a comparison of K-means
clustering, competitive learning networks and fuzzy C-means clustering applied on range laser images. For comparison the confusion
matrix concept is employed. The accuracy evaluation is done qualitatively and quantitatively. Experimental investigations are using
LIDAR data taken from a scanning project in which the density of scanned points is around one point per square meter.
1. INTRODUCTION
Airborne laser scanning is an established technology for highly
automated acquisition of digital surface models (DSM).
Furthermore, in recent years LIDAR data has become as a
highly acknowledged data source for interactive mapping of 3D
man-made and natural objects from the physical earth’s surface.
The dense and accurate recording of surface points has
encouraged research in processing and analysing the data to
develop automated processes for feature extraction, object
recognition and object reconstruction.
LIDAR data recoded with first and last pulse range and
intensity values are considered the raw measurements in this
paper. As these recordings are sets of irregularly distributed 3D
points interpolation to a regular grid will ease processing of
LIDAR data. It is well known that interpolation will low pass
the raw data and thus some information will be lost. If the point
density is high and almost regular this disadvantage will not be
very significant. As our interest is in an investigation of
clustering techniques using LIDAR data we assume the
interpolation to be solved. The more interesting question in this
regard is on the input for clustering algorithms. In addition to
the LIDAR data, feature images reflecting texture and surface
geometry are extracted and used for clustering. Clustering is a
technique for image classification related to the unsupervised
classification procedures. The overall goal is to extract
information from the LIDAR data.
*Corresponding author
38
2. CLUSTERING TECHNIQUES
Clustering is a process of assigning pixels to categories or
clusters based on some logic which acts on similarity of the
pixels feature vectors.
Three clustering techniques which will be used in the
experiments are described in the following. The clustering
techniques are K-means (or hard C-means) clustering, fuzzy C-
means clustering and competitive learning networks.
K-means is a representative for a classical and well explored
unsupervised classification algorithm. Its counterpart in the
fuzzy techniques is the fuzzy C-means algorithm which
considers each cluster as a fuzzy set, while a membership
function measures the possibility that each feature vector
belongs to a cluster. Competitive learning networks pick up
concepts of neural processing for unsupervised classification.
The Competitive learning algorithm is based on a type of
artificial neural network that possesses a self-organizing
property called a simple competitive learning network.
2.1 K-means clustering algorithm:
K-means clustering, also known as hard C-means clustering, is
one of the simplest unsupervised classification algorithms. The
procedure follows a simple way to classify the data set through
a certain number of clusters. The algorithm partitions a set of i
vector X; into c classes G;, i=1, … , c, and find a cluster centre
for each class such that an objective function of dissimilarity,
for example a distance measure is minimized. The objective
International
CIE IIS TEE,
function that
is selected as
where
k‚x, EG;
and
X
point x, an
The partition
membership
data point x;!
Hj 7310
The algoritl
following ste
Step 1: Init
typically ach
all of the dat:
Step 2: Det
Equation (2)
Step 3: Com
(1). Stop if
improvemen
threshold.
Step 4: Upd:
The algorith
selected clu
employed m
be found in (
2.2 Fuzzy (
Fuzzy C-m
ISODATA,
point belon
membership
algorithm as
FCM partiti
fuzzy group
an objective
The major c
employs fu:
belong to :
specified by
membership
also the eler
The objecti
Equation (1