In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3A/V4 — Paris, France, 3-4 September, 2009
Here, the membership degree h for every instance x, to
cluster j, is produced based on the Euclidean distance d:
d(x n /jj)
where
fj. e <g d = cluster center.
At each iteration, the boost-clustering algorithm clusters data
points that were hard to cluster in previous iterations. An
important issue to be addressed here and that is the cluster
correspondence problem between the clustering results of
different iterations (Frossyniotis et al., 2004).
2.2 Feature Extraction
The first step in every clustering process is to extract the feature
image bands. These features must contain useful information to
discriminate between different regions of the surface. In our
experiment we have used two types of features:
- The filtered first pulse range image using gradient
- Opening filtered last pulse range image
By our experiments, these two features have enough
infonnation to extract our objects of interest.
The normalized difference of the first and last pulse range
images (NDDI) is usually used as the major feature band for
discrimination of the vegetation pixels from the others.
However, building boundaries also show a large value in this
image feature. It is because when the laser beam hits the
exposed surface it will have a footprint with a size in the range
of 15-30 cm or more. So, if the laser beam hits the edge of a
building, then part of the beam footprint will be reflected from
the top roof of the building and the other part might reach the
ground (Alharthy and Bethel, 2002). The high gradient response
on building edges was utilized to filter out the NDDI image
using equation 6.
NDDI =
FPR-LPR
FPR + LPR
(6)
if gradient > threshold, then (FPR-LPR) = 0.0
where
FPR = first-pulse range image data
LPR = last-pulse range image data
The gradient of an image is calculated using equation 7:
G(image) = ^G x (image) 2 + G y (image) 2
where
G x = gradient operators in x direction.
G y = gradient operators in y direction.
(7)
The morphology Opening operator is utilized to filter elevation
space. This operator with a flat structuring element eliminates
the trend surface of the terrain. The main problem of using this
filter is to define the proper size of the structuring element
which should be big enough to cover all 3D objects which can
be found on the terrain surface. The Opening operation is
defined by:
AoB = (AQB)®B (8)
where
A ® B - jx: | {¿ x n^)c= a] (9)
is the morphological Dilation of set A with structure element B.
And
AQB = {x | B x <z A) (10)
is the morphological Erosion of set A with structure element B
(Gonzalez and Woods, 2006).
2.3 Quality Analysis
Comparative studies on clustering algorithms are difficult due
to lack of universally agreed upon quantitative performance
evaluation measures (Jain et al., 1999). Many similar works in
the clustering area use the classification error as the final
quality measurement; so in this research, we adopt a similar
approach.
Here, we use error matrix as main evaluation method of
interpretation result. Each column of this matrix indicates the
instances in a predicted class. Each row represents the instances
in an actual class. All the diagonal variants refer to the correct
interpreted numbers of different classes found in reality. Some
measures can be derived from the error matrix, such as producer
accuracy, user accuracy and overall accuracy (Liu et al, 2007).
Producer Accuracy (PA) is the probability that a sampled unit
in the image is in that particular class. User Accuracy (UA) is
the probability that a certain reference class has also been
labelled that class. Producer accuracy and user accuracy
measures of each class indicate the interpretability of each
feature class. We can see the producer accuracy and user
accuracy of all the classes in the measures of “producer overall
accuracy” and “user overall accuracy”.
PA,
f N i i}
( N i i Ì
—Li- * 100% .
, UA,= —
1 N .i 1
KJ
: ! 00%
(ID
where
N. = (i,j)th entry in confusion matrix
N j = the sum of all columns for row i
N is the sum of all rows for column i.
“Overall accuracy” considers all the producer accuracy and user
accuracy of all the feature classes. Overall accuracy yields one
number of the whole error matrix. It‘s the sum of correctly
classified samples divided by the total sample number from user
set and reference set (Liu et al, 2007).
k
± N u
OA = -7——— ^ * 100% ( 12 )
k /=i /=1
Another factor can be also extracted from confusion matrix to
evaluate the quality of classification algorithms, which is K-