The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B5. Beijing 2008
is the true color channel arriving from images acquired by the
mounted camera. Notably additional (or different) cues can be
formed and added.
Surface normals are computed by
$$N = V_1 \times V_2 \qquad (3)$$
with $V_1 = [dX_1, dY_1, dZ_1]^T$, $V_2 = [dX_2, dY_2, dZ_2]^T$. Differences
are computed between neighboring pixels in the range
panorama. The amplification of noise in the normal
computation and the variations in scale across the scan affect
the quality of the normal values to different degrees, with
noisier normals expected close to the scanner. We reduce
the noise effect by applying an adaptive Gaussian smoothing of
the data as a function of the range. The physical window size, D,
is set to a fixed value, which is then translated into an adaptive
kernel size as a function of the range and scanner angular
resolution Δ. The window size, d, in image space is given by Eq.
(4), where ρ denotes the range.
$$d(\rho) = \frac{D}{\rho\,\Delta} \qquad (4)$$
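The normal computation and the range-dependent window size can be illustrated with a minimal Python sketch. It assumes the panorama is stored as an H x W x 3 NumPy array of XYZ coordinates, that the normal is the cross product of the horizontal and vertical neighbor differences, and that the window size follows d = D / (ρΔ), with ρ the range and Δ the angular resolution in radians. The function names, the scaling of normals to unit length, and the handling of no-return pixels are assumptions made for illustration and are not taken from the described method.

```python
import numpy as np


def normals_from_panorama(xyz):
    """Estimate per-pixel normals from an (H, W, 3) XYZ panorama.

    Illustrative sketch: differences between neighboring pixels are
    combined with a cross product. Unit-length scaling is an assumption.
    """
    xyz = np.asarray(xyz, dtype=float)
    v1 = np.zeros_like(xyz)
    v2 = np.zeros_like(xyz)
    # Difference to the next pixel along the horizontal image axis.
    v1[:, :-1] = xyz[:, 1:] - xyz[:, :-1]
    # Difference to the next pixel along the vertical image axis.
    v2[:-1, :] = xyz[1:, :] - xyz[:-1, :]
    n = np.cross(v1, v2)
    length = np.linalg.norm(n, axis=-1, keepdims=True)
    # Border pixels (no neighbor) keep a zero vector instead of a normal.
    return np.divide(n, length, out=np.zeros_like(n), where=length > 0)


def window_size_pixels(rng, D, delta):
    """Window size in pixels for a fixed physical window D.

    rng   : range per pixel (same length unit as D); values <= 0 are
            treated as no-return pixels and yield NaN (an assumption).
    delta : angular resolution, assumed to be in radians.
    """
    rng = np.asarray(rng, dtype=float)
    safe = np.where(rng > 0, rng, np.nan)
    return D / (safe * delta)
```

With a fixed physical window, the kernel grows as the range decreases, so the heaviest smoothing is applied near the scanner, where the noisier normals are expected.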
The three individual channels can be seen in Figure 2. Figure 2a
shows the range channel with the blue color indicating no-return
regions that relate both to the sky and to specular points from
which no return arrived. Figure 2b shows the normal directions
(color coded) that are showing monotonicity on the ground and
along the walls while exhibiting variations around trees and
other non-flat or faceted objects. The consistency in the normal
values is a result of the adaptive smoothing process. Figure 2c
shows the projected color points on the range panorama as
achieved via ray tracing. We note that due to some inaccuracies
in the registration and the resolution of the laser data (compared
to the image based one) some tree canopy points receive sky
colors. To eliminate these artifacts from the segmentation, sky
tones are masked and replaced by the closest darker tone. An
alternative approach would segment the individual images in
image space and then assemble them through the forward
projection. In this setup, the assembly (and the handling of sky
segments) would require additional treatment.
2.4 Segmentation
The transformation of the data into a panorama allows the use
of common image segmentation procedures for segmenting the
point-cloud. As a segmentation scheme, we use the Mean-Shift
segmentation (Comaniciu and Meer, 2002), a scheme that was
chosen due to its successful results with complex and cluttered
images. Being a non-parametric model, it requires neither model
parameters nor domain knowledge as inputs. The algorithm is
controlled by only two dominant parameters: the sizes of the
spatial and range dimensions of the kernel. The first affects the
spatial neighborhood while the latter affects the permissible
variability within the neighborhood. Both parameters have a
physical interpretation.
Generally, the mean-shift clustering, on which the segmentation
process is based, is an iterative procedure, where each data point
is "shifted" towards the centroid of its neighboring data points.
The new value of the point is set as the mean, $c_{j+1}$, by
$$c_{j+1} = \frac{\sum_{s \in S(c_j)} w(c_j - s)\, s}{\sum_{s \in S(c_j)} w(c_j - s)} \qquad (5)$$
with $w(\cdot)$ the weight attached to the vector s of the point, and j
the iteration index. Convergence is reached when the
centroid is no longer updated. The segmentation algorithm
itself is based on a derived filtering scheme that begins with
each feature vector considered a cluster center. Using the update
equation, an iterative convergence process into cluster centers is
initialized. The pixel labels are set to the value of convergence.
Then, neighboring regions sharing common values, up to the
parameter defined for the range, are grouped together into a
segment.
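The iterative update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used here: the neighborhood S(c_j) is taken as all points within a radius `bandwidth` of the current center, the weight w is a Gaussian of the offset, and the names `mean_shift_point`, `bandwidth`, `max_iter` and `tol` are hypothetical.

```python
import numpy as np


def mean_shift_point(c, points, bandwidth, max_iter=100, tol=1e-6):
    """Shift a single center c to a local mode of `points`.

    Each iteration replaces c by the weighted mean of its neighbors.
    The Gaussian weight and the radius-based neighborhood are
    assumptions made for this sketch.
    """
    c = np.asarray(c, dtype=float)
    points = np.asarray(points, dtype=float)
    for _ in range(max_iter):
        dist2 = np.sum((points - c) ** 2, axis=1)
        in_window = dist2 <= bandwidth ** 2          # neighborhood S(c_j)
        if not np.any(in_window):
            break                                    # no neighbors: stop
        s = points[in_window]
        w = np.exp(-0.5 * dist2[in_window] / bandwidth ** 2)
        c_next = (w[:, None] * s).sum(axis=0) / w.sum()
        converged = np.linalg.norm(c_next - c) < tol
        c = c_next
        if converged:                                # center stopped moving
            break
    return c


# Two well-separated groups of 1D samples; each start converges to
# the mode of the group it begins in.
samples = np.array([[0.0], [0.2], [-0.2], [10.0], [10.2], [9.8]])
mode_a = mean_shift_point([0.5], samples, bandwidth=1.0)
mode_b = mean_shift_point([9.5], samples, bandwidth=1.0)
```

In a segmentation setting, every pixel's feature vector would be run through such a procedure, and pixels whose converged values agree within the range parameter would then be grouped.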
The application of the mean shift segmentation on the
individual channels is shown in Figure 3. Figure 3a shows the
segmentation based on the range; the results are patchy
in continuous regions where no meaningful
separation can be identified. Nonetheless, elements like the tree
stems or poles clearly stand out as individual segments. Figure
3b shows the results of the normal based segmentation.
Contrary to the range based segmentation, the ground and the
façades appear here as complete segments. Notice however the
patchiness around unstructured elements such as the trees, poles or
the fountain in the front of the scene. Finally, Figure 3c shows
that the color channel managed to capture some of the façades as
complete objects, and vehicles (which are dominant in their
color feature) were extracted. Generally, color exhibits
sensitivity to illumination conditions and shadows, which can
be noticed in the segmentation of the floor, in some of the walls
and the fountain. Notice that poles and traffic signs, which are
expected to be distinct with respect to their surroundings, were
isolated in the color segmentation.
2.5 Integration scheme
When dealing with multi-cue based segmentation as in the
present case, the main challenge is handling the different space
partitioning of the different channels. As an example, the
ground, which ideally would be extracted as a single segment,
will have uniform values in the normals channel while having
large variations in the distance channel (and also uneven
intensity values in the true color channel). Therefore, our aim is
not to perform a segmentation that concatenates all channels into a
single cube and operates on the augmented
feature vector. Such a segmentation would be high-dimensional,
computationally inefficient, and may ultimately lead to
over-segmentation of the data.
Instead, the integration scheme we follow originates from the
realization that the different channels exhibit different
properties of the data. Consequently, they will provide "good"
segments in some parts of the data and "noisy" ones in other
parts. We therefore segment each channel independently (as the
results in Figure 3 show) and then construct a segmentation that
integrates them, by selecting the better segments from each
channel. We note that in this scheme the addition of other
channels can be accommodated without many modifications.