Full text: Proceedings; XXI International Congress for Photogrammetry and Remote Sensing (Part B5-2)

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B5. Beijing 2008 
Segmentation of laser scans poses a slightly different problem, 
as the data usually define the geometric characterization of the 
scanned objects. Therefore, the interest is usually in 
primitive extraction, mostly of planar elements, e.g., Dold 
and Brenner (2006) for terrestrial scans and Vosselman and 
Dijkman (2001) for aerial scans. For terrestrial scans, Gorte 
(2007) presented a method for extracting planar faces using a 
panoramic representation of the range data. Segmentation into a 
more general class of well-defined primitives, e.g., planes, 
cylinders, or spheres, is presented in Rabbani (2006). While 
useful for reverse engineering practices, it cannot be easily 
extended to general scenes. 
Since most scenes are cluttered and contain entities of various 
shapes and forms, some structured and others not, 
approaching the segmentation problem by seeking 
consistency along a single cue is likely to provide only partial results. 
Additionally, while some entities may be characterized by 
geometric properties, others are more distinguishable by their 
color content. These observations suggest that segmenting the 
data using multiple cues and integrating data sources has the 
potential to provide richer descriptive information and 
better prospects for subsequent interpretation of the data. We 
present in this paper a segmentation model for terrestrial laser 
scanning data, including range and image data, that uses 
multiple cues. We study how segments are defined when those 
sources are merged, how the sources should be 
integrated in a meaningful way, and ultimately how the added 
value of combining the individual sources can be brought into 
an integrated segmentation. Results of the proposed model show 
that the integrated segmentation can achieve better results than 
the individual segmentations. 
The integration of different information sources requires 
securing their co-alignment and association. The first aspect 
refers to establishing the relative transformation between the 
two sensors. The second suggests that in order to incorporate 
the interpretation of the two data sources, both have to refer to 
the same information unit. Considering the fact that images are 
a 2D projection of 3D space, whereas laser data is three 
dimensional, their mode of integration is not immediate. 
the scanner and the camera frames (the red and the blue 
coordinate systems in the figure respectively) and t the 
translation vector (Hartley and Zisserman, 2003). 
Figure 1. Reference frames of the scanning system with a 
mounted camera. 
The projection matrix defines the image-to-scanner 
transformation and so allows linking the color content to the 3D 
laser points. While this transformation results in a loss of image 
content due to changes in resolution, it allows processing both 
information sources in a single reference frame and is therefore 
2.2 Data Representation 
3D point clouds are difficult to process due to varying scale 
within the data, which leads to an uneven distribution of points 
in 3D space. To alleviate this problem we transform the data 
into a panoramic data representation. As the angular spacing in 
the ranging is fixed (defined by system specifications), 
regularity can be established when the data is transformed into a 
polar representation (Eq. (2)) 
$$(x, y, z)^T = (\rho\cos\theta\cos\varphi,\ \rho\cos\theta\sin\varphi,\ \rho\sin\theta)^T \qquad (2)$$
2.1 Camera Scanner Co-alignment 
The camera mounted on top of the scanner can be linked to the 
scanner body by finding the transformation between the two 
frames shown in Figure 1. Such a relation involves three offset 
parameters and three angular parameters. This relation can also 
be formulated via the projection matrix P, a 3x4 matrix 
that represents the relation between a 3D world point (X) and 
a 2D image point (x) in homogeneous coordinates. Compared to 
the six standard boresighting pose parameters, the added 
parameters (five in all) account for the intrinsic camera 
parameters. The projection matrix can be formulated as follows: 
$$x = K R\,[I \mid -t]\,X = P X \qquad (1)$$
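As an illustration of the relation x = K R [I | -t] X, the following minimal sketch builds a 3x4 projection matrix and projects one scanner-frame point into pixel coordinates. It assumes the standard upper-triangular layout of K used by Hartley and Zisserman (focal lengths on the diagonal, skew and principal-point offsets above it), which the text does not spell out; the function names `projection_matrix` and `project` and all numeric values are hypothetical and are not taken from the paper.

```python
def matmul(a, b):
    """Plain nested-list matrix product (rows of a times columns of b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def projection_matrix(fx, fy, s, x0, y0, R, t):
    """Return P = K R [I | -t] as a 3x4 nested list.

    K layout is an assumption (standard pinhole calibration matrix).
    R is a 3x3 rotation, t a 3-vector, both relating scanner and camera frames.
    """
    K = [[fx, s, x0],
         [0.0, fy, y0],
         [0.0, 0.0, 1.0]]
    I_minus_t = [[1.0, 0.0, 0.0, -t[0]],
                 [0.0, 1.0, 0.0, -t[1]],
                 [0.0, 0.0, 1.0, -t[2]]]
    return matmul(matmul(K, R), I_minus_t)


def project(P, X):
    """Project a 3D point X = (X, Y, Z) to pixel coordinates (u, v).

    Returns None when the homogeneous scale is (numerically) zero.
    """
    Xh = [[X[0]], [X[1]], [X[2]], [1.0]]        # homogeneous column vector
    u, v, w = (row[0] for row in matmul(P, Xh))
    if abs(w) < 1e-12:
        return None
    return (u / w, v / w)


# Illustrative values only: identity rotation, zero offset.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P = projection_matrix(1000.0, 1000.0, 0.0, 320.0, 240.0, identity, [0.0, 0.0, 0.0])
pixel = project(P, (1.0, 2.0, 10.0))
```

With P in hand, the color of a laser point can be looked up at the projected pixel, which is the linking step described in the text. A real system would also need to handle points that fall outside the image or behind the camera.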
In Eq. (2), x, y and z are the Euclidean coordinates of a point, θ and φ are 
the latitudinal and longitudinal coordinates of the firing 
direction respectively, and ρ is the measured range. When 
transformed, the scan forms a panoramic range image in 
which ranges are the "intensity" measures. Figure 2a shows range 
data in the form of an image where the x axis represents the φ 
value, φ ∈ (0, 2π], and the y axis represents the θ value, 
θ ∈ (-π/4, π/4]. The range image offers a compact, lossless 
representation but, more importantly, makes data manipulations 
(e.g., derivative computation and convolution-like operations) 
simpler and easier to perform. Due to the convenience in data 
processing that this representation offers, all input channels are 
transformed into it. 
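The polar mapping of Eq. (2) and the resulting panoramic range image can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the single angular step `d_angle` shared by both axes, the convention that the top row holds the highest latitude, and the use of [0, 2π) for the longitude are all assumptions made for the example.

```python
import math


def polar_to_cartesian(rho, theta, phi):
    """Eq. (2): range rho, latitude theta, longitude phi -> (x, y, z)."""
    return (rho * math.cos(theta) * math.cos(phi),
            rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta))


def cartesian_to_polar(x, y, z):
    """Inverse of Eq. (2). Longitude is wrapped to [0, 2*pi) (assumed convention)."""
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        raise ValueError("the origin has no defined firing direction")
    theta = math.asin(z / rho)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return rho, theta, phi


def range_image(points, d_angle, theta_min, theta_max):
    """Bin 3D points into a panoramic grid whose cell value is the range.

    Rows index latitude (top row = theta_max), columns index longitude.
    Cells that receive no point stay None; later points overwrite earlier ones.
    """
    n_rows = int(round((theta_max - theta_min) / d_angle))
    n_cols = int(round(2.0 * math.pi / d_angle))
    image = [[None] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        if x == 0.0 and y == 0.0 and z == 0.0:
            continue                                # skip degenerate returns
        rho, theta, phi = cartesian_to_polar(x, y, z)
        row = int((theta_max - theta) / d_angle)
        col = int(phi / d_angle) % n_cols
        if theta <= theta_max and 0 <= row < n_rows:
            image[row][col] = rho
    return image


# Illustrative use: one point at 5 m range, 1 degree angular spacing.
step = math.pi / 180.0
pts = [polar_to_cartesian(5.0, 0.1, 0.5)]
img = range_image(pts, step, -math.pi / 4.0, math.pi / 4.0)
```

Because the scanner's angular spacing is fixed, each grid cell corresponds to one firing direction, so neighboring cells are neighboring measurements and image-style operations (differences, convolutions) apply directly.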
2.3 Channel selection 
f_x and f_y are the focal lengths in the x and y directions 
respectively, s is the skew value, x_0 and y_0 are the offsets with 
respect to the two image axes. R is the rotation matrix between 
As noted, different cues can be used to segment the data. These 
should feature attributes that can characterize the different 
elements of interest or supplement the information derived by 
other cues. For the segmentation, three cues are introduced. The 
first is the range content, namely the "intensity" value in the 
range panorama, the second is the surface normals, and the third
