International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
A STEP TOWARDS DYNAMIC SCENE ANALYSIS WITH ACTIVE MULTI-VIEW RANGE
IMAGING SYSTEMS
Martin Weinmann and Boris Jutzi
Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT)
Kaiserstr. 12, 76128 Karlsruhe, Germany
{martin.weinmann, boris.jutzi}@kit.edu
Commission III, WG III/5
KEY WORDS: LIDAR, Multisensor, Point Cloud, Imagery, Automation, Close Range, Dynamic
ABSTRACT:
Obtaining an appropriate 3D description of the local environment remains a challenging task in photogrammetric research. As terrestrial
laser scanners (TLSs) perform a highly accurate, but time-dependent spatial scanning of the local environment, they are only suited for
capturing static scenes. In contrast, new types of active sensors can simultaneously capture range and intensity
information as images in a single measurement, and their high frame rate also allows for capturing dynamic scenes. However, due to
the limited field of view, one observation is not sufficient to obtain a full scene coverage and therefore, typically, multiple observations
are collected from different locations. This can be achieved by either placing several fixed sensors at different known locations or by
using a moving sensor. In the latter case, the relation between different observations has to be estimated from information extracted
from the captured data, and a limited field of view may then lead to problems if it contains too many moving objects. Hence,
a moving sensor platform with multiple and coupled sensor devices offers the advantages of an extended field of view which results
in a stabilized pose estimation, an improved registration of the recorded point clouds and an improved reconstruction of the scene.
In this paper, a new experimental setup for investigating the potential of such multi-view range imaging systems is presented, which
consists of a moving cable car equipped with two synchronized range imaging devices. The presented setup allows for monitoring
at low altitudes and is suitable for capturing dynamic observations which might arise from moving cars or pedestrians.
Relying on both 3D geometry and 2D imagery, a reliable and fully automatic approach for co-registration of captured point cloud data
is presented which is essential for a high quality of all subsequent tasks. The approach involves using sparse point clouds as well as a
new measure derived from the respective point quality. Additionally, an extension of this approach is presented for detecting special
objects and, finally, decoupling sensor and object motion in order to improve the registration process. The results indicate that the
proposed setup offers new possibilities for applications such as surveillance, scene reconstruction or scene interpretation.
1 INTRODUCTION
An appropriate 3D description of the local environment can be
represented as point clouds consisting of a large number of
measured 3D points and, optionally, different attributes for each
point. Such point clouds can directly be acquired with different
scanning devices such as terrestrial laser scanners (TLSs), time-
of-flight (ToF) cameras or devices based on the use of structured
light. However, a single scan often is not sufficient and hence,
multiple scans have to be acquired from different locations in or-
der to get a full scene coverage. As each captured point cloud
represents 3D information about the local area only with respect
to a local coordinate frame, a basic task for many applications
is point cloud registration. This process estimates the
transformation parameters between different point clouds and
transforms all point clouds into a common coordinate frame.
Existing techniques for point cloud registration rely
on
* 3D geometry,
* 3D geometry and the respective 2D representation as a range
image, and
* 3D geometry and the corresponding 2D representation of
intensity values.
Standard approaches involving only the spatial 3D information
for calculating the transformation parameters between two par-
tially overlapping point clouds are based on the Iterative Closest
Point (ICP) algorithm (Besl and McKay, 1992) and its variants
(Rusinkiewicz and Levoy, 2001). However, iteratively minimizing
the difference between two point clouds requires a high
computational effort for large numbers of points. Hence, other
registration approaches are based on information extracted from the
point clouds. This information may for instance be derived from
the distribution of the points within each point cloud by using the
normal distributions transform (NDT) either on 2D scan slices
(Brenner et al., 2008) or in 3D (Magnusson et al., 2007). If the
presence of regular surfaces can be assumed in the local environ-
ment, various types of geometric features are likely to occur, e.g.
planes, spheres and cylinders. These features can directly be ex-
tracted from the point clouds and strongly support the registration
process (Brenner et al., 2008; Pathak et al., 2010; Rabbani et al.,
2007). In cluttered scenes, descriptors representing local surface
patches are more appropriate. Such descriptors may be derived
from geometric curvature or normal vectors of the local surface
(Bae and Lichti, 2008).
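The iterative scheme behind ICP, mentioned above, can be illustrated with a minimal sketch. It is not the implementation used in the cited works: it is reduced to 2D for brevity, uses brute-force nearest-neighbour search, and all function names (nearest, best_rigid_2d, apply_rigid, icp_2d) are hypothetical. Each iteration pairs every source point with its closest target point and then applies the rigid motion that minimizes the summed squared distances of these pairs.

```python
import math

def nearest(p, cloud):
    # Brute-force closest point; this search dominates the cost for large clouds.
    return min(cloud, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def best_rigid_2d(src, dst):
    # Closed-form least-squares rotation and translation for paired 2D points.
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - csx, py - csy
        bx, by = qx - cdx, qy - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def apply_rigid(points, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(source, target, iterations=20):
    # Assumption: a fixed iteration count instead of a convergence test.
    current = list(source)
    for _ in range(iterations):
        matched = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matched)
        current = apply_rigid(current, theta, tx, ty)
    return current
```

Because every iteration searches all target points for every source point, the cost per iteration grows with the product of the two cloud sizes, which illustrates why the effort becomes high for large numbers of points.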
As the scans are acquired on a regular grid resulting from a cylin-
drical or spherical projection, the spatial 3D information can also
be represented as a range image. This range image provides additional
features such as distinctive feature points which strongly
support the registration process (Barnea and Filin, 2008; Steder
et al., 2010).
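As a rough illustration of how a point cloud maps to a range image under a spherical projection, the following sketch bins each 3D point by azimuth and elevation and stores its range. The function name to_range_image, the grid size and the vertical field of view are assumptions made for this example and are not taken from any specific scanner or from the cited approaches.

```python
import math

def to_range_image(points, rows, cols,
                   el_min=-math.pi / 4, el_max=math.pi / 4):
    # Pixels without a measurement stay None.
    image = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no defined direction
        azimuth = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        elevation = math.asin(z / r)  # vertical angle in [-pi/2, pi/2]
        if not (el_min <= elevation <= el_max):
            continue  # outside the assumed vertical field of view
        col = min(int((azimuth + math.pi) / (2 * math.pi) * cols), cols - 1)
        row = min(int((el_max - elevation) / (el_max - el_min) * rows), rows - 1)
        # Keep the closest return if several points fall into one pixel.
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image
```

Once the data is in this 2D form, standard image operators can be applied to the grid, which is what makes the extraction of distinctive feature points possible.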
Currently, most of the scanning devices can not only capture
3D information but also either co-registered camera images or
panoramic reflectance images representing the respective energy
of the backscattered laser light. The additional information typ-