1.2 Outline
The following chapter presents an example implementation of a
multi-camera VIDS. The processing chain is briefly presented in
order to emphasize the need for precise knowledge of the exterior
orientation. Chapter 3 introduces the different approaches
implemented for determining the exterior orientation of the
cameras. Chapter 4 then evaluates these approaches with respect
to their accuracy and usability for the given task. The last
chapter summarises the results of the evaluation and draws
conclusions.
2. APPROACH FOR A MULTI-CAMERA VIDS
2.1 Processing Approach
A multi-camera setup with three cameras has been installed to
observe the traffic intersection Rudower Chaussee/
Wegedornstrasse, Berlin (Germany). The cameras cover
overlapping or adjacent observation areas. Thus, the same road user
can be observed by different cameras from different positions
and angles (Figure 1). The objects of interest are then found in
the image data using image processing methods.
Figure 1. Visualisation of the multi-camera setup
In order to enable the tracking and fusion of detected objects in
the observation area, the image coordinates of these objects are
converted into a common world coordinate system. If the
orientation parameters are of poor quality, the same object
observed from different positions is projected to different world
coordinates and may be misinterpreted as several distinct objects.
To avoid such misidentification of objects derived from different
cameras, a high-precision transformation of their image
coordinates into object space coordinates is required. Therefore,
a very exact calibration (interior orientation) as well as precise
knowledge of the position and viewing direction (exterior
orientation) of the cameras is necessary.
The approach presented here can be separated into three main
steps. First, all moving objects have to be extracted from each
frame of the video sequence. Next, these traffic objects have to
be projected onto a geo-referenced world plane. Finally, the
objects are tracked and associated with trajectories. The results
can be used to derive comprehensive traffic parameters and to
characterize the trajectories of individual traffic participants.
These steps are described in more detail below.
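
The interplay of these steps can be sketched as follows; the callables detect, project and tracker are hypothetical placeholders for the components described in sections 2.2 to 2.4, not code from the actual system:

```python
# Hypothetical skeleton of the processing chain; detect, project and tracker
# are placeholders for the components sketched in sections 2.2 to 2.4.

def process_sequence(frames, detect, project, tracker):
    """detect(frame) -> image positions; project(x, y) -> world position."""
    for frame in frames:
        detections = detect(frame)                             # step 1: extraction
        observations = [project(x, y) for x, y in detections]  # step 2: geo-referencing
        tracker.update(observations)                           # step 3: association
    return tracker
```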
2.2 Video Acquisition and Object Detection
In order to obtain reliable and reproducible results, compact
digital industrial cameras with standard interfaces and protocols
(IEEE 1394) are used.
To extract moving objects from an image sequence, the image
processing library OpenCV was utilized. The algorithm is based
on a background estimator, which adapts to the changing
background and extracts the traffic-relevant objects. The
extracted objects are then grouped using a cluster analysis
combined with additional filters, which prevents objects from
being split by infrastructure at intersections and roads.
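
The paper does not detail the exact OpenCV calls; the following minimal sketch uses the library's current Gaussian-mixture background subtractor as a stand-in for the background estimator described above, with a simple contour-based grouping step. The parameter values and the min_area threshold are assumptions:

```python
import cv2

# Minimal detection sketch; the paper used OpenCV's background estimator, but
# the concrete calls below (current MOG2 API, parameter values, min_area
# threshold) are assumptions, not the original implementation.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

def extract_objects(frame, min_area=100):
    mask = subtractor.apply(frame)  # adapt background model, get foreground mask
    # morphological opening suppresses noise before grouping
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # keep only blobs large enough to be traffic-relevant objects
    return [c for c in contours if cv2.contourArea(c) > min_area]
```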
The corresponding image coordinates as well as additional
features like area, volume, color and compactness can be
computed for each extracted traffic object.
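
A contour-based realisation of this feature computation might look as follows; the compactness and color definitions below are common conventions rather than the paper's exact ones, and the volume feature (which needs more than a single 2D contour) is omitted:

```python
import cv2
import numpy as np

# Illustrative feature computation for one extracted blob; the compactness
# and color definitions are common conventions, not the paper's exact ones.

def object_features(contour, frame):
    area = cv2.contourArea(contour)
    perimeter = cv2.arcLength(contour, True)
    # compactness: 1.0 for a circle, approaching 0 for elongated shapes
    compactness = 4.0 * np.pi * area / perimeter ** 2 if perimeter > 0 else 0.0
    # mean color inside the blob
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    cv2.drawContours(mask, [contour], -1, 255, -1)
    mean_color = cv2.mean(frame, mask=mask)[:3]
    # centroid of the blob serves as its image coordinate
    m = cv2.moments(contour)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
    return {"xy": (cx, cy), "area": area,
            "compactness": compactness, "color": mean_color}
```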
2.3 Coordinate Transformation and Camera Calibration
The employed tracking concept is based on extracted objects,
which are geo-referenced to a world coordinate system. This
concept allows the integration or fusion of additional data
sources as long as their observations can be transferred to the
same coordinate system.
Therefore, a transformation between image and world
coordinates is necessary for a multi-camera system. Using the
collinearity equations (1), the world coordinates X, Y can be
derived from the image coordinates x', y', provided the
Z-component is known:
X = X_0 + (Z - Z_0) \cdot \frac{r_{11}(x' - x_0) + r_{21}(y' - y_0) - r_{31} c}{r_{13}(x' - x_0) + r_{23}(y' - y_0) - r_{33} c}

Y = Y_0 + (Z - Z_0) \cdot \frac{r_{12}(x' - x_0) + r_{22}(y' - y_0) - r_{32} c}{r_{13}(x' - x_0) + r_{23}(y' - y_0) - r_{33} c}        (1)
where   X, Y = world coordinates (to be calculated)
        Z = Z-component in world coordinates (assumed to be known)
        X_0, Y_0, Z_0 = position of the perspective center in world coordinates
        r_{11}, r_{12}, ..., r_{33} = elements of the rotation matrix
        x', y' = uncorrected image coordinates
        x_0, y_0 = coordinates of the principal point
        c = focal length
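
Equation (1) translates directly into code. The following function is an illustrative implementation (not taken from the paper) that intersects the image ray with a horizontal ground plane Z = z_ground:

```python
import numpy as np

# Direct implementation of equation (1). R is the 3x3 rotation matrix of the
# exterior orientation, (X0, Y0, Z0) the perspective center, (x0, y0, c) the
# interior orientation. Variable names mirror the equation; the function is
# illustrative, not the paper's code.

def image_to_world(x_img, y_img, R, X0, Y0, Z0, x0, y0, c, z_ground=0.0):
    dx, dy = x_img - x0, y_img - y0
    denom = R[0, 2] * dx + R[1, 2] * dy - R[2, 2] * c
    kx = (R[0, 0] * dx + R[1, 0] * dy - R[2, 0] * c) / denom
    ky = (R[0, 1] * dx + R[1, 1] * dy - R[2, 1] * c) / denom
    X = X0 + (z_ground - Z0) * kx
    Y = Y0 + (z_ground - Z0) * ky
    return X, Y, z_ground
```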
The Z-component in world coordinates can be deduced by
defining a dedicated ground plane. Further required input
parameters are the interior and exterior orientation of the
camera. The interior orientation (principal point, focal length
and additional distortion parameters) can be determined using a
well-known laboratory test field. The 10-parameter Brown
camera model was used to describe the interior orientation
(Brown, 1971). Its parameters can be determined by a bundle
block adjustment as described in (Remondino and Fraser, 2006).
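
For illustration, the correction of a measured image point under a Brown-style model might be sketched as follows. The parameter set (three radial coefficients k1..k3, two decentring coefficients p1, p2, two affinity terms b1, b2, plus principal point and focal length) follows common photogrammetric convention; the paper does not list its ten parameters explicitly, so this parameterisation is an assumption:

```python
# Sketch of image-coordinate correction under an assumed Brown-style model;
# sign conventions and the exact parameter set vary between implementations.

def correct_image_point(x, y, x0, y0, k1, k2, k3, p1, p2, b1, b2):
    dx, dy = x - x0, y - y0
    r2 = dx * dx + dy * dy
    radial = k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3   # radial distortion term
    # radial + decentring (tangential) + affinity corrections
    x_corr = x + dx * radial + p1 * (r2 + 2 * dx * dx) + 2 * p2 * dx * dy \
               + b1 * dx + b2 * dy
    y_corr = y + dy * radial + p2 * (r2 + 2 * dy * dy) + 2 * p1 * dx * dy
    return x_corr, y_corr
```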
In order to calculate the exterior orientation of a camera, i.e. to
determine its location and orientation in a well-known world
coordinate system, different approaches can be applied. An
important set of these approaches is presented and evaluated in
the following chapters.
2.4 Tracking and Trajectory Creation
The tracking algorithm is supposed to provide object
information, combined in a so-called state vector, as a function
of time. The state of an object can be described by its position,
velocity and acceleration in the X-, Y- and Z-directions. Features
like form, size and color can be added. The first task is the
identification of an object in a video sequence by means of its
predicted state vector.
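
Such a state vector of position, velocity and acceleration per axis corresponds to a constant-acceleration motion model. The excerpt does not name the filter used; a standard Kalman prediction step under this model would look like the following sketch:

```python
import numpy as np

# Constant-acceleration motion model for one tracked object. The 9-element
# state vector is ordered [X, vX, aX, Y, vY, aY, Z, vZ, aZ], matching the
# state description above; the prediction step is the textbook Kalman form,
# not code taken from the paper.

def transition_matrix(dt):
    # per-axis block: position += v*dt + 0.5*a*dt^2, velocity += a*dt
    block = np.array([[1.0, dt, 0.5 * dt * dt],
                      [0.0, 1.0, dt],
                      [0.0, 0.0, 1.0]])
    return np.kron(np.eye(3), block)

def predict(state, covariance, dt, process_noise):
    F = transition_matrix(dt)
    return F @ state, F @ covariance @ F.T + process_noise
```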