tions) are observed and evaluated. A common aim is to describe the observed data and to detect atypical or threatening events.
Other areas of situation analysis besides driver assistance (Reichardt 1995) may include traffic situation representation, surveillance applications (Beynon et al. 2003), sports video analysis, or even customer tracking for marketing analysis (Leykin et al. 2005).
(Kumar et al. 2005) developed a rule-based framework for behavior and activity detection in traffic videos obtained from stationary video cameras. For behavior recognition, interactions between two or more mobile targets as well as between targets and stationary objects in the environment were considered. The approach is based on sets of predefined behavior scenarios, which are analyzed in different contexts.
(Yung et al. 2001) demonstrate a novel method for automatic
red light runner detection. It extracts the state of the traffic
lights and vehicle motions from video recordings.
1.3 Image and Trajectory Processing
The deployed cameras cover overlapping or adjacent observation areas. Thus, the same road user can be observed by different cameras from different positions and viewing angles. The traffic objects in the image data are detected using image processing methods.
The image coordinates of these objects are transformed into a common world coordinate system in order to enable the tracking and fusion of the detected objects within the respective observation area. High precision in the transformation from image space to object space is required to avoid treating the same object, observed from different camera positions, as several distinct objects. Therefore, an exact calibration (interior orientation) as well as knowledge of the position and viewing direction (exterior orientation) of each camera is necessary.
Since the camera positions are given in absolute geographical
coordinates, the detected objects are also provided in world
coordinates.
The approach is subdivided into the following steps. First, all moving objects are extracted from each frame of the video sequences. Second, these traffic objects are projected onto a geo-referenced world plane. Finally, the projected objects are tracked and associated to trajectories. The derived information can then be used to assess comprehensive traffic parameters and to characterize the trajectories of individual traffic participants.
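This chain can be illustrated with a minimal Python sketch. Here, detect_objects, camera_model and tracker are hypothetical placeholders for the components described in Section 2; they are not interfaces defined in this paper.

import cv2

def process_sequence(video_path, detect_objects, camera_model, tracker):
    # Sketch of the three processing steps for one camera.
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Step 1: extract moving objects from the current frame.
        detections = detect_objects(frame)
        # Step 2: project image coordinates onto the geo-referenced plane.
        world_points = [camera_model.image_to_ground(u, v)
                        for (u, v) in detections]
        # Step 3: associate the projected detections over time.
        tracker.update(world_points)
    cap.release()
    return tracker.trajectories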
1.4 Scenario
The scenario was tested at the intersection Rudower Chaussee / Wegedornstrasse, Berlin (Germany), by camera observation using three cameras mounted on a corner building at a height of approximately 18 meters. The observed area has an extent of about 100 × 100 m and contains a T-junction. Figure 1 shows example trajectories derived from images taken from the three different positions. The background image is an orthophoto derived from airborne images.
Figure 1. Orthophoto with example trajectories
The aim is to describe the trajectories by functions with a limited number of parameters. From such parameters, source-destination matrices for this intersection can be determined without any further effort. A classification approach is used for this purpose.
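One illustrative parameterization (an assumption for illustration, not the method of this paper) is a low-order polynomial fit of the world coordinates over time, which reduces each trajectory to a handful of coefficients usable as classification features:

import numpy as np

def fit_trajectory(t, x, y, degree=3):
    # Describe a trajectory by 2 * (degree + 1) polynomial coefficients.
    cx = np.polyfit(t, x, degree)   # coefficients of x(t)
    cy = np.polyfit(t, y, degree)   # coefficients of y(t)
    return cx, cy

# Placeholder data: a road user moving through the observed area.
t = np.linspace(0.0, 10.0, 50)     # timestamps in seconds
x = 2.0 * t + 0.05 * t**2          # world coordinates in meters
y = 30.0 - 1.5 * t + 0.10 * t**2
cx, cy = fit_trajectory(t, x, y)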
2. PROCESSING APPROACH
2.1 Video Acquisition and Object Detection
In order to obtain reliable and reproducible results, only compact digital industrial cameras with standard interfaces and protocols (e.g. IEEE 1394, Ethernet) are deployed. Different image processing libraries and programs (e.g. OpenCV or HALCON) are available for extracting moving objects from an image sequence. We used a dedicated background estimation algorithm, which adapts to the variable background and extracts the desired objects. The corresponding image coordinates as well as additional parameters such as size and area were computed for each extracted traffic object.
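As a stand-in for the dedicated algorithm, whose details are not given here, the following sketch uses OpenCV's adaptive MOG2 background estimator; the file name "traffic.avi", the history length and the area threshold are placeholder values.

import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                detectShadows=True)
cap = cv2.VideoCapture("traffic.avi")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)          # adapts to the variable background
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        area = cv2.contourArea(c)           # additional parameter: area
        if area < 100:                      # suppress noise blobs
            continue
        x, y, w, h = cv2.boundingRect(c)    # additional parameter: size
        u, v = x + w / 2.0, y + h / 2.0     # image coordinates of the object
cap.release()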
2.2 Sensor Orientation
The existing tracking concept is based on extracted objects that are geo-referenced to a world coordinate system. This concept allows the integration or fusion of additional data sources. The transformation between image and world coordinates is based on the collinearity equations. The Z-component in world coordinates is derived by assigning a dedicated ground plane; alternatively, a height profile can be used.
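A minimal sketch of this projection, assuming an ideal (distortion-free) camera: the image ray is rotated into the world frame and intersected with the ground plane Z = z_ground. The rotation convention (R maps world to camera coordinates) is an assumption here; photogrammetric implementations differ in this respect.

import numpy as np

def image_to_ground(u, v, c, u0, v0, R, X0, z_ground=0.0):
    # Ray direction in camera coordinates (image plane at -c).
    d_cam = np.array([u - u0, v - v0, -c])
    # Rotate the ray into the world frame (R: world -> camera).
    d_world = R.T @ d_cam
    # Intersect the ray X0 + s * d_world with the plane Z = z_ground.
    s = (z_ground - X0[2]) / d_world[2]
    return X0 + s * d_world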
Additionally required input parameters are the interior and exterior orientation of the camera. For the interior orientation (principal point, focal length and additional camera distortion) of the cameras, the 10-parameter Brown distortion model (Brown 1971) was used. Its parameters are determined by a bundle block adjustment.
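The radial and decentering terms of the Brown model can be sketched as follows; this reduced five-coefficient version and its sign convention are assumptions for illustration, not the full 10-parameter set used in this work.

def brown_correct(u, v, u0, v0, k1, k2, k3, p1, p2):
    # Offsets from the principal point.
    du, dv = u - u0, v - v0
    r2 = du**2 + dv**2
    radial = k1 * r2 + k2 * r2**2 + k3 * r2**3
    # Radial plus decentering (tangential) distortion terms.
    cu = du * radial + p1 * (r2 + 2 * du**2) + 2 * p2 * du * dv
    cv = dv * radial + p2 * (r2 + 2 * dv**2) + 2 * p1 * du * dv
    return u + cu, v + cv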
Calculating the exterior orientation of a camera (location of the projection centre and viewing direction) in a well-known world coordinate system is based on ground control points (GCPs) previously measured by GPS. The accuracy of these points is better than 5 cm in position and height. The orientation is derived from these coordinates using the DLT and a spatial resection algorithm (Luhmann 2006).
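OpenCV's solvePnP performs a comparable spatial resection and may serve as an illustrative stand-in; the GCP coordinates, image measurements and calibration matrix below are placeholders, not values from this work.

import numpy as np
import cv2

object_pts = np.array([[ 0.0,  0.0, 0.0],     # GCPs in world coordinates (m)
                       [50.0,  0.0, 0.0],
                       [50.0, 40.0, 0.0],
                       [ 0.0, 40.0, 0.0],
                       [25.0, 20.0, 1.5]])
image_pts = np.array([[ 310.0, 820.0],        # measured image positions (px)
                      [1480.0, 790.0],
                      [1510.0, 260.0],
                      [ 290.0, 240.0],
                      [ 880.0, 520.0]])
K = np.array([[1400.0,    0.0, 960.0],        # interior orientation (assumed)
              [   0.0, 1400.0, 540.0],
              [   0.0,    0.0,   1.0]])
dist = np.zeros(5)                            # distortion already corrected

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
R, _ = cv2.Rodrigues(rvec)
X0 = -R.T @ tvec                              # projection centre in world frame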