Proceedings of the XXI International Congress for Photogrammetry and Remote Sensing (Part B1-1)

EVALUATION OF CAMERA CALIBRATION APPROACHES FOR VIDEO IMAGE 
DETECTION SYSTEMS 
S. Bauer, A. Luber, R. Reulke 
German Aerospace Center, Institute for Transportation Systems, Rutherfordstr. 2, 12489 Berlin, Germany 
(sascha.bauer, andreas.luber, ralf.reulke)@dlr.de 
Commission I, WG 1/1 
KEY WORDS: Calibration, Multisensor, Fusion, Orientation, Camera, Georeferencing 
ABSTRACT: 
In modern traffic management, Video Image Detection Systems (VIDS) are becoming increasingly important as traffic sensors. They are becoming more affordable and, unlike commonly used induction loops, do not require any road construction. Furthermore, since they are able to monitor a wide area, they potentially offer the derivation of a whole new set of traffic parameters. Good examples are the derivation of source-destination relations, queue lengths, travel times or general event detection like untypical movements, accidents, blockages and congestions. Additionally, by using more than one camera, the surveillance area can be enlarged or the detection accuracy can be increased due to the redundancy of observations. However, in order to take advantage of a multiple-camera system, the observations from different cameras have to be fused. In the setup presented here, a geometric fusion is proposed by projecting the observations into a combined geo-referenced coordinate frame. The basic requirement for this transformation is the knowledge of the interior and exterior orientation of every camera. Three different approaches for determining the exterior orientation have been implemented, namely a Newton method, a least-squares adjustment based on ground control points and a method based on line features. Furthermore, direct linear transformation and minimum space resection are applied to calculate initial estimates. These algorithms are subject to an in-depth evaluation with respect to their application as a traffic monitoring sensor.
1. INTRODUCTION 
1.1 Motivation 
The task of modern traffic management is to utilize the limited
resources of the transportation infrastructure as efficiently as possible.
In order to meet the requirements of this challenging task, a 
precise and up-to-date knowledge of the traffic situation is 
needed. Nowadays, a wide variety of commercially available
traffic sensors can be applied as a source of such
information (Klein et al., 1997). Induction loops and microwave 
radar systems are the most commonly used detectors. They 
typically provide traffic parameters like presence, speed and 
length of a vehicle as well as the time gap between vehicles. Since
these parameters offer only a rudimentary description of the 
traffic situation a great deal of research has concentrated on 
new detectors capable of providing more complex traffic 
parameters. 
Video image detection systems (VIDS) constitute an important 
group of traffic detectors (Michalopoulos, 1991; Wigan, 1992; 
Kastrinaki et al., 2003). In contrast to the typical focus on a
single location they are able to monitor a wide area, and hence, 
they potentially provide a whole new range of traffic 
parameters (Datta et al., 2000; Harlow and Wang, 2001; 
Setchell and Dagless, 2001; Yung and Lai, 2001). Such 
parameters can be source-destination relations, queue length, 
travel times or general event detection like untypical 
movements, accidents, blockages and congestions. Furthermore, 
a combination of several cameras is often beneficial to enlarge 
the surveillance area or to increase the detection accuracy due 
to redundancy of observations. However, in order to take 
advantage of a multiple camera system, the observations from 
different cameras have to be fused in a way that allows their 
subsequent comparison. 
Information can be fused on different levels. Combining 
observations on a geometric level is a common approach for 
object detection by multi-camera systems. This is done by
projecting the observations into a combined coordinate frame. 
In general, a geo-referenced frame is preferred (Ernst et al., 
2005). The knowledge of the interior and exterior orientation of 
every camera is an essential requirement for this transformation. 
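As a rough illustration of this projection step (a minimal sketch, not the implementation used in this work), the following Python fragment intersects the viewing ray of a single image observation with a flat ground plane, assuming the interior orientation (principal distance c, principal point x0, y0) and the exterior orientation (rotation matrix R, projection centre X0) are already known; all function and variable names are illustrative.

import numpy as np

def image_to_ground(x, y, c, x0, y0, R, X0, Z_ground=0.0):
    # Viewing ray of the observation in the camera frame (collinearity model).
    d_cam = np.array([x - x0, y - y0, -c])
    # Rotate the ray into the geo-referenced object frame.
    d_obj = R.T @ d_cam
    # Scale factor at which the ray intersects the plane Z = Z_ground.
    s = (Z_ground - X0[2]) / d_obj[2]
    return X0 + s * d_obj

# Example: camera 10 m above the ground, looking straight down (R = identity).
R = np.eye(3)
X0 = np.array([0.0, 0.0, 10.0])
print(image_to_ground(x=0.002, y=0.001, c=0.008, x0=0.0, y0=0.0, R=R, X0=X0))

Lens distortion and a non-planar terrain model are ignored in this sketch; the principle, however, is the same projection of all observations into one geo-referenced frame.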
The interior orientation can be deduced before camera 
installation using a well-known test field. Different strategies can
be applied to compute the unknown parameters of the exterior 
orientation of the final setup. The following approaches have 
been implemented: 
1. Direct Linear Transformation (DLT) 
2. Space Resection 
for determining initial estimates, and
3. Newton Method 
4. Adjustment using Gauss-Markov and point features 
5. Adjustment using Gauss-Markov and line features 
to calculate the position and orientation of the cameras. These 
algorithms differ in their complexity, ease of use and their 
expected accuracy. The objective of this paper is to give an in-
depth comparison with respect to these properties. In particular, the
relationship between the required scene information and the achieved
accuracy is highlighted with regard to the requirements of
modern traffic surveillance.
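As an example of how an initial estimate (approach 1, Direct Linear Transformation) can be obtained from ground control points, the following Python sketch gives the generic textbook formulation; it is not the authors' code, at least six non-coplanar control points are assumed, and the function name is chosen for illustration only.

import numpy as np

def dlt_projection_matrix(object_points, image_points):
    # Each ground control point (X, Y, Z) with image coordinates (x, y)
    # yields two linear equations in the 12 homogeneous DLT unknowns
    # (11 independent parameters); at least 6 points are required.
    A = []
    for (X, Y, Z), (x, y) in zip(object_points, image_points):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -x * X, -x * Y, -x * Z, -x])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -y * X, -y * Y, -y * Z, -y])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    P = Vt[-1].reshape(3, 4)
    return P / P[2, 3]  # fix the free scale (11 independent parameters)

# With P = [M | p4], the projection centre X0 = -inv(M) @ p4 and the rotation
# contained in M can then seed the iterative refinements (Newton method,
# Gauss-Markov adjustments with point or line features).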