Full text: Proceedings (Part B3b-2)

REAL-TIME ORIENTATION OF A PTZ-CAMERA BASED ON PEDESTRIAN 
DETECTION IN VIDEO DATA OF WIDE AND COMPLEX SCENES 
T. Hoedl *, D. Brandt, U. Soergel, M. Wiggenhagen 
IPI, Institute of Photogrammetry and Geoinformation, Leibniz Universitaet Hannover, Germany 
- (hoedl, soergel, wiggenhagen)@ipi.uni-hannover.de 
Intercommission Working Group III/V 
KEYWORDS: Computer Vision, Detection, Close Range Photogrammetry, Absolute Orientation, Urban Planning, Tracking, 
Multisensor, Real-time 
ABSTRACT: 
Object detection and tracking is the basis for many applications in surveillance and activity recognition. Unfortunately, the cameras 
used to observe wide scenes mostly do not provide sufficient detail about the observed objects. We present 
a two-camera-system for pedestrian detection in wide and complex scenes with the opportunity to achieve detailed information 
about the detected individuals. The first sensor is a static video camera with fixed interior and exterior orientation, which observes 
the complete scene. Pedestrian detection and tracking is realized in the video stream of this camera. The second component is a 
single-frame PTZ (pan / tilt / zoom) camera of higher geometric resolution, which enables detailed views of objects of interest 
within the complete scene. For this reason the orientation of the PTZ-camera has to be adjusted to the position of a detected 
pedestrian in real-time in order to capture a high-resolution image of the person. This image is stored along with time and position 
stamps. In post-processing the pedestrian can be interactively classified by a human operator. Because the operator is only 
confronted with high-resolution images of stand-alone persons, this classification is very reliable, economical and user-friendly. 
1. INTRODUCTION 
1.1 Motivation 
The work presented here is embedded in the framework of an 
interdisciplinary research project aiming at the assessment of 
the quality of shop-locations in inner cities. In this context the 
number, the behaviour (e.g., walking speed and staying periods), 
and the kind (e.g., in terms of gender and age) of pedestrians 
passing by are crucial issues. In general there are different 
options for obtaining the desired information. On the one hand, 
sensors carried by the persons themselves can deliver their 
position based on existing infrastructure (mobile phones, GPS, 
RFID, Bluetooth). On the other hand, such information can be 
derived entirely from external observation (cameras). To be 
independent of active cooperation 
of the individuals and to be able to collect information about all 
pedestrians, cameras were chosen as the appropriate sensors for 
this project. The task requires surveillance of a large and 
complex scene and, at the same time, the acquisition of 
high-resolution data of individuals; a single camera can hardly 
fulfil both requirements. Hence, a two-camera set-up is used in 
this approach. 
The first one, the observation camera, is a static video camera 
with fixed interior and exterior orientation. Pedestrian detection 
and tracking must occur in real time in the video stream of this 
camera. The positions of the detected individuals in object 
space are passed to the second camera. This camera is a PTZ 
(pan / tilt / zoom) camera of higher geometric resolution, which 
makes it possible to focus on objects of interest within the 
complete scene. Hence a detailed analysis of the individuals is 
possible. 
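The hand-over from the observation camera to the PTZ camera can be illustrated with a minimal sketch that converts an object-space position into pan and tilt angles. The function name, the axis convention (X east, Y north, Z up, pan measured clockwise from +Y, tilt positive upwards) and the example coordinates are assumptions made for illustration only; they are not taken from the system described here, and a real PTZ head would additionally need its own mounting rotation and angle offsets.

```python
import math


def pan_tilt_to_target(camera_pos, target_pos):
    """Return (pan, tilt) in degrees for pointing from camera_pos at target_pos.

    Assumed convention (hypothetical, not from the paper):
    X east, Y north, Z up; pan is measured from +Y clockwise
    towards +X, tilt is positive above the horizontal plane.
    """
    dx = target_pos[0] - camera_pos[0]
    dy = target_pos[1] - camera_pos[1]
    dz = target_pos[2] - camera_pos[2]

    # Horizontal distance between projection centre and target.
    horizontal = math.hypot(dx, dy)

    pan = math.degrees(math.atan2(dx, dy))
    tilt = math.degrees(math.atan2(dz, horizontal))
    return pan, tilt


# Hypothetical example: camera mounted 10 m above ground,
# pedestrian's head at 1.7 m height, 10 m east and 10 m north.
pan, tilt = pan_tilt_to_target((0.0, 0.0, 10.0), (10.0, 10.0, 1.7))
```

In this example the camera has to look down, so the tilt angle is negative; the zoom setting would be chosen separately from the distance to the target.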
1.2 Related Work 
Due to the broad range of applications (surveillance, activity 
recognition or human-computer interaction), human motion 
analysis in video sequences has become one of the most active 
fields in computer vision in recent years. Recent surveys of the 
numerous publications were published by Moeslund et al. (2006) 
and Yilmaz et al. (2006). 
One focus of research is automatic detection and tracking of 
humans in uncontrolled outdoor environments. Tracking 
articulated objects such as human bodies is much more complex 
than tracking rigid objects such as cars, because the relative 
position of the limbs changes over time. Nevertheless, 
these approaches already show promising results, especially for 
simple scenes populated by only a few individuals. 
The initial step in many approaches is background subtraction. 
For many years background subtraction was used only in 
controlled indoor environments, but with the adaptive Mixture 
of Gaussians (MoG) method by Stauffer & Grimson (1999) it 
also became a standard for outdoor environments. Recent 
advances in background subtraction, which are mostly based on 
the MoG algorithm, aim at reducing false positives and false 
negatives (caused, for example, by shadows) or deal with 
background updating. 
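The idea of an adaptive background model can be sketched as follows. This is deliberately a simplification: it keeps a single adaptive Gaussian per pixel instead of the mixture of several Gaussians used in the MoG method, and it operates on a flat list of grey values. The class name, the learning rate `alpha`, the threshold `k`, and the variance settings are illustrative assumptions, not parameters from the cited work or from the system described here.

```python
class RunningGaussianBackground:
    """Single adaptive Gaussian per pixel (simplified stand-in for MoG)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5,
                 initial_var=100.0, min_var=4.0):
        self.mean = [float(v) for v in first_frame]
        self.var = [initial_var] * len(first_frame)
        self.alpha = alpha      # learning rate of the background update
        self.k = k              # threshold in standard deviations
        self.min_var = min_var  # floor so the model does not become over-sensitive

    def apply(self, frame):
        """Return a foreground mask (list of bools) and update the model."""
        mask = []
        for i, value in enumerate(frame):
            diff = value - self.mean[i]
            # Foreground if the pixel deviates more than k sigma from the mean.
            is_foreground = diff * diff > (self.k ** 2) * self.var[i]
            mask.append(is_foreground)
            if not is_foreground:
                # Only background pixels adapt the model.
                self.mean[i] += self.alpha * diff
                new_var = self.var[i] + self.alpha * (diff * diff - self.var[i])
                self.var[i] = max(self.min_var, new_var)
        return mask


# Hypothetical three-pixel frames: the last pixel changes strongly.
background = RunningGaussianBackground([100, 100, 100])
mask = background.apply([101, 100, 200])
```

A full mixture model keeps several such Gaussians per pixel with individual weights, which lets it represent multi-modal backgrounds such as moving foliage; the single-Gaussian version above cannot do that.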
Moeslund et al. (2006) categorize approaches for object 
detection by segmentation method: motion-, appearance-, 
shape- and depth-based. In complex scenes, any one of these 
methods used alone is successful only up to a point. 
For this reason newer approaches combine several segmentation 
methods, e.g., Viola et al. (2005) combine motion and 
* Corresponding author
	        