Full text: Papers accepted on the basis of peer-reviewed full manuscripts (Part A)

In: Paparoditis N., Pierrot-Deseilligny M., Mallet C., Tournaire O. (Eds), IAPRS, Vol. XXXVIII, Part 3A - Saint-Mandé, France, September 1-3, 2010 
PEOPLE TRACKING AND TRAJECTORY INTERPRETATION 
IN AERIAL IMAGE SEQUENCES 
F. Burkert a, *, F. Schmidt b, M. Butenuth a, S. Hinz b 
a Technische Universität München, Remote Sensing Technology, 80333 München, Germany 
(florian.burkert, matthias.butenuth)@bv.tum.de 
b Karlsruher Institut für Technologie, Institut für Photogrammetrie und Fernerkundung, 76131 Karlsruhe, Germany 
(florian.schmidt, stefan.hinz)@kit.edu 
Commission III, WG III/5 
KEY WORDS: People tracking, people trajectories, event detection, aerial image sequences 
ABSTRACT: 
Monitoring the behavior of people in complex environments has gained much attention over the past years. Most of the current 
approaches rely on video cameras mounted on buildings or pylons and people are detected and tracked in these video streams. The 
presented approach is intended to complement this work. The monitoring of people is based on aerial image sequences acquired with 
camera systems mounted on aircraft, helicopters or airships. This imagery is characterized by a very large coverage, providing the 
opportunity to analyze the distribution of people over a large field of view. The approach shows first results on automatic detection 
and tracking of people from image sequences. In addition, the derived trajectories of the people are automatically interpreted to 
reason about the behavior and to detect exceptional events. 
1. INTRODUCTION 
Monitoring the behavior of people in crowded scenes and in 
complex environments has gained much attention over the past 
years. The increasing number of big events such as concerts, 
festivals, sport events and religious meetings like the pope’s visit 
leads to a growing interest in monitoring crowded areas. In this 
paper, a new approach for detecting and tracking people from 
aerial image sequences is presented. In addition to delineating 
motion trajectories, the behavior of the people is interpreted to 
detect exceptional events such as panic situations or brawls. 
A typical feature of current approaches is the utilization of 
video cameras mounted on buildings to detect and track people 
in video streams. Pioneering work on tracking human 
individuals in terrestrial image sequences can be found, e.g., in 
(Rohr, 1994; Moeslund & Granum, 2001). While this work 
focuses on motion capture of an isolated human, first attempts 
to analyze more crowded scenes are described in (Rosales & 
Sclaroff, 1999; McKenna et al., 2000). Such relatively early 
tracking systems have been extended by approaches integrating 
the interaction of 3D geometry, 3D trajectories or even 
intentional behavior between individuals (Zhao & Nevatia, 
2004; Yu & Wu, 2004; Nillius et al., 2006; Zhao et al., 2008). 
Advanced approaches, based on so-called sensor networks, are 
able to hand over tracked objects to adjacent cameras when 
they leave the current field of view, achieving a quite 
comprehensive analysis of the monitored scene. The work of 
(Kang et al., 2003) exemplifies this kind of approach. Instead 
of networks of cameras, moving platforms like unmanned 
airborne vehicles (UAVs) can also be utilized, as presented, e.g., 
in (Davis et al., 2000). An overview of the research on crowd 
modeling and analysis, including all stages of visual 
surveillance, is given in (Hu et al., 2004; Zhan et al., 2008). 
An important aspect of tracking a large number of people, as 
shown, e.g., in (Rodriguez et al., 2009), is the potential not 
only to analyze individual trajectories but also to learn typical 
interactions between trajectories (Scovanner & Tappen, 2009). 
Hence, event detection has been an intensely investigated field 
of research in the last decade. A framework using two modular 
blocks to detect and analyze events in airborne video streams is 
presented in the work of (Medioni et al., 2001). The first 
module detects and tracks moving objects in a video stream, 
whereas the second module employs the derived trajectories to 
recognize predefined scenarios. A further event recognition 
system is based on two consecutive modules, namely a tracking 
and an event analysis step, in which complex events are 
recognized using Bayesian and logical methods (Hongeng et al., 
2004). Video streams from close range surveillance cameras are 
used to detect events focusing on interactions between few 
persons. Further methods exemplify the emphasis on research in 
surveillance issues, such as the scanning of video streams for unusual 
events (Breitenstein et al., 2009; Mehran et al., 2009). 
Additional related work in the field of people tracking and 
event detection is based on seminal research in crowd analysis 
and simulation (Helbing and Molnar, 1995; Helbing et al., 
2002). Observed collective phenomena in moving crowds, like 
lane formations in corridors, have successfully been simulated 
using a social force model (SFM). The SFM considers 
interactions among pedestrians and between pedestrians and 
obstacles, resulting in a certain moving direction for each 
individual. 
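
The interplay of forces described above can be illustrated with a minimal sketch. It is not the formulation of the cited work: it assumes a simplified, circular repulsion between pedestrians, omits obstacle forces entirely, and uses a plain Euler update. The function name `social_force_step` and all parameter values (`desired_speed`, `tau`, `strength`, `reach`) are hypothetical choices made for illustration only.

```python
import math


def social_force_step(positions, velocities, goals, dt=0.1,
                      desired_speed=1.3, tau=0.5, strength=2.0, reach=0.3):
    """Advance all pedestrians by one time step of a simplified social force model.

    positions, velocities, goals: lists of (x, y) tuples, one per pedestrian.
    All default parameter values are illustrative assumptions.
    """
    new_positions, new_velocities = [], []
    for i, ((px, py), (vx, vy), (gx, gy)) in enumerate(
            zip(positions, velocities, goals)):
        # Driving force: relax the current velocity towards the desired
        # velocity, which points from the pedestrian to its goal.
        dx, dy = gx - px, gy - py
        dist_goal = math.hypot(dx, dy) or 1.0
        fx = (desired_speed * dx / dist_goal - vx) / tau
        fy = (desired_speed * dy / dist_goal - vy) / tau

        # Repulsive forces from all other pedestrians, decaying
        # exponentially with distance (simplified circular form).
        for j, (qx, qy) in enumerate(positions):
            if j == i:
                continue
            rx, ry = px - qx, py - qy
            d = math.hypot(rx, ry) or 1e-9
            magnitude = strength * math.exp(-d / reach)
            fx += magnitude * rx / d
            fy += magnitude * ry / d

        # Explicit Euler integration of velocity and position.
        vx, vy = vx + fx * dt, vy + fy * dt
        new_velocities.append((vx, vy))
        new_positions.append((px + vx * dt, py + vy * dt))
    return new_positions, new_velocities
```

Calling the function repeatedly yields one simulated trajectory per pedestrian. In this sketch, the sum of the driving and repulsive terms is what determines the moving direction of each individual at every step.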
The approach presented in this paper aims to complement 
the above work. The monitoring of people is based on aerial 
camera systems mounted on aircraft, UAVs, helicopters or 
airships. The provided image sequences cover a large field of 
view, allowing for the analysis of density, distribution and 
motion behavior of people. Yet, as the frame rate of such image 
sequences is usually much lower than that of video streams 
(only a few Hz), more sophisticated tracking approaches need to 
be employed. Moreover, the interpretation of scenarios in such 
large-scale image sequences needs to handle a far larger 
number of moving objects than existing event detection 
systems. Thus, the intention of the approach is to define a 
broader spectrum of identifiable scenarios instead of simply 
raising an alert for a general abnormal event within a monitored area.
	        