Full text: Papers accepted on the basis of peer-reviewed full manuscripts (Part A)

In: Paparoditis N., Pierrot-Deseilligny M., Mallet C., Tournaire O. (Eds), IAPRS, Vol. XXXVIII, Part 3A - Saint-Mandé, France, September 1-3, 2010
[Figure 4 diagram; layer labels: Image Features, Motion Parameters, Simple Events, Scenario]
Figure 4. Example for the scenario "waiting for another
person", consisting of four hierarchical layers.
3. EXPERIMENTAL RESULTS 
3.1 Test scenario 
For developing and testing the presented new approach, aerial 
image sequences provided by DLR’s 3K multi-head camera 
system are used (Kurz et al., 2007). This system consists of 
three non-metric off-the-shelf cameras, with one camera 
pointing in the nadir direction and two in oblique directions. The
basis for near-real-time mapping is provided by a coupled
real-time GPS/IMU navigation system, which enables accurate
direct georeferencing.
The aerial image sequence used in the experiments was
captured at a soccer match with a few thousand people heading
for the gates of the stadium. The flying height was 1500 m,
resulting in a ground sampling distance of about 20 cm. In spite
of the low resolution, people can be recognized clearly by their
long shadows. The camera system was operated in
continuous mode, which resulted in image sequences with a
length of 40 frames at a sampling rate of 2 Hz. Every image
covers an area of approximately 1000 m x 600 m, with an
in-track overlap of about 90%. For the evaluation, a smaller area
has been selected that is completely visible in 16 consecutive frames.
Figure 6 shows the test area in every third frame of the image
sequence.
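The acquisition figures quoted above imply a nominal sequence duration and an approximate image footprint in pixels. The following is only a back-of-the-envelope sketch of that arithmetic; the variable names are illustrative, and the pixel counts follow from the rounded values in the text rather than from the actual sensor specification.

```python
# Values quoted in the text (all approximate).
n_frames = 40                   # frames per image sequence
frame_rate_hz = 2.0             # sampling rate
footprint_m = (1000.0, 600.0)   # ground coverage of one image
gsd_m = 0.20                    # ground sampling distance

# Nominal duration of one sequence in seconds.
duration_s = n_frames / frame_rate_hz

# Approximate footprint in pixels, implied by coverage and GSD.
footprint_px = tuple(round(side / gsd_m) for side in footprint_m)

print(duration_s, footprint_px)
```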
[Figure 5 plot: completeness and correctness in percent; x-axis: number of jointly analysed, consecutive images]
Figure 5. Comparison of manually tracked persons with the 
results of the algorithm over a sequence of 15 aerial images 
with about 130 persons visible. 
which is not too crowded. Here, 130 persons could be marked
manually on average throughout a sequence of 15 frames. For a
correct interpretation of the evaluation, it is important to know
that the reference data might not be free of errors. Occasionally,
manually tracked persons merged with others so that their
position had to be estimated for some frames. In other
situations, clouds passing by reduced the contrast so much that
the exact position of a person could not be defined.
The evaluation results of the detection and tracking algorithms
are shown in Figure 5. An automatically generated segment is
considered a correct detection if the distance between its
center and the nearest reference position is within a tolerance
radius of 3 pixels, corresponding to 45 cm on the ground. The
same criterion is applied to evaluate the tracking results. In
this case, however, every point of a generated trajectory has to
be close enough to one of the reference trajectories. For the
evaluation of the tracking results, all possible links between two
and up to 15 consecutive frames are compared. Figure 7 visualizes a
result of detection and tracking in comparison to the reference.
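The tolerance-radius criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the definitions of completeness (share of reference positions with a detection within the radius) and correctness (share of detections with a reference position within the radius) are assumptions, as is the requirement that a trajectory follows one single reference trajectory over all of its frames.

```python
import math


def within(p, q, tol):
    """True if points p and q (x, y in pixels) are at most tol pixels apart."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= tol


def detection_scores(detections, references, tol=3.0):
    """Return (completeness, correctness) for one frame.

    Assumed definitions (not spelled out in the text):
      completeness = share of references with a detection within tol
      correctness  = share of detections with a reference within tol
    """
    found = sum(any(within(r, d, tol) for d in detections) for r in references)
    correct = sum(any(within(d, r, tol) for r in references) for d in detections)
    completeness = found / len(references) if references else 0.0
    correctness = correct / len(detections) if detections else 0.0
    return completeness, correctness


def trajectory_is_correct(trajectory, reference_trajectories, tol=3.0):
    """True if every point of the trajectory lies within tol of the same
    reference trajectory in the same frame.

    Trajectories are dicts mapping frame index -> (x, y) in pixels.
    Requiring one single matching reference is an assumption of this sketch.
    """
    return any(
        all(f in ref and within(p, ref[f], tol) for f, p in trajectory.items())
        for ref in reference_trajectories
    )


# Toy data in pixel coordinates, invented for illustration only.
references = [(10, 10), (20, 20), (30, 30)]
detections = [(11, 11), (20, 22), (50, 50)]
scores = detection_scores(detections, references)

ref_traj = {0: (10, 11), 1: (12, 11), 2: (14, 12), 3: (16, 12)}
good_traj = {0: (10, 10), 1: (12, 10), 2: (14, 11)}
bad_traj = {0: (10, 10), 1: (12, 10), 2: (20, 11)}
```

In the toy data, two of the three detections lie within 3 pixels of a reference position and two of the three reference positions are found, while `bad_traj` fails because its last point is about 6 pixels away from the reference. A stricter variant would enforce one-to-one matching between detections and references.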
Averaged over all 15 images, the detection module
achieved a completeness of 61% and a correctness of 66%
(cf. Figure 5, length 1). The correctness of the generated
trajectories increases almost linearly with growing length, while
the completeness drops quickly. Several reasons are
possible: one effect still to be investigated is the influence of the
tolerance radius during evaluation. The center of a detected
segment could be more than 3 pixels away from the manually
marked position of the head of a person. This can happen when
the body of a person merges with its shadow into a uniform dot
due to low contrast, cf. Figure 7 (left). Another effect stems
	        