Full text: Proceedings; XXI International Congress for Photogrammetry and Remote Sensing (Part B5-2)

RANGE IMAGE SEQUENCE ANALYSIS BY 2.5-D LEAST SQUARES TRACKING WITH 
VARIANCE COMPONENT ESTIMATION AND ROBUST VARIANCE COVARIANCE 
MATRIX ESTIMATION 
Patrick Westfeld a,* and René Hempel b
Technische Universität Dresden, D-01062 Dresden, Germany
a Institute of Photogrammetry and Remote Sensing (IPF), patrick.westfeld@tu-dresden.de, http://www.tu-dresden.de/ipf/photo
b Faculty of Education, rene.hempel@tu-dresden.de, http://tu-dresden.de/die_tu_dresden/fakultaeten/erzw/erzwiae/ewwm 
KEY WORDS: Range Imaging, Least Squares Tracking, Variance Component Estimation, Robust Variance Covariance Matrix 
ABSTRACT: 
In this article, a range image sequence tracking approach is proposed, which combines 3-D camera intensity and range observations 
in an integrated geometric transformation model. Based on 2-D least squares matching, a closed solution for intensity and range 
observations has been developed. By combining complementary information, an increase in accuracy and reliability can be achieved. 
The weighting of the two different types of observations with a-priori unknown quality is performed by variance component estimation. 
To fulfill the requirements of robust variance covariance matrix estimation in a statistical context, alternative approaches for variance 
covariance matrix calculation are proposed and evaluated. To verify its applicability, reliability and accuracy potential, the introduced 
2.5-D least squares tracking technique has been evaluated by several series of experiments in the field of human motion and interaction 
measurement. 
1 INTRODUCTION AND MOTIVATION 
Conventional stereo-photogrammetric procedures generate, depending on the sensors used, object space maps with high spatio-temporal resolution. The main drawbacks are the recording configuration of at least two cameras, synchronized and oriented to each other, and the data processing, which is highly complex due to spatial and temporal feature matching.
Range imaging (RIM) cameras (3-D cameras) based on photonic mixer devices (PMD; Schwarte, 1997) or comparable principles offer an interesting monocular alternative for photogrammetric 3-D data acquisition. The use of modulation techniques and combined CCD/CMOS technology provides simultaneous gray value and distance measurements of the scene in each pixel of the sensor. With frame rates of up to 50 Hz, 3-D cameras are well suited for motion capture in fields such as human or robot (inter-)action analysis.
Several approaches to (semi-)automatic RIM sequence analysis have been shown: Göktürk and Tomasi (2004) introduced a RIM head-tracking algorithm. In a training stage, a depth signature (representative signature for head location) is calculated by identifying the subjects' heads in each frame interactively. In a tracking stage, the depth signature of each frame is compared against the training signatures. The best match is identified by a correlation metric and represents the location of the object of interest. Kahlmann and Ingensand (2006) described the usability of the RIM camera SwissRanger SR-3000 for surveillance systems. Moving persons within an indoor scene could be detected by RIM thresholding and pixel clustering. Gesture recognition based on motion detection by double difference range images and 3-D shape matching with 3-D shape contexts has been presented by Holte and Moeslund (2007). Breuer et al. (2007) recognized hand movements (location and orientation) by principal component analysis (PCA) applied to RIM data. In the further course of analysis, they fitted an articulated model to reconstruct the hands. The centroid of a cluster represents the person's position for the corresponding frame. The implementation of the CONDENSATION algorithm (conditional density propagation), first introduced by Isard and Blake (1998) and extended for tracking multiple objects in RIM sequences by Koller-Meier (2000), into a RIM tracking process is described in Kahlmann et al. (2007).
The RIM tracking approaches reviewed above are based on basic image analysis functions (e.g. thresholding, segmentation, computation of the point cloud centroid) or extended matching procedures using motion and measurement models (e.g. CONDENSATION algorithm, Kalman filtering) applied to the range data. In this article, a RIM sequence tracking approach (2.5-D least squares tracking; LST) is proposed, which combines RIM intensity and range observations in an integrated geometric transformation model. Based on 2-D least squares matching (LSM), a closed solution for intensity and range observations has been developed. In contrast to motion model techniques, intensity observations are also included in the least squares (LS) adjustment. By adding complementary information, an increase in accuracy and reliability can be expected.
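
The idea of feeding intensity and range observations into one common adjustment can be illustrated with a minimal Python/NumPy sketch. It is not the transformation model of this paper: it assumes a pure 2-D translation of the patch, bilinear resampling, numerical gradients and fixed a-priori standard deviations (sigma_int, sigma_rng) in place of variance component estimation. The function names bilinear and track_shift are hypothetical.

```python
import numpy as np

def bilinear(img, x, y):
    """Sample img at float pixel positions (x = column, y = row)."""
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x0 + 1]
            + (1 - fx) * fy * img[y0 + 1, x0] + fx * fy * img[y0 + 1, x0 + 1])

def track_shift(int1, rng1, int2, rng2, center, half=7,
                sigma_int=1.0, sigma_rng=1.0, iters=20):
    """Estimate one patch shift (dx, dy) from intensity and range jointly.

    Assumed model (illustration only): template(x, y) = search(x + dx, y + dy)
    for both channels, solved by Gauss-Newton iterations.
    """
    cy, cx = center
    ys, xs = np.mgrid[cy - half:cy + half + 1, cx - half:cx + half + 1].astype(float)
    shift = np.zeros(2)
    # Each channel: (template image, search image, square root of weight 1/sigma^2)
    channels = [(int1, int2, 1.0 / sigma_int), (rng1, rng2, 1.0 / sigma_rng)]
    for _ in range(iters):
        A_blocks, l_blocks = [], []
        for template, search, sqrt_w in channels:
            x, y = xs + shift[0], ys + shift[1]
            g = bilinear(search, x, y)
            # Central differences of the resampled search patch
            gx = 0.5 * (bilinear(search, x + 1, y) - bilinear(search, x - 1, y))
            gy = 0.5 * (bilinear(search, x, y + 1) - bilinear(search, x, y - 1))
            t = template[cy - half:cy + half + 1, cx - half:cx + half + 1]
            A_blocks.append(sqrt_w * np.column_stack([gx.ravel(), gy.ravel()]))
            l_blocks.append(sqrt_w * (t - g).ravel())
        # Stack both observation types into one weighted least squares system
        A = np.vstack(A_blocks)
        l = np.concatenate(l_blocks)
        delta, *_ = np.linalg.lstsq(A, l, rcond=None)
        shift += delta
        if np.linalg.norm(delta) < 1e-6:
            break
    return shift
```

Calling track_shift with the intensity and range images of two consecutive frames and an integer patch center (row, column) returns the estimated shift in pixels. Because gray values and ranges have different units, the two standard deviations decide how strongly each channel contributes to the common solution.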
2 SENSOR AND DATA 
RIM sensors (Figure 1) allow the simultaneous acquisition of intensity and range images of, in principle, any scene (Figure 2). In the field of RIM sensor technology, 3-D cameras are currently available with a sensor size of up to 25,000 pixels and a frame rate of up to 50 Hz. Based on a phase-measuring time-of-flight (TOF) principle, the camera is able to measure distances for each pixel in addition to the gray value information (Oggier et al., 2004). As a result, a spatiotemporally resolved representation of the object space is given in the form of intensity images and range maps. The calculation of 3-D coordinates is performed on-chip. Image coordinates as well as range information are transformed into Cartesian coordinates using the relationship between image and object space as described in Kahlmann and Ingensand (2006). Several assumptions are implied, which have to be verified by suitable photogrammetric calibration techniques (Kahlmann et al., 2006; Westfeld, 2007a).
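
The two steps named above, phase-based ranging and the conversion of image coordinates plus range into Cartesian coordinates, can be sketched as follows. This is a hedged illustration, not the relationship given in Kahlmann and Ingensand (2006): it assumes an ideal pinhole camera with the principal point at the sensor centre, no lens distortion, and ranges measured along the ray from the projection centre. The modulation frequency (20 MHz), focal length (8 mm) and pixel pitch (40 µm) are assumed example values, and the function names are hypothetical.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light in m/s

def phase_to_range(phase, f_mod=20e6):
    """Convert a measured phase shift (rad) into a distance (m).

    The signal travels to the object and back, hence the factor 4*pi
    instead of 2*pi. f_mod is an assumed modulation frequency.
    """
    return C_LIGHT * phase / (4.0 * np.pi * f_mod)

def range_to_xyz(rng, focal=8e-3, pitch=40e-6):
    """Convert a range image (m) into camera-frame XYZ coordinates (m).

    Assumptions: ideal pinhole camera, principal point at the sensor
    centre, no distortion, range measured along the viewing ray.
    """
    rows, cols = rng.shape
    v, u = np.mgrid[0:rows, 0:cols].astype(float)
    # Metric image coordinates relative to the sensor centre
    x = (u - (cols - 1) / 2.0) * pitch
    y = (v - (rows - 1) / 2.0) * pitch
    rays = np.stack([x, y, np.full_like(x, focal)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)  # unit viewing rays
    return rng[..., None] * rays
```

With the assumed 20 MHz modulation, a phase of 2π corresponds to the unambiguous range c / (2 f_mod), roughly 7.5 m. Real sensors deviate from this ideal model, which is why the calibration mentioned above is needed.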
* Corresponding author.
Advantages of this new 3-D mapping technology are the generation of 3-D data on a discrete raster without stereo compilation, the recording of motion sequences and the small dimensions of the device.
	        