D'Apuzzo, Nicola
LEAST SQUARES MATCHING TRACKING ALGORITHM
FOR HUMAN BODY MODELING
Nicola D'APUZZO*, Ralf PLAENKERS**, Pascal FUA**
*Swiss Federal Institute of Technology, Zurich, Switzerland
Institute of Geodesy and Photogrammetry (IGP)
nicola@geod.baug.ethz.ch
**Swiss Federal Institute of Technology, Lausanne, Switzerland
Computer Graphics Lab (LIG)
Ralf.Plaenkers@epfl.ch, Pascal.Fua@epfl.ch
Commission V Special Interest Working Group on “Animation”
KEYWORDS: Object Tracking, Image Sequences, CCD, Modeling, Animation, Least Squares Matching
ABSTRACT
In this paper we present a method to extract 3-D information of the shape and movement of the human body using video
sequences acquired with three CCD cameras. This work is part of a project aimed at developing a highly automated
system to model human bodies as realistically as possible from video sequences. Our image acquisition system is currently
composed of three synchronized CCD cameras and a frame grabber which acquires a sequence of triplet images. From
the video sequences, we extract two kinds of 3-D information: a three-dimensional surface measurement of the visible
body parts for each triplet and 3-D trajectories of points on the body. Our approach to surface measurement is based on
multi-image matching, using the adaptive least squares method. A semi-automated matching process determines a dense
set of corresponding points in the triplets, starting from a few manually selected seed points. The tracking process is also
based on least squares matching techniques, thus the name LSMTA (Least Squares Matching Tracking Algorithm). The
spatial correspondences between the three images of the different views and the temporal correspondences between
subsequent frames are determined with a least squares matching algorithm. The advantage of this tracking process is
twofold: firstly, it can track natural points, without using markers; secondly, it can also track entire surface parts on the
human body. In the latter case, the tracking process is applied to all the points matched in the region of interest. The result
can be seen as a vector field of trajectories (position, velocity and acceleration) which can be checked with thresholds
and neighborhood-based filters. The 3-D information extracted from the video sequences can be used to reconstruct the
animation model of the original sequence.
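
The check of the trajectory vector field with thresholds and neighborhood-based filters can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes that velocity and acceleration are obtained by finite differences over a constant frame interval dt, that a trajectory is rejected when its speed or acceleration exceeds a fixed bound, and that a trajectory is suspect when its mean speed differs from the median of its spatial neighbors by more than a tolerance. All function names, parameters and numeric values are hypothetical.

```python
from statistics import median


def norm(v):
    """Euclidean length of a 3-D vector."""
    return sum(c * c for c in v) ** 0.5


def finite_differences(track, dt):
    """Velocity and acceleration vectors of a list of 3-D positions.

    Assumption: positions are sampled at a constant frame interval dt.
    """
    vel = [tuple((b - a) / dt for a, b in zip(p, q))
           for p, q in zip(track, track[1:])]
    acc = [tuple((b - a) / dt for a, b in zip(v, w))
           for v, w in zip(vel, vel[1:])]
    return vel, acc


def passes_thresholds(track, dt, max_speed, max_accel):
    """True if every speed and acceleration stays within the given bounds."""
    vel, acc = finite_differences(track, dt)
    return (all(norm(v) <= max_speed for v in vel)
            and all(norm(a) <= max_accel for a in acc))


def neighborhood_outliers(tracks, dt, radius, tolerance):
    """Indices of tracks whose mean speed disagrees with their neighbors.

    Neighbors are the tracks whose first position lies within `radius`.
    A track is flagged when its mean speed differs from the median mean
    speed of its neighbors by more than `tolerance`.
    """
    mean_speed = []
    for t in tracks:
        vel, _ = finite_differences(t, dt)
        mean_speed.append(sum(norm(v) for v in vel) / len(vel))
    outliers = []
    for i, t in enumerate(tracks):
        neighbors = [
            mean_speed[j] for j, u in enumerate(tracks)
            if j != i
            and norm(tuple(a - b for a, b in zip(t[0], u[0]))) <= radius
        ]
        if neighbors and abs(mean_speed[i] - median(neighbors)) > tolerance:
            outliers.append(i)
    return outliers


# Hypothetical data: four nearby points tracked over three frames.
tracks = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)],
    [(0.0, 0.5, 0.0), (5.0, 0.5, 0.0), (10.0, 0.5, 0.0)],  # moves too fast
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)],
]
suspect = neighborhood_outliers(tracks, dt=1.0, radius=2.0, tolerance=1.0)
```

In this example the third trajectory moves five times faster than its neighbors, so both the speed threshold and the neighborhood test reject it. The median is used here because it is not pulled toward the outlier itself when that outlier is among the neighbors of a valid trajectory.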
1 INTRODUCTION
The approach to human body modeling is usually split into two different cases: the static 3-D model of the body and the
3-D model of the motion. For pure animation purposes or definition of virtualized worlds, where the shape of the human
body is first defined and then animated (Badler 2000, Badler et al. 1999, Boulic et al. 1997, Gavrila et al. 1996), only an
approximate measurement is required. In contrast, an exact 3-D measurement of the body is required in medical
applications (Bhatia et al. 1994, Commean et al. 1994, Youmei 1994) or in the manufacturing of objects which have to be
fitted to a specific person or group of persons, for example in the space and aircraft industry for the design of seats and
suits (McKenna 1999, Boeing Human Modeling System) or more generally in the clothing or car industry (Certain et al. 1999,
Bradtmiller et al. 1999, Jones et al. 1993, CyberDressForms). Recently, anthropometric databases have been defined
(Pauget et al. 1999, Robinette et al. 1999). Besides the shape information, they also contain other records of the person,
which can be used for commercial or research purposes (McKenna 1999). In recent years, the demand for 3-D models of
human bodies has drastically increased in all these applications. The approaches currently used for building such models
are laser scanners (Daanen et al. 1997, Cyberware), structured light methods (Bhatia et al. 1994, Youmei 1994), infrared
light scanners (Horiguchi 1998) and photogrammetry (Vedula et al. 1998). Laser scanners are quite standard in human
body modeling, because of their ease of use, the acquired expertise (Brunsman et al. 1997) and the related
market of modeling software (Burnsides 1997). Structured light methods are well known and used for industrial
measurement to capture the shape of parts of objects with high accuracy (Wolf 1996, GOM). The acquisition time of both
laser scanner and structured light systems ranges from a couple of seconds to half a minute. In case of human body
164 International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B5. Amsterdam 2000.