HUMAN SHAPE AND MOTION RECOVERY USING ANIMATION MODELS
P. Fua, L. Herda, R. Plänkers, and R. Boulic
Computer Graphics Lab (LIG), EPFL
CH-1015 Lausanne, Switzerland
{Pascal.Fua,Ralf.Plaenkers,Lorna.Herda,Ronan.Boulic}@epfl.ch
ABSTRACT
Deriving human body shape and motion from optical or magnetic motion-capture data is an inherently difficult task. The
body is very complex and the data is rarely error-free and often incomplete. The task becomes even more difficult if one
attempts to use video data instead, because it is much noisier.
In the last few years, we have developed techniques that use sophisticated human animation models to fit such noisy and incomplete data, which is acquired using a variety of devices ranging from sophisticated optical motion-capture systems to ordinary video cameras. We use facial and body animation models not only to represent the data, but also to guide the fitting process, thereby substantially improving performance.
1 INTRODUCTION
In this paper, we show that we can effectively use sophisticated human animation models to fit noisy and incomplete data acquired using a variety of methods. So far, these methods include optical motion-capture systems, calibrated sets of images, and uncalibrated video sequences. In all cases, the models are used throughout the tracking and fitting processes to increase robustness. In the future, we intend to extend this approach to an even larger set of acquisition devices.
Optical Motion Capture: It has proved to be an extremely effective means of replicating human movements. It has been successfully used to produce feature-length films such as "Titanic," which features hundreds of digital passengers with such a level of realism that they are indistinguishable from real actors. The most critical element in the creation of digital humans was the replication of human motion: "No other aspect was as apt to make or break the illusion" (Titanic Special Reprint, 1997). Optical motion capture offers a very effective solution to this problem and provides an impressive ability to replicate gestures. Strolling adults, children at play, and other lifelike activities have been recreated in this manner. The issues are slightly different for game-oriented motion capture. Capturing subtleties is less important because games focus more on big, broad movements. What matters more is the robustness of the reconstruction process and the amount of human intervention that is required.
In this last respect, commercially available motion capture systems are still far from perfect. Even with a highly professional system, there are many instances where crucial markers are occluded or where the algorithm confuses the trajectory of one marker with that of another. This requires much editing work on the part of the animator before the virtual characters are ready for their screen debuts. To remedy this weakness, we have proposed the use of a sophisticated anatomic human model to increase the method's reliability (Herda et al., 2000).
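To make the idea concrete, the following sketch shows one simple way in which a skeletal model can help resolve marker ambiguities: marker positions predicted from the current model pose are matched to the observed ones within a distance gate, and anything left unmatched is treated as occluded rather than forced onto a wrong trajectory. This is only an illustrative sketch in Python; the function name label_markers and the threshold max_dist are our own assumptions and do not reproduce the actual algorithm of (Herda et al., 2000).

import numpy as np

def label_markers(predicted, observed, max_dist=0.05):
    """Greedy, distance-gated matching of observed marker positions to
    the marker locations predicted from the current skeleton pose.

    predicted : (M, 3) array of model-predicted marker positions
    observed  : (N, 3) array of reconstructed 3-D marker positions
    Returns an array of length N holding the index of the matched model
    marker, or -1 for markers left unmatched (treated as occluded or
    spurious rather than being assigned to the wrong trajectory).
    """
    labels = -np.ones(len(observed), dtype=int)
    taken = set()
    # Handle the most confident (closest) observations first.
    order = np.argsort([np.linalg.norm(predicted - o, axis=1).min() for o in observed])
    for i in order:
        dists = np.linalg.norm(predicted - observed[i], axis=1)
        for j in np.argsort(dists):
            if dists[j] > max_dist:
                break              # outside the gate: leave unlabeled
            if j not in taken:
                labels[i] = j
                taken.add(j)
                break
    return labels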
Video-Based Modeling: The use of markers also tends to make such systems cumbersome. Videogrammetry is therefore an attractive alternative: It uses a cheap sensor and allows not only "markerless" tracking but also precise body-shape modeling. In this work, we combine stereo data and body outlines. These two sources of information are complementary: The former works best when a body part faces two or more of the cameras but becomes unreliable where the surface slants away, which is precisely where silhouettes can be used.
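To illustrate how the two data sources can be combined, the sketch below stacks both kinds of residuals into a single vector that a standard nonlinear least-squares solver can then minimize over the model parameters. The names surface_fn, stereo_pts, silhouette_rays, and the weight w_sil are illustrative assumptions, not the exact formulation used in our fitting code.

import numpy as np

def combined_residuals(theta, stereo_pts, silhouette_rays, surface_fn, w_sil=1.0):
    """Stack stereo and silhouette residuals for a single least-squares fit.

    surface_fn(theta) : returns a sampled body surface as an (S, 3) array
    stereo_pts        : (P, 3) array of 3-D points obtained by stereo matching
    silhouette_rays   : list of (origin, direction) pairs, one per outline
                        pixel, with unit-length direction vectors
    """
    surf = surface_fn(theta)
    # Stereo term: distance from each 3-D data point to the nearest surface sample.
    r_stereo = [np.linalg.norm(surf - p, axis=1).min() for p in stereo_pts]
    # Silhouette term: distance from each outline ray to the surface; for a
    # correct pose and shape the ray should graze the body, so this vanishes.
    r_sil = []
    for origin, direction in silhouette_rays:
        t = (surf - origin) @ direction             # position along the ray
        closest = origin + np.outer(t, direction)   # closest ray point to each sample
        r_sil.append(np.linalg.norm(surf - closest, axis=1).min())
    return np.concatenate([r_stereo, w_sil * np.array(r_sil)])

A generic solver such as scipy.optimize.least_squares can then minimize combined_residuals with respect to the pose and shape parameters theta.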
However, image-based data is often noisy and incomplete. Again, we use the animation models not only to represent the data, but also to guide the fitting process, thereby substantially improving performance for both face and body modeling. Given ordinary uncalibrated video sequences of heads, we can robustly register the images and produce high-quality, realistic models that can then be animated (Fua, 2000). The required manual intervention is reduced to supplying the locations of five key 2-D feature points in one image. For bodies, we recover complex 3-D motions by fitting our articulated 3-D models to the image data (D'Apuzzo et al., 1999; Plänkers et al., 1999).
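Both the marker-based and the video-based approaches ultimately rely on a kinematic skeleton that maps joint angles to 3-D positions. The short sketch below shows this mapping for a single chain of revolute joints; the chain representation and function names are illustrative assumptions rather than the actual data structures of our animation models.

import numpy as np

def joint_transform(axis, angle, offset):
    """Homogeneous transform of one revolute joint: a rotation by `angle`
    about the unit vector `axis` (Rodrigues' formula) followed by the fixed
    translation `offset` to the next joint."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = offset
    return T

def forward_kinematics(angles, chain):
    """Compose the joint transforms along a kinematic chain and return the
    world position of every joint; `chain` is a list of (axis, offset) pairs."""
    T = np.eye(4)
    positions = []
    for (axis, offset), angle in zip(chain, angles):
        T = T @ joint_transform(axis, angle, offset)
        positions.append(T[:3, 3].copy())
    return np.array(positions)

Fitting then amounts to adjusting the joint angles (and, for shape recovery, the surface parameters attached to each bone) so that the predicted positions agree with the image observations.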