S = \{ [x, y, z]^T \in \mathbb{R}^3 \mid F(x, y, z) = T \},   (2)
F(x, y, z) = \sum_{i=1}^{N} f_i(x, y, z),   (3)
f_i(x, y, z) = \exp(-2 d_i(x, y, z)),   (4)
where d_i represents the algebraic ellipsoidal distance
described below. For simplicity's sake, in the remainder of
the paper, we will omit the index i for specific metaball
sources wherever the context is unambiguous.
2.3 3-D Quadratic Distance Function
We use ellipsoidal primitives because they are simple and,
at the same time, allow accurate modeling of human limbs
with relatively few primitives: metaballs blend into a
smooth surface, which keeps the number of parameters low.
To express simply the transformations of these implicit
surfaces that are caused by their attachment to an
articulated skeleton, we write the ellipsoidal distance
function d of Eq. 4 in matrix notation as follows. For a specific
metaball and a state vector Θ, we define the 4 × 4 matrix
Q_\Theta = L_{\Theta_w} \cdot C_{\Theta_w},   (5)
where L and C encode the radii and position of the primitive,
respectively. The skeleton-induced transformation S_Θ is
introduced as the rotation-translation matrix from the world
frame to the frame to which the metaball is attached. These
matrices will be formally defined in the appendix.
Given the Q_Θ and S_Θ matrices, we combine the quadric
and the articulated skeleton transformations by writing the
distance function of Eq. 4 as:
d(x, \Theta) = x^T \cdot S_\Theta^T \cdot Q_\Theta^T \cdot Q_\Theta \cdot S_\Theta \cdot x.   (6)
This formulation will prove key to effectively computing
the Jacobians required to implement the optimization scheme
of Section 4.
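As an illustrative sketch of why the product form helps, the derivatives with respect to a single state parameter can be written out directly. The symbol \theta_k, standing for a generic component of Θ, is introduced here only for illustration; the derivation assumes nothing beyond Eqs. 3, 4 and 6 and does not reproduce the appendix.

```latex
\begin{align*}
d(\mathbf{x}, \Theta) &= \| Q_\Theta S_\Theta \mathbf{x} \|^2, \\
\frac{\partial d}{\partial \theta_k}
  &= 2\, (Q_\Theta S_\Theta \mathbf{x})^T
     \left( \frac{\partial Q_\Theta}{\partial \theta_k} S_\Theta
          + Q_\Theta \frac{\partial S_\Theta}{\partial \theta_k} \right) \mathbf{x}, \\
\frac{\partial F}{\partial \theta_k}
  &= -2 \sum_{i} f_i \, \frac{\partial d_i}{\partial \theta_k}.
\end{align*}
```

Under these assumptions, each derivative reduces to products of the same two matrices and their parameter derivatives.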
We can now compute the global field function F of Eq. 3
by plugging Eq. 6 into the individual field functions of
Eq. 4 and adding up these fields for all primitives. In other
words, the field function from which the model surface is
derived can be expressed in terms of the Q_Θ and S_Θ
matrices, and so can its derivatives. These matrices will
therefore constitute the basic building blocks of our
optimization scheme's implementation.
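The evaluation chain of Eqs. 3, 4 and 6 can be sketched in a few lines of Python with NumPy. The exact forms of L and C are deferred to the appendix, so the ones used here are assumptions made only for illustration: L is taken as a diagonal matrix of inverse radii with a zero in the homogeneous slot, and C as a translation that moves the primitive's center to the origin. The function names are likewise hypothetical.

```python
import numpy as np

def quadric_matrix(radii, center):
    """Q = L @ C (Eq. 5), using ASSUMED forms of L and C (not from the appendix)."""
    lx, ly, lz = radii
    # Assumption: L scales by the inverse radii and zeroes the homogeneous coordinate.
    L = np.diag([1.0 / lx, 1.0 / ly, 1.0 / lz, 0.0])
    # Assumption: C translates the primitive's center to the origin.
    C = np.eye(4)
    C[:3, 3] = -np.asarray(center, dtype=float)
    return L @ C

def distance(x, Q, S):
    """d(x, Theta) = x^T S^T Q^T Q S x (Eq. 6) for a 3-D point x."""
    xh = np.append(np.asarray(x, dtype=float), 1.0)  # homogeneous coordinates
    v = Q @ S @ xh
    return float(v @ v)

def field(x, primitives):
    """F(x) = sum_i exp(-2 d_i(x)) (Eqs. 3 and 4); primitives is a list of (Q, S) pairs."""
    return sum(np.exp(-2.0 * distance(x, Q, S)) for Q, S in primitives)

# Usage: one ellipsoid with radii (1, 2, 3) centered at (0.5, 0, 0), identity skeleton transform.
Q = quadric_matrix((1.0, 2.0, 3.0), (0.5, 0.0, 0.0))
S = np.eye(4)
d_on_surface = distance((1.5, 0.0, 0.0), Q, S)  # point one radius from the center along x
F_value = field((1.5, 0.0, 0.0), [(Q, S)])
```

With these assumed matrices, d equals 1 on the ellipsoid surface, so a single primitive contributes exp(-2) to F there.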
3 DATA ACQUISITION
From the video sequences, two kinds of 3-D information
are extracted: a three-dimensional surface measurement
of the visible parts of the human body for each time step,
and 3-D trajectories of points on the body. The process
consists of the following three steps:
• Data Acquisition and Calibration: The image
acquisition system consists of three synchronized
progressive scan CCD cameras arranged in a triangle
in front of the subject. The cameras are connected
to a frame grabber which digitizes the images
at a resolution of 640×480 pixels with 8-bit
quantization.
The system is precalibrated using a 3-D reference
frame with signalized points and then finely calibrated
using thorough bundle adjustment techniques. The re-
sults of the calibration process are the exterior orien-
tation of the cameras, the parameters of the interior
orientation, the parameters for the symmetric radial
and decentering distortion of the lenses and two ad-
ditional parameters modeling differential scaling and
shearing effects. A thorough determination of these
parameters is required to achieve high accuracy in the
measurement.
• Matching Process and 3-D Point Cloud: Our
approach for the matching process [D’Apuzzo, 2002]
is based on the adaptive least squares method with
the additional geometrical constraint that the matched
point lie on the epipolar line. Starting from a few
seed points, the matcher produces a dense set of
corresponding points relatively fast: on a Pentium III
600 MHz machine, about 20,000 points are matched
in approximately 10 minutes.
The seed points are generated automatically by applying
the Foerstner interest operator to the template
image to determine points where the matching process
may perform robustly; the corresponding points in the
other two images are then established automatically
by searching for the best matching results along the
epipolar line.
The template image is then divided into polygonal re-
gions according to which of the seed points is closest
(Voronoi tessellation). Starting from the seed points,
the set of corresponding points grows automatically
by sequential horizontal and vertical shifts, until the
entire polygonal region is covered. If the quality of
the match is not satisfactory, the algorithm works
adaptively by changing parameters (e.g. smaller shift,
larger patch size). The process is repeated for each
polygonal region until the whole image is covered.
The result is a dense set of corresponding points in the
three images. The 3-D coordinates of the matched
points are computed by forward ray intersection us-
ing the orientation and calibration data of the cam-
eras. The mean achieved accuracy of the 3-D points
is about 2 mm.
• Surface Tracking: The tracking process [D'Apuzzo
et al., 2000] is also based on least squares matching