Figure 1: The first stage observes a facial image
sequence with markers. The trace P of the markers is
measured by a tracking program. In our experiment,
thirteen points are located around the lips. A marker
is a white square (10 mm x 10 mm) sticker with a
black circle (2 mm) printed in its center.
the user, to a sample face image set of subjects.
Instead of an image set, we prepare an image
sequence that captures a facial motion. We put n
markers on the face of a subject (Fig. 1). A tracking
program traces the two-dimensional positions of the
markers on the camera plane. A 2n-length vector $p_i$
denotes the locations of these n points at time i. An
$N_x \times N_y$ vector $f_i$ holds the gray-scale values
of the pixels around the lips (Fig. 2), where
$N_x \times N_y$ is the size of the area.
Matrices F and P are the time-series measurements
of the vectors $f_i$ and $p_i$ with their expectation
values $\bar{f}$ and $\bar{p}$ subtracted.
$$F = \{ f_1 - \bar{f}, \ldots, f_n - \bar{f} \}$$
$$P = \{ p_1 - \bar{p}, \ldots, p_n - \bar{p} \}$$
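The construction of the mean-subtracted matrices F and P can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the helper name `center_columns` and the random toy data are assumptions introduced here.

```python
import numpy as np

def center_columns(samples):
    """Stack per-frame vectors as columns and subtract their mean.

    samples: array of shape (num_frames, dim), one row per frame.
    Returns (centered, mean); centered has shape (dim, num_frames).
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)          # expectation over the sequence
    centered = (samples - mean).T        # one column per time index i
    return centered, mean

# Hypothetical toy data: 5 frames, 3 markers -> 6-length position vectors.
rng = np.random.default_rng(0)
p_frames = rng.normal(size=(5, 6))
P, p_bar = center_columns(p_frames)
```

The same helper would apply to the image vectors to obtain F and the mean image.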
If we suppose that f and p are controlled by a latent
parameter set x that governs the facial affine
model, we can model the mechanism by the following
equations.
$$f = M_F x + \bar{f}$$
$$p = M_P x + \bar{p}$$
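Consequently, the stacked matrix of the image sequence
F and the marker trace P can be factorized as follows.
$$\begin{bmatrix} F \\ P \end{bmatrix} = \begin{bmatrix} M_F \\ M_P \end{bmatrix} [x_1, \ldots, x_n] = \begin{bmatrix} U_F \\ U_P \end{bmatrix} \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V^T$$
The last expression on the right-hand side is the
singular value decomposition of $[F^T\ P^T]^T$.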
Once $U_F$, $U_P$, $\bar{f}$, and $\bar{p}$ are determined, we can
compute an estimate $p'$ of the marker positions for an
input image $f'$ using the result of the above singular
value decomposition.
$$p' = U_P U_F^{+} (f' - \bar{f}) + \bar{p},$$
where $U_F^{+}$ is the generalized inverse of $U_F$. Thus, it is
possible to estimate the position of the markers from
Figure 2: A vector f of gray-level values is measured
within the area indicated by a square. To reduce
computational cost, the pixel resolution of the image
is reduced at a ratio of 1:4.
[Figure 3 diagram. Learning Image with Markers: Camera, Face; Trace of Markers (MP) and Facial Image (MF); Observation Matrix; SVD of Matrix; Estimation Matrix. Motion Estimation without Markers: Camera, Face; Facial Image (MF); Estimation of Marker Position.]
Figure 3: Block diagram of the process. In the first
(learning) stage, an image vector $f_i$ of the face is
measured together with the positions $p_i$ of the markers.
Singular value decomposition is applied to the
sequence {i = 1, ..., n}. Using the result, the algorithm
estimates virtual marker positions from f alone.
camera input without markers. In this method, we do
not use an explicit representation of x.
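The two stages (learning with markers, then estimation without them) can be sketched in NumPy as below. This is an illustration under stated assumptions rather than the authors' code: the function name `learn_estimator`, the explicit `rank` truncation of the singular vectors, the choice of the Moore-Penrose pseudoinverse as the generalized inverse, and the synthetic linear data are all introduced here.

```python
import numpy as np

def learn_estimator(f_frames, p_frames, rank):
    """Learn a linear map from image vectors to marker positions.

    f_frames: (num_frames, image_dim) gray-level vectors, one row per frame.
    p_frames: (num_frames, 2 * num_markers) marker coordinates per frame.
    rank: number of singular vectors kept (an assumption of this sketch).
    """
    f_bar = f_frames.mean(axis=0)
    p_bar = p_frames.mean(axis=0)
    F = (f_frames - f_bar).T              # image_dim x num_frames
    P = (p_frames - p_bar).T              # 2*num_markers x num_frames

    # SVD of the stacked observation matrix [F; P].
    U, _, _ = np.linalg.svd(np.vstack([F, P]), full_matrices=False)
    U = U[:, :rank]
    U_F, U_P = U[:F.shape[0]], U[F.shape[0]:]

    # Estimation matrix U_P U_F^+ (Moore-Penrose pseudoinverse assumed).
    W = U_P @ np.linalg.pinv(U_F)

    def estimate(f_new):
        # p' = U_P U_F^+ (f' - f_bar) + p_bar
        return W @ (f_new - f_bar) + p_bar

    return estimate

# Synthetic check: data generated by an exactly linear model with a
# 3-dimensional hidden parameter x (hypothetical sizes and values).
rng = np.random.default_rng(0)
k, image_dim, pos_dim, num_frames = 3, 20, 6, 30
M_F = rng.normal(size=(image_dim, k))
M_P = rng.normal(size=(pos_dim, k))
f0 = rng.normal(size=image_dim)
p0 = rng.normal(size=pos_dim)
x = rng.normal(size=(num_frames, k))
f_frames = x @ M_F.T + f0
p_frames = x @ M_P.T + p0

estimate = learn_estimator(f_frames, p_frames, rank=k)
x_new = rng.normal(size=k)
p_est = estimate(M_F @ x_new + f0)        # image only, no markers
p_true = M_P @ x_new + p0
```

In this sketch the parameter x is used only to generate the toy data; the estimator itself never represents it explicitly. On real images the data are not exactly linear, so the choice of `rank` would have to be made empirically.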
In Covell's paper, the index i indicates the subject
number. They correlate faces {f} and control points
{p} through $U_F$ and $U_P$. Instead, we correlate the
perioral image with the locations of feature points.
Figure 3 summarizes our process. The image sequence
used in the first (learning) stage must span a
sufficient orthonormal basis for the estimation to
function in the second stage. If the sequence is
insufficient, the output of the second stage will be
incomplete.
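One possible way to gauge whether a learning sequence spans enough of a basis is to inspect the singular values of the stacked observation matrix. The sketch below is a diagnostic introduced here for illustration, not a step described in the paper; the function name `effective_rank` and the 99% energy threshold are arbitrary assumptions.

```python
import numpy as np

def effective_rank(stacked, energy=0.99):
    """Number of singular values needed to reach a fraction of the
    total squared singular-value energy of the stacked matrix."""
    s = np.linalg.svd(stacked, compute_uv=False)
    total = np.sum(s ** 2)
    if total == 0.0:
        return 0                          # all-zero matrix spans nothing
    cumulative = np.cumsum(s ** 2) / total
    # First index whose cumulative energy reaches the threshold.
    return int(np.searchsorted(cumulative, energy) + 1)

# Hypothetical example: a sequence with only two independent motions.
two_motions = np.diag([3.0, 3.0, 0.0])
rank_found = effective_rank(two_motions)
```

A sequence whose effective rank is small compared with the variety of motions expected at estimation time would be a sign that the learning stage saw too little variation.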
3. EXPERIMENTS
We observed the face with a camera located in front of a
subject. Thirteen markers are placed on the sub-
Figure 4:
image seq
is 640x 46
lips. (b)
148x84 i:
ject’s fac
program
trajectori
We te:
e Hor
e Che
Each.
digitized
sampling
the seque
8 seconds
square ar
used in t]
for f. T
37x21.
Figur
position
estimatic
are used
of these
frame 20
horizont:
images. '
labeled 2
Figur
applied t
output a
tication.