International Archives of Photogrammetry and Remote Sensing. Vol. XXXII, Part 5. Hakodate 1998
FACIAL ANIMATION FROM SEVERAL IMAGES
Yasuhiro MUKAIGAWA†
Yuichi NAKAMURA‡
Yuichi OHTA‡
† Department of Information Technology, Faculty of Engineering, Okayama University
3-1-1 Tsushima-naka, Okayama, 700-8530 JAPAN
E-mail: mukaigaw@chino.it.okayama-u.ac.jp
‡ Institute of Information Sciences and Electronics, University of Tsukuba
1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8573 JAPAN
Commission V, Working Group SIG
KEY WORDS: facial animation, facial expression, image-based rendering
ABSTRACT
We propose a novel method for synthesizing facial animation with 3-D pose and expression changes.
In animation synthesis, one of the most important issues has been realistic face generation. Conventional
methods based on a 3-D facial model, however, have not achieved natural face synthesis that reproduces
the details and delicate changes of facial expressions.
In our method, a facial image is synthesized directly from multiple input images without explicit
reconstruction of the 3-D facial shape. Since this method uses the actual images, realistic facial animation
that preserves detailed facial features can be synthesized. A linear combination of multiple poses
realizes the 3-D geometric appearance changes, and texture blending is used for the smooth
surface texture changes. Both poses and expressions can be treated in the same framework in our
method, while they are handled separately in conventional methods.
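The two operations named in the abstract, linearly combining the geometry of several poses and blending their textures with the same weights, can be sketched as follows. This is only an illustrative sketch under assumed data layouts (feature points as (N, 2) arrays, textures as aligned grayscale images); the function names and array shapes are ours, not the paper's implementation.

```python
import numpy as np

def synthesize_shape(base_shapes, weights):
    """Linearly combine the 2-D feature-point sets of several base poses.

    base_shapes: list of (N, 2) arrays of facial feature coordinates,
                 one per input image (assumed layout).
    weights:     coefficients of the linear combination, one per pose.
    """
    base = np.stack(base_shapes)             # (P, N, 2)
    w = np.asarray(weights, dtype=float)     # (P,)
    # Contract the pose axis: sum_p w[p] * base[p] -> (N, 2) geometry.
    return np.tensordot(w, base, axes=1)

def blend_textures(textures, weights):
    """Cross-dissolve aligned texture images with the same weights."""
    tex = np.stack([t.astype(float) for t in textures])    # (P, H, W)
    w = np.asarray(weights, dtype=float).reshape(-1, 1, 1)
    return (w * tex).sum(axis=0)                           # (H, W)
```

With weights that sum to one, both functions interpolate among the input poses; the point of the paper's framework is that the same weight vector drives geometry and texture together.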
1 INTRODUCTION
A human face conveys various kinds of information, such as
individuality and emotion. Techniques for generating
facial animation have been studied for many applications,
such as man-machine interfaces and movies.
However, a face is one of the most difficult objects for
image synthesis, because we are extremely sensitive
to differences between real face images and synthesized
face images. In this paper, we deal with both
pose and expression changes, and aim to synthesize
realistic facial images that are almost indistinguishable
from real images.
Model-based rendering has usually been used for
this purpose. A 3-D shape model of a human head
is often used, and the shape is deformed according to
the facial expression.
The 3-D shape model can be reconstructed from
several images by structure from motion (Ullman S.,
1979), but the reconstructed model usually includes
some errors, so the synthesized image becomes unnatural.
Acquiring an accurate model is difficult
without special devices such as a high-precision range
finder (Akimoto T., 1993).
As a facial expression model, FACS (Facial Action
Coding System) is often used (Ekman P., 1997).
A facial expression is described as a combination of
AUs (Action Units). A facial image with an expression
is synthesized by the deformation defined for each
AU, but it is difficult to simulate details such as
wrinkles.
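The FACS-style synthesis described above, deforming a neutral face by a weighted sum of per-AU displacements, can be sketched minimally as follows. The data structures (per-AU displacement fields over feature points, intensities in [0, 1]) are our own illustrative assumptions, not a faithful FACS implementation.

```python
import numpy as np

def apply_aus(neutral, au_displacements, intensities):
    """Deform a neutral feature-point set by a weighted sum of AU
    displacement fields (illustrative sketch of FACS-style synthesis).

    neutral:          (N, 2) neutral-face feature coordinates.
    au_displacements: dict mapping AU id -> (N, 2) displacement field.
    intensities:      dict mapping AU id -> activation weight in [0, 1].
    """
    shape = neutral.astype(float).copy()
    for au, disp in au_displacements.items():
        # An inactive AU (intensity 0) leaves the geometry unchanged.
        shape += intensities.get(au, 0.0) * disp
    return shape
```

This kind of geometric deformation is exactly what the paper criticizes: it moves feature points plausibly but cannot reproduce fine texture details such as wrinkles.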
Thus, synthesized images are still far from real
face appearance, as can be seen in many applications.
Even small modeling errors cause undesirable effects
in the synthesized images.
On the other hand, there is another paradigm
called image-based rendering, which aims to synthesize
realistic images by using textures from real images.
For example, the view morphing method (Seitz
S.M., 1996) easily generates intermediate views
between two actual views.
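The core idea of generating an intermediate view between two actual views can be sketched as follows. Note that this simplification only interpolates corresponding feature points and cross-dissolves intensities; full view morphing as in Seitz's method additionally prewarps (rectifies) the images so that the interpolation is physically valid, a step omitted here.

```python
import numpy as np

def intermediate_view(pts0, pts1, img0, img1, s):
    """Sketch of an intermediate view between two views, for s in [0, 1].

    pts0, pts1: (N, 2) corresponding feature points in the two views.
    img0, img1: aligned grayscale images of the same size.
    Returns the interpolated geometry and a cross-dissolved image.
    """
    pts = (1.0 - s) * pts0 + s * pts1                         # geometry
    img = (1.0 - s) * img0.astype(float) + s * img1.astype(float)
    return pts, img
```

At s = 0 the sketch reproduces the first view and at s = 1 the second; intermediate values of s give in-between appearances.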
In order to change the facial pose and expression
of an input image, Poggio et al. proposed several
methods related to image-based rendering.
[The remainder of this column is truncated in the source scan: the rest of Section 1, Section 2 (which refers to Fig. 1), and the beginning of Section 3.1 are illegible.]