Close-range imaging, long-range vision

  
however not reproduce the real face precisely. To solve this 
problem, some solutions (Lee and Magnenat-Thalmann, 2000) 
work in combination with range data acquired by laser scanners. 
Another image-based method consists of automatically 
extracting the contour of the head from a set of images acquired 
around the person (Matsumoto et al., 1999; Zheng, 1994). The 
obtained data are combined to form a volumetric model of the 
head. The set of images can be generated by moving a single 
camera around the head or by keeping the camera fixed and 
turning the face. These systems are fast and completely automatic, 
but the accuracy of the method is low. 
Video-sequence-based methods (Pighin et al., 1998; Fua, 2000; 
Liu et al., 2000; Shan et al., 2001) use photogrammetric 
techniques to recover stereo data from the images. A generic 
3-D face model is then deformed to fit the recovered (usually 
noisy) data. These techniques are fully automatic but may 
perform poorly on faces with unusual features or other 
significant deviations from the norm. 
High-accuracy measurement of real human faces can be 
achieved by photogrammetric solutions which combine a 
thorough calibration process with the use of synchronized CCD 
cameras to acquire multi-images simultaneously (Banda, 1992; 
D'Apuzzo, 1998; Minaku et al., 1999; Borghese and Ferrari, 
2000; D'Apuzzo, 2001). To increase the reliability and 
robustness of the results, some techniques project 
an artificial texture onto the face (Banda, 1992; D'Apuzzo, 1998). 
The high accuracy potential of this approach comes, however, at 
the cost of time-consuming processing. 
For our purposes, we are interested in an automatic system that 
measures the human face relatively fast and with high accuracy. 
We have therefore chosen a photogrammetric solution. Five 
synchronized CCD cameras are used to acquire multi-images of a 
human face simultaneously, and an artificial random texture is 
projected onto the face to increase the robustness of the 
measurement. The processing consists of five steps: acquisition 
of images of the face from different directions, determination of 
the camera positions and internal parameters, establishment of 
a dense set of corresponding points in the images, computation of 
their 3-D coordinates and generation of a surface model. Because 
all the required data are acquired simultaneously, the 
proposed method offers the additional opportunity to measure 
dynamic events. 
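
As an illustration of the fourth step (computing 3-D coordinates from corresponding points), the following minimal sketch intersects the viewing rays of several cameras in a least-squares sense. It is not taken from the paper: the function names `solve3` and `intersect_rays`, the camera centres and the target point are hypothetical, and the cameras are assumed to be already calibrated so that each image point yields a ray (centre plus direction) in object space.

```python
# Sketch under stated assumptions: least-squares intersection of viewing rays.
# Camera centres and the target point are made-up illustration values.

def solve3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / m[i][i]
    return x


def intersect_rays(centres, directions):
    """Return the point minimising the summed squared distance to all rays."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centres, directions):
        norm = sum(v * v for v in d) ** 0.5
        u = [v / norm for v in d]  # unit direction of the ray
        for i in range(3):
            for j in range(3):
                # Projector onto the plane perpendicular to the ray: I - u u^T
                p = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += p
                b[i] += p * c[j]
    return solve3(a, b)


# Hypothetical setup: five camera centres in front of a point on the face.
centres = [(-0.6, 0.0, 0.0), (-0.3, 0.2, 0.0), (0.0, 0.0, 0.0),
           (0.3, 0.2, 0.0), (0.6, 0.0, 0.0)]
target = (0.1, -0.05, 1.2)
directions = [tuple(t - c for t, c in zip(target, ctr)) for ctr in centres]

point = intersect_rays(centres, directions)
print(point)
```

With noise-free rays the recovered point coincides with the target up to rounding; with real, noisy correspondences the same normal equations give the point closest to all rays.
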
In this paper, we present the equipment used, the method and 
the achieved results. 
2. METHOD 
This section describes the system used for data acquisition and 
the method used for its calibration, and then outlines the methods 
for the measurement and modeling of the human face from the 
acquired multi-images. 
An advantage of our method is that the source data are acquired 
in fractions of a second, which allows human faces to be measured 
with high accuracy and makes it possible to 
measure dynamic events such as speech. Another advantage 
is that the developed software runs on a 
normal home PC, which reduces hardware costs. We are 
developing a portable, inexpensive and accurate system for the 
measurement and modeling of the human face. 
2.1 Data acquisition and calibration 
Figure 1 shows the setup of the image acquisition system. 
It consists of five CCD cameras arranged convergently in front 
of the subject. The cameras are connected to a frame grabber 
which digitizes the images acquired by the five cameras at a 
resolution of 768x576 pixels with 8-bit quantization. 
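
The figures above fix the amount of raw data per acquisition. The short calculation below is only a sketch: it assumes uncompressed storage with one byte per pixel, which the text does not state, and the names `image_bytes`, `per_image` and `per_acquisition` are illustrative.

```python
# Assumption: uncompressed storage, 8 bits = 1 byte per pixel.
WIDTH, HEIGHT = 768, 576   # resolution stated in the text
BITS_PER_PIXEL = 8         # quantization stated in the text
NUM_CAMERAS = 5            # number of CCD cameras stated in the text


def image_bytes(width, height, bits_per_pixel):
    """Raw size of one image in bytes."""
    return width * height * bits_per_pixel // 8


gray_levels = 2 ** BITS_PER_PIXEL
per_image = image_bytes(WIDTH, HEIGHT, BITS_PER_PIXEL)
per_acquisition = NUM_CAMERAS * per_image

print(gray_levels)       # 256
print(per_image)         # 442368
print(per_acquisition)   # 2211840
```

Under these assumptions one simultaneous acquisition of the five images amounts to roughly 2.2 MB of raw gray-level data.
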
[Diagram labels: face, CCD cameras, frame grabber, PC] 
Figure 1. Setup of cameras and projectors 
A color image of the face without random pattern projection is 
acquired by an additional color video camera placed in front of 
the subject. It is used to produce a photorealistic 
visualization. 
Since the natural texture of the human skin is relatively 
uniform, an artificial texture must be projected onto the face 
for the matching process to be robust. A random 
pattern (see figure 2) is preferred to regular patterns to avoid 
possible mismatches, and its resolution has to be fine enough to 
produce structures only a few pixels in size in the images. The use 
of two projectors enables a focused texture even on the lateral 
sides of the face; figure 3 shows the five images acquired by the 
CCD cameras. 
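
A random pattern with structures a few pixels in size can be sketched as follows. This is an illustration only, not the pattern actually projected in the paper: the binary gray values, the square cells, the `cell` size and the function name `random_pattern` are all assumptions.

```python
import random


def random_pattern(width, height, cell=3, seed=0):
    """Binary random pattern in which each random cell covers cell x cell pixels.

    Assumption: two gray values (0 and 255) and square cells; the real
    projected pattern may differ.
    """
    rng = random.Random(seed)
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    cells = [[rng.randint(0, 1) * 255 for _ in range(cols)]
             for _ in range(rows)]
    # Expand each cell to a block of pixels.
    return [[cells[y // cell][x // cell] for x in range(width)]
            for y in range(height)]


# Hypothetical usage: a small pattern printed as text for inspection.
pattern = random_pattern(64, 48, cell=3, seed=42)
for row in pattern[:6]:
    print("".join("#" if v else "." for v in row))
```

Because neighbouring cells are independent, a small image patch is unlikely to repeat elsewhere in the pattern, which is the property that makes a random texture less prone to mismatches than a regular one.
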
  
Figure 2. Projected random pattern 
  
Figure 3. Multi-images of a face with random pattern projection 