Figure 5 (a) and (b) show the labeled objects in the whole
binary image and in the clipped binary image around both
pupils.
(a) in the whole image
(b) in the clipped image around both pupils
Figure 5 Objects in binary image.
Because the clipped image contains only a few objects,
processing is faster. Experiments confirmed that both
pupils, moving with the head, were extracted continuously
without markers. The time interval between frames is about
0.09 sec (roughly 11 frames per second), which is fast
enough to track both pupils in real time.
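The effect of clipping on labeling can be illustrated with a minimal sketch. It assumes the binary image is a list of rows of 0/1 values and uses a plain 4-connected breadth-first labeling; the function name `label_objects`, the `window` parameter, and the toy image are hypothetical and are not taken from the system described here.

```python
from collections import deque

def label_objects(binary, window=None):
    """Label 4-connected foreground objects in a binary image.

    binary: list of rows containing 0/1 values.
    window: (top, left, bottom, right), half-open; None means the whole image.
    Returns (number of objects, dict mapping pixel -> label).
    """
    rows, cols = len(binary), len(binary[0])
    top, left, bottom, right = window if window else (0, 0, rows, cols)
    labels = {}
    count = 0
    for r in range(top, bottom):
        for c in range(left, right):
            if binary[r][c] and (r, c) not in labels:
                count += 1                      # start a new object
                labels[(r, c)] = count
                queue = deque([(r, c)])
                while queue:                    # flood-fill inside the window only
                    cr, cc = queue.popleft()
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                                   (cr, cc - 1), (cr, cc + 1)):
                        if (top <= nr < bottom and left <= nc < right
                                and binary[nr][nc] and (nr, nc) not in labels):
                            labels[(nr, nc)] = count
                            queue.append((nr, nc))
    return count, labels

# Toy image: the two isolated pixels in the middle row stand in for the pupils.
image = [
    [1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1],
]
whole_count, _ = label_objects(image)
clipped_count, _ = label_objects(image, window=(1, 1, 4, 6))
```

Only the pixels inside the window are visited, so both the number of labeled objects and the scanning cost shrink with the clipped area.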
3.2 Recognition of head motions using DP matching
Shaking, tilting, bending backward, bending forward to gaze
at the CRT display, keeping still, and nodding are taken as
the basic actions in perplexed situations. Here, nodding
refers to the head motion that accompanies frequent changes
of glance between the CRT display and the keyboard. To
capture these 6 motions from the image sequence, the
displacement velocities in 4 directions, namely x (right
and left), y (up and down), z (forth and back), and θ
(rotation about the z axis), are
calculated. The positions in the x and y directions are
obtained by calculating the coordinates of the middle point
$(x_0, y_0)$ between both pupils.

$x_0 = \frac{x_r + x_l}{2}$  (3)

$y_0 = \frac{y_r + y_l}{2}$  (4)
Since the distance between both pupils in the image
corresponds to the distance between the face and the
camera, we treat the Euclidean distance between both pupils
as the relative position $z_0$ in the z direction. The
rotation angle $\theta_0$ is calculated from the inclination
of the straight line connecting both pupils.

$z_0 = \sqrt{(x_r - x_l)^2 + (y_r - y_l)^2}$  (5)

$\theta_0 = \arctan\frac{y_r - y_l}{x_r - x_l}$  (6)
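Equations (3) to (6) can be sketched in a few lines, assuming the pupil centers are available as image coordinates $(x_r, y_r)$ and $(x_l, y_l)$. The function name `pupil_features` is hypothetical. As a deliberate substitution, the sketch uses `math.atan2` instead of a plain arctangent of the slope: the two agree when $x_r > x_l$, and `atan2` avoids a division by zero when the pupils are vertically aligned.

```python
import math

def pupil_features(right, left):
    """Return (x0, y0, z0, theta0) from the right and left pupil centers.

    right, left: (x, y) image coordinates of the two pupils.
    """
    x_r, y_r = right
    x_l, y_l = left
    x0 = (x_r + x_l) / 2.0                     # midpoint, x direction (Eq. 3)
    y0 = (y_r + y_l) / 2.0                     # midpoint, y direction (Eq. 4)
    z0 = math.hypot(x_r - x_l, y_r - y_l)      # inter-pupil distance (Eq. 5)
    theta0 = math.atan2(y_r - y_l, x_r - x_l)  # line inclination (Eq. 6, via atan2)
    return x0, y0, z0, theta0

# Example with made-up coordinates.
x0, y0, z0, theta0 = pupil_features(right=(7.0, 5.0), left=(3.0, 2.0))
```

Note that $z_0$ is only a relative measure: it grows as the face approaches the camera, but it is not a metric distance.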
Let $v_x[i]$, $v_y[i]$, $v_z[i]$, and $v_\theta[i]$ be the
displacement velocities in the x, y, z, and θ directions in
frame No. $i$, respectively. Then they are expressed as

$v_x[i] = x_0[i] - x_0[i-1]$  (7)

$v_y[i] = y_0[i] - y_0[i-1]$  (8)

$v_z[i] = z_0[i] - z_0[i-1]$  (9)

$v_\theta[i] = \theta_0[i] - \theta_0[i-1]$  (10)

where $x_0[i]$, $y_0[i]$, $z_0[i]$, and $\theta_0[i]$ are
$x_0$, $y_0$, $z_0$, and $\theta_0$ in frame No. $i$,
respectively. The template of each action is defined as a
sequence of characteristic vectors, each made up of these 4
components.
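The frame-to-frame differences that form the characteristic vectors can be sketched as follows. The sketch assumes the per-frame values of the x, y, z positions and the rotation angle are stored as 4-tuples in frame order; the function name `displacement_velocities` and the sample values are hypothetical.

```python
def displacement_velocities(features):
    """Return the per-frame characteristic vectors as first differences.

    features: list of 4-tuples (x, y, z, angle), one per frame.
    The result has one vector fewer than the input, because the first
    frame has no predecessor.
    """
    return [
        tuple(cur - prev for cur, prev in zip(features[i], features[i - 1]))
        for i in range(1, len(features))
    ]

# Example with made-up feature values for three consecutive frames.
frames = [(0.0, 0.0, 10.0, 0.0), (1.0, 2.0, 12.0, 0.5), (3.0, 1.0, 11.0, 0.25)]
velocities = displacement_velocities(frames)
```

Because the frame interval is roughly constant, the differences are used directly as velocities without dividing by the elapsed time.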
The template of each action is made as follows. One of
the subjects repeats each action. The relationship
between the frame number and the displacement velocity
in the 4 directions is plotted on the display for every
action. The most suitable part of the characteristic vector
sequence is picked out by hand for every action. Each
basic action takes from 1 to 2 sec. The template length is
set to 20 frames in consideration of the time interval
between frames and the time required for each basic action
(20 frames at about 0.09 sec per frame span about 1.8 sec).
Since the 4 components differ in magnitude, normalization
has to be carried out. For every component, the
displacement velocity with the largest absolute value over
all action pattern data is picked up, and all data of that
component are multiplied by the coefficient that converts
this maximum absolute value to 1. Figure 6 (a)-(f) show the
templates of the head motions.
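The normalization step described above can be sketched as a per-component scaling by the largest absolute value found over all action patterns. The function name `normalize_components`, the data layout (a list of vector sequences), and the sample numbers are assumptions made for illustration only.

```python
def normalize_components(patterns):
    """Scale each component so its largest absolute value over all data is 1.

    patterns: list of sequences; each sequence is a list of equal-length
              vectors (here 4 components: x, y, z, rotation).
    Returns (normalized patterns, list of per-component coefficients).
    """
    n = len(patterns[0][0])
    # Largest absolute value of each component over all action pattern data.
    max_abs = [max(abs(vec[k]) for seq in patterns for vec in seq)
               for k in range(n)]
    # Coefficient that maps that maximum to 1 (left at 1 if the component is all zero).
    coeffs = [1.0 / m if m > 0 else 1.0 for m in max_abs]
    normalized = [
        [tuple(vec[k] * coeffs[k] for k in range(n)) for vec in seq]
        for seq in patterns
    ]
    return normalized, coeffs

# Example with two short made-up patterns.
patterns = [
    [(2.0, -4.0, 1.0, 0.5), (1.0, 2.0, -1.0, 0.25)],
    [(-1.0, 1.0, 0.5, -0.5)],
]
normalized, coeffs = normalize_components(patterns)
```

The same coefficients would presumably also have to be applied to unknown input sequences before matching, so that inputs and templates share one scale; the text above does not state this explicitly.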
DP (Dynamic Programming) matching is used here as the
pattern matching method between an unknown input vector
sequence and the templates. The unknown input pattern T and
one of the template patterns, R, are expressed as

$T = a_1, a_2, a_3, \ldots, a_i, \ldots, a_I$

$R = b_1, b_2, b_3, \ldots, b_j, \ldots, b_J$
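where $a_i$ and $b_j$ are the characteristic vectors of
frame No. $i$ in pattern T and frame No. $j$ in pattern R,
respectively. [...] 20. Each [...]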
[Plot omitted; vertical axis: displacement velocity (relative value)]