sentation at video rate opens up a new class of applications for 3D
vision, such as real-time z keying (Kanade et al., 1995).

In visual media communication and display, it is often necessary to
merge a video signal from a real camera with a synthetic video signal
from computer graphics. A standard technique for merging video signals
is chroma keying, which is used, for example, in TV weather reports.
Figure 8 (a) illustrates the chroma keying method. A weatherman is
imaged by a real camera in front of a blue screen, and pixels that are
blue, that is, the portions of the scene that are not occluded by the
real objects, are replaced by the synthetic image. Thus, video merging
by chroma keying extracts a real world object and overlays its image on
the synthetic world. In other words, chroma keying assumes that a real
world object is always foreground.

The z key method we have developed is a new image keying technique
which uses pixel-by-pixel depth information (a depth map) of real
scenes. For each pixel, the z key switch compares the depth information
of the real and synthetic images, and routes the pixel value of the
image that is nearer to the camera. Thus we can determine the
foreground image for each pixel and create virtual images where parts
of the real and synthetic objects occlude each other correctly, as
illustrated in the output image of Figure 8 (b).

The critical capability for realizing this video-rate z keying is
video-rate pixel-by-pixel depth mapping of a real scene. We have used
the CMU video-rate stereo machine for the real-time z keying
demonstration.

5.1 The z key Method

Figure 9 shows an example of image merging by z keying. The z key
method requires four image inputs: a real image IR(i,j), its depth map
IRd(i,j), a synthetic image IS(i,j) and its depth map ISd(i,j), where
(i,j) are pixel coordinates. The synthetic image and its depth map are
typically created by some rendering software. We assume that a proper
ranging device provides the depth map IRd of the real scene to the
z key switch in real time. For each pixel with coordinates (i,j), the
z key switch compares the two depth images ISd(i,j) and IRd(i,j) and
uses the image that has the pixel nearer to the camera for its output
image IO(i,j). The output image IO(i,j) is thus described as:
$$
I_O(i,j) =
\begin{cases}
I_R(i,j) & \text{when } I_{Rd}(i,j) < I_{Sd}(i,j) \\
I_S(i,j) & \text{when } I_{Rd}(i,j) > I_{Sd}(i,j)
\end{cases}
\qquad (6)
$$
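The per-pixel switch of Equation (6) can be sketched in a few lines of Python. This is only an illustrative sketch, not the hardware switch described in the paper: the function name `z_key`, the use of nested lists as images, and the toy scanline are all assumptions made for the example. Equation (6) as printed does not say what happens when the two depths are equal, so the sketch assumes ties go to the synthetic image.

```python
def z_key(real, real_depth, synth, synth_depth):
    """Per-pixel z key switch following Equation (6).

    All four inputs are equally sized 2D lists indexed [i][j].
    A smaller depth value means the pixel is nearer to the camera.
    Assumption: on equal depths the synthetic pixel is used, since
    Equation (6) as printed leaves that case unspecified.
    """
    out = []
    for r_row, rd_row, s_row, sd_row in zip(real, real_depth, synth, synth_depth):
        out.append([r if rd < sd else s
                    for r, rd, s, sd in zip(r_row, rd_row, s_row, sd_row)])
    return out


# Hypothetical 1x5 scanline; strings stand in for pixel values.
# Depth ordering: hand (real) < lamp (synthetic) < body (real) < wall (synthetic).
real        = [["hand", "hand", "body", "body", "backdrop"]]
real_depth  = [[1.0,    1.0,    3.0,    3.0,    9.0]]
synth       = [["lamp", "wall", "lamp", "wall", "wall"]]
synth_depth = [[2.0,    4.0,    2.0,    4.0,    4.0]]

merged = z_key(real, real_depth, synth, synth_depth)
print(merged)  # [['hand', 'hand', 'lamp', 'body', 'wall']]
```

In the toy scanline the hand hides both the lamp and the wall, the lamp hides the body, and the body hides the wall, which is the kind of mutual occlusion that a single foreground/background key cannot express.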
As a result, real world objects can be placed in any desired
and correct relationship with respect to virtual world
objects. For example, in the output image of Figure 9, part
of the real object (e.g., a hand) occludes a virtual object
(e.g., a lamp), which in turn occludes a real object (e.g., a
body), which further occludes the virtual room wall.
In many cases extraction of the regions of objects in real
scenes has to precede z key switching. Such extraction
can be obtained by selecting pixels where corresponding
depth values are smaller than a certain threshold. Also,
chroma key or luminance key can be used prior to or in
conjunction with z key for object extraction.
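The depth-threshold extraction described above can be sketched as follows. The function names, the threshold value, and the choice to push rejected pixels to infinite depth (so that they can never win a nearest-pixel comparison) are assumptions made for illustration; the text does not specify how rejected pixels are handled.

```python
import math


def extract_object_mask(real_depth, threshold):
    """Mark pixels whose depth is smaller than the threshold as object pixels."""
    return [[d < threshold for d in row] for row in real_depth]


def mask_depth(real_depth, mask, far=math.inf):
    """Replace the depth of non-object pixels with `far`.

    Hypothetical convention: a pixel at infinite depth loses every
    nearest-to-camera comparison, so only extracted object pixels
    can appear in front of synthetic content.
    """
    return [[d if keep else far for d, keep in zip(d_row, m_row)]
            for d_row, m_row in zip(real_depth, mask)]


# Hypothetical depth scanline: two near object pixels, two far background pixels.
real_depth = [[1.2, 1.3, 6.0, 7.5]]
mask = extract_object_mask(real_depth, threshold=5.0)
masked = mask_depth(real_depth, mask)
print(mask)    # [[True, True, False, False]]
print(masked)  # [[1.2, 1.3, inf, inf]]
```

A chroma key or luminance key could supply the same kind of boolean mask, in which case it could be combined with the depth mask (for example with a logical AND) before the depth map is passed on.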
[Figure 8 panels: (a) a chroma key method (color camera, blue background, switching by chroma key); (b) a z key method (color camera, a pixel-by-pixel ranging device, switching by z key)]
Figure 8: An illustration of the difference between the chroma key
and z key methods.
Note that in the output of chroma keying a real object is
simply placed in front of synthetic objects, while in the out-
put of z keying various parts of real and synthetic objects
occlude each other correctly.
Figure 9: The Scheme for Z keying
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B5. Vienna 1996