anbul 2004
URBAN VISUALIZATION THROUGH VIDEO MOSAICS BASED ON 3-D MULTI-BASELINES
Jeachoon CHON, Takashi FUSE, Eihan SHIMIZU
Dept. of Civil Engineering, The University of Tokyo, JAPAN
jie7151@trip.t.u-tokyo.ac.jp, (fuse, shimizu)@civil.t.u-tokyo.ac.jp
KEY WORDS: Video Mosaics, 3-D space, Image sequence, Virtual realization, GIS
ABSTRACT:
When an image sequence is taken from a video camera mounted on a moving vehicle, general image mosaicing
techniques based on a single baseline cannot create image mosaics. To overcome this drawback, we propose a new image mosaicing
technique that creates an image mosaic in 3-D space from the image sequence using the concept of 3-D multi-baselines. The
key point of the proposed method is that each image frame has its own dependent baseline, calculated from the camera pose and the average
depth between the camera and 3-D objects. The proposed algorithm consists of three steps: feature point extraction and tracking,
calculation of camera exterior orientation, and determination of multi-baselines. This paper implements the proposed
algorithm and shows that it can create efficient image mosaics in 3-D space from a real image sequence.
1. INTRODUCTION
Image mosaicing, which builds an image covering a large area by
registering small 2-D images, is used in many different
applications, such as satellite imagery mosaics (USGS Hurricane
Mitch Program Projects), the creation of virtual reality
environments (Szeliski, R., 1996), medical image mosaics (Chou
et al., 1997), and video compression (Standard MPEG4). In the
GIS field in particular, video mosaics are becoming increasingly
common in civil engineering for representing urban environments
and for managing construction sites and road facilities.
Image mosaicing techniques fall into two fields. In the first
field, images and orthorectified satellite imagery, which is
obtained using the direct linear transform based on spatial
data such as a digital elevation model, are registered to spatial
vectors. In the second field, general perspective-projection
images are joined without spatial information. The techniques
of the second field make it possible to obtain spatial
information and extract textures from stereo image mosaics.
Our research pertains only to the second field. Its mosaicing
techniques can be divided into four main categories: 360-degree
panoramas based on cylindrical baseline projection (Shum et al.,
2000), spherical mosaics based on spherical baseline projection
(Coorg et al., 2000), general video mosaics based on single
baseline projection (Zhu et al., 2001), and x-slit images, which
create image mosaics without a baseline (Assaf et al., 2003).
To extract a single facade texture and spatial data from
panoramas or spherical mosaics, the data must be obtained by
combining data extracted from several panoramas and spherical
mosaics (Coorg et al., 1999).
Moreover, since the transition between image mosaics is discrete,
a walkthrough in virtual reality is not smooth. In contrast,
since the general video mosaic technique captures image data
over a wide range, it is very efficient for extracting facade
textures and spatial data. The video mosaic technique creates an
image by projecting the whole image sequence onto a single baseline
(see Fig. 1(a)), but it cannot be applied to an image sequence taken
from a translating and rotating camera (see Fig. 1(b)). When
perspective projection is used, the single baseline is generally
calculated from the average depth of the feature points
extracted and matched in the 1st and 2nd image frames.
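The average-depth computation behind the single baseline can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a pinhole camera that translates purely sideways between the 1st and 2nd frames, so that depth follows from the horizontal image motion of each matched feature. The function name `average_depth` and its parameters (`focal_px`, `translation_m`) are hypothetical.

```python
from statistics import mean


def average_depth(x_first, x_second, focal_px, translation_m):
    """Mean depth of matched features, assuming a pure sideways camera translation."""
    depths = []
    for x1, x2 in zip(x_first, x_second):
        disparity = abs(x1 - x2)      # horizontal image motion of the feature, in pixels
        if disparity > 0:             # skip features with no measurable parallax
            depths.append(focal_px * translation_m / disparity)
    if not depths:
        raise ValueError("no feature with non-zero disparity")
    return mean(depths)


# Hypothetical matched x-coordinates (pixels) in the 1st and 2nd frames.
depth = average_depth([100.0, 200.0], [60.0, 150.0], focal_px=800.0, translation_m=0.5)
print(f"average depth: {depth:.1f} m")
```

Under these assumptions, a single baseline placed at the returned depth is only representative while the camera keeps roughly the same heading, which is consistent with the limitation described above for a translating and rotating camera.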
Although the x-slit algorithm has the merit of creating image
mosaics from a translating and rotating camera, the image motion
per frame must be limited to 1 pixel to create high-resolution
image mosaics. Because the distance between buildings and a
moving camera in an urban area is very short, the image motion
is generally more than 50 pixels. It is therefore difficult to
apply the x-slit algorithm to urban areas.
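The scale of the image motion can be checked with a back-of-the-envelope calculation. The sketch below assumes a sideways-looking pinhole camera, for which the per-frame image motion is approximately the focal length (in pixels) times the camera travel per frame, divided by the depth of the facade. All numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def image_motion_px(focal_px, speed_m_s, frame_rate_hz, depth_m):
    """Approximate per-frame image motion for a sideways-looking pinhole camera."""
    travel_per_frame_m = speed_m_s / frame_rate_hz   # camera displacement between frames
    return focal_px * travel_per_frame_m / depth_m   # similar-triangles projection


# Hypothetical values: 800 px focal length, 10 m/s vehicle, 30 frames/s, facade 5 m away.
motion = image_motion_px(focal_px=800.0, speed_m_s=10.0, frame_rate_hz=30.0, depth_m=5.0)
print(f"image motion per frame: {motion:.1f} px")
```

With these assumed values the motion is about 53 pixels per frame, whereas keeping it near 1 pixel would require the facade to be several hundred metres away.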
(a) General video mosaics
(b) A moving camera at a turning point
Fig. 1. Concept of general video mosaics.
To overcome this drawback, this paper proposes a novel method
that creates video mosaics in 3-D space based on 3-D
multi-baselines. The core of the method
is that each image frame has a dependent baseline calculated by