Full text: Papers accepted on the basis of peer-review full manuscripts (Part A)

ISPRS Commission III, Vol. 34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
  
3D MODELING AND REGISTRATION UNDER WIDE BASELINE CONDITIONS 
L. Van Gool¹,², T. Tuytelaars¹, V. Ferrari², C. Strecha¹, J. Vanden Wyngaerd¹, and M. Vergauwen¹
¹ ESAT/PSI/Visics, KULeuven, Belgium
² D-ITET/BIWI, ETH Zurich, Switzerland
KEY WORDS: wide baseline, 3D reconstruction, 3D registration, invariant neighbourhoods 
ABSTRACT 
During the 1990s, important progress was made in the area of structure-from-motion. From a series of closely spaced
images a 3D model of the observed scene can now be reconstructed, without knowledge about the subsequent camera 
positions or settings. From nothing but a video, the camera trajectory and scene shape are extracted. Progress has also 
been important in the area of structured light techniques. Rather than having to use slow and/or bulky laser scanners, 
compact one-shot systems have been developed. Upon projection of a pattern onto the scene, its 3D shape and texture can 
be extracted from a single image. This paper presents recent extensions of both strands that share a common theme: how
to cope with large baseline conditions. In the case of shape-from-video we discuss ways to find correspondences and, 
hence, extract 3D shapes even when the images are taken far apart. In the case of structured light, the problem solved is 
how to combine partial 3D patches into complete models, without a good initialisation of their relative poses. 
1 INTRODUCTION 
During the last few years, low-cost and user-friendly so- 
lutions for 3D modeling have become available. Shape- 
from-video (Armstrong 1994, Heyden 1997, Pollefeys 
1998, Hartley 2000) extracts 3D shapes and their textures 
from video sequences as the only input. One-shot struc- 
tured light techniques (Vuylsteke 1990, Proesmans 1996, 
Chia 1996, Eyetronics www) get such information from a 
single image, but need the projection of a special pattern. 
These techniques have the advantage that they are cheaper 
than traditional solutions like dedicated multi-camera rigs 
or laser scanners, as they only require off-the-shelf hard- 
ware. Moreover, they offer more flexibility in terms of 
portability and the range of object sizes they can handle. 
This paper presents ongoing work on two different, but 
strongly related extensions of such systems. 
Wide-baseline image matching: Shape-from-video 
requires large overlap between subsequent frames. 
Often, one would like to reconstruct from a small 
number of stills, taken from very different view- 
points. Based on local, viewpoint invariant features, 
wide-baseline matching is made possible, and hence 
the viewpoints can be farther apart. 
Crude registration of 3D patches: Automatic registra- 
tion algorithms for 3D patches such as ICP require 
good initial, relative positions and orientations of the 
patches to work. Completely automatic solutions to 
the 3D puzzle of putting together a set of unstructured 
3D patches require that a first, crude registration
also takes place automatically. 
2 WIDE-BASELINE IMAGE MATCHING 
2.1 Task description 
The 1990s witnessed the appearance of self-calibration
techniques in structure-from-motion. A series of images is 
the only input such systems need to determine the camera 
motion and the evolution of the camera settings, as well 
as the 3D shape (up to an unknown scale) of the scene. By 
now, several approaches for such self-calibration have been 
developed and several systems have been proposed (Arm- 
strong 1994, Heyden 1997, Hartley 2000, Pollefeys 1998). 
They start with the tracking of interest points through a 
sequence of views. The consistency of their image pro- 
jections with a rigid 3D structure imposes constraints that 
allow the cameras and the 3D shape of the cloud of interest
points to be recovered. The matching of these initial interest
points will be referred to as sparse correspondence search. 
After the matching of the interest points, and the self- 
calibration, strong multi-view constraints between the im- 
ages are available. These ease the search for many more 
correspondences. For one thing, a further search can be re- 
stricted to epipolar lines. In our approach (Pollefeys 1998), 
we go after pixelwise matches. This stage is referred to as 
dense correspondence search. These additional matches 
result in a detailed reconstruction of the 3D shape. 
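
To illustrate how a search can be restricted to epipolar lines, the following minimal Python sketch assumes that a fundamental matrix F relating the two views is already known. For a pixel x in the first image it computes the epipolar line l' = F x in the second image and keeps only the candidate pixels lying close to that line. The function names, the tolerance `max_dist`, and the example matrix (a rectified pair with pure horizontal camera translation) are assumptions chosen for this illustration; they are not taken from the system of (Pollefeys 1998).

```python
import math


def epipolar_line(F, x):
    """Return the epipolar line (a, b, c) = F * (u, v, 1) in the second image.

    F is a 3x3 fundamental matrix given as nested lists, x = (u, v) is a
    pixel in the first image. Points (u', v') on the line satisfy
    a*u' + b*v' + c = 0.
    """
    p = (x[0], x[1], 1.0)
    return tuple(sum(F[r][c] * p[c] for c in range(3)) for r in range(3))


def point_line_distance(line, q):
    """Perpendicular distance (in pixels) from pixel q = (u, v) to the line."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        # Degenerate line, e.g. when x coincides with the epipole.
        raise ValueError("epipolar line is undefined for this pixel")
    return abs(a * q[0] + b * q[1] + c) / norm


def filter_candidates(F, x, candidates, max_dist=1.5):
    """Keep only candidates within max_dist pixels of the epipolar line of x."""
    line = epipolar_line(F, x)
    return [q for q in candidates if point_line_distance(line, q) <= max_dist]


# Hypothetical example: for a rectified pair with horizontal translation,
# the epipolar line of (u, v) is the image row v' = v.
F_rectified = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]

kept = filter_candidates(F_rectified, (10.0, 20.0),
                         [(50.0, 20.0), (50.0, 25.0), (5.0, 20.5)],
                         max_dist=1.0)
```

In this example only the candidates on (or within one pixel of) row 20 survive, so the two-dimensional search over the whole second image reduces to a one-dimensional search along a line.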
Although 3D reconstructions can in principle be made 
from a limited number of stills, these systems tend to only 
work effectively if the images have much overlap and are 
offered in the order of a continuous camera motion. This is
underlined by the name 'shape-from-video'. For instance, 
we have tested our system (Pollefeys 1998) to make 3D 
records of archaeological, stratigraphic layers during ex- 
cavations. A large part of the scene consists of sand and 
there is a general lack of points of interest. When walk- 
ing around the dig, it proved necessary to take images less 
than 5° apart. In such applications, this is not always
possible due to obstacles, and it disturbs the normal progress
 
	        