Full text: Proceedings; XXI International Congress for Photogrammetry and Remote Sensing (Pt. B5-2)

AUTOMATED 3D RECONSTRUCTION OF URBAN AREAS FROM NETWORKS OF 
WIDE-BASELINE IMAGE SEQUENCES 
Helmut Mayer, Jan Bartelsen 
Institute of Geoinformation and Computer Vision, Bundeswehr University Munich - (Helmut.Mayer, 
Jan.Bartelsen)@unibw.de, www.unibw.de/ipk 
KEY WORDS: Computer Vision, Virtual Landscape, Close Range Photogrammetry, Visualization, Urban Planning 
ABSTRACT: 
The efficient automated reconstruction of highly detailed 3D models of urban areas for visualization and analysis is an active area of 
research for diverse applications ranging from surveillance to architecture. Wide-baseline image sequences generated with hand-held 
consumer cameras with several to tens of megapixels are a flexible and cheap data source. Image sequences are particularly suitable 
for the reconstruction of 3D structures along linear objects such as roads. This paper presents an approach for 3D reconstruction from 
image sequences taken with a weakly calibrated camera with no need for approximations for position and attitude, markers on the 
ground, or even ground control. The generated 3D reconstruction result is relative, i.e., the scale is not known, but Euclidean, that is, 
right angles are preserved. The paper shows that the approach can produce a 3D reconstruction consisting of points, camera 
positions and orientations, as well as vertically oriented planes from image sequences taken with a Micro Unmanned Aerial Vehicle 
(UAV) under challenging wind conditions and without navigation information. Finally, the paper discusses how sequences can be 
linked into networks, or also images into blocks, clarifying which image configurations exist and how they can be adequately treated 
when prior knowledge about them is available. 
1. INTRODUCTION 
A recent special issue of the International Journal of Computer 
Vision on “Modeling and Representations of Large-Scale 3D 
Scenes” (Zhu and Kanade, 2008) with a special focus on urban 
areas exemplifies the importance of the field with applications 
in “mapping, surveillance, transportation planning, archaeology, 
and architecture” (Zhu and Kanade, 2008). Of particular interest 
are (Pollefeys et al., 2008, Cornelis et al., 2008), which like us 
employ images as the primary data source, yet with a focus on 
video data taken from cars and using GPS and INS data. 
In contrast, our approach for 3D reconstruction aims at 
wide-baseline scenarios with basically no need for 
approximations for position and attitude or markers in the scene. 
While our previous work was on uncalibrated cameras (Mayer, 
2005), we now assume that the camera is weakly calibrated, 
meaning that principal distance and principal point as well as shear are 
known up to a couple of percent. Based on this assumption we 
can use the 5-point algorithm of (Nistér, 2004), which makes the 
reconstruction much more stable, particularly for (nearly) planar 
scenes. 
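
To make the role of the weak calibration explicit, the relation can be sketched as follows. The notation is an assumption for illustration and is not taken from the paper: c is the principal distance, (x_0, y_0) the principal point, s the shear, R and t the relative rotation and translation between two cameras, and x, x' homogeneous image coordinates of a corresponding point pair. Sign and axis conventions for K differ between photogrammetry and computer vision.

```latex
% Assumed notation, not from the paper:
%   c = principal distance, (x_0, y_0) = principal point, s = shear
K = \begin{pmatrix} c & s & x_0 \\ 0 & c & y_0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
\hat{x} = K^{-1} x, \quad \hat{x}' = K^{-1} x'

% Epipolar constraint in normalized coordinates with the essential matrix E
\hat{x}'^{\top} E \, \hat{x} = 0,
\qquad
E = [t]_{\times} R
```

With K approximately known, only the relative orientation remains to be estimated. E has five degrees of freedom (three for R, two for the direction of t, since scale is not observable), which is why five point correspondences are the minimal set.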
While no approximations for position and attitude are needed 
and also the images are allowed to be rotated against each other, 
the images of the sequence still have to fulfill certain constraints 
to obtain a useful result. First of all, all triplets of images in the 
sequence have to overlap significantly, to allow for the reliable 
propagation of 3D structure. Additionally, for reliable 
matching, the appearance of the visible objects should not 
change too severely from image to image and there should not 
be large areas with occlusions. 
We introduce our approach to 3D reconstruction from 
wide-baseline image sequences in Section 2. Besides camera 
orientations we reconstruct 3D points and from them planes 
which are a good means to describe dense 3D structure in urban 
areas, e.g., to determine visibility. 
This leads to the 3D reconstruction from image sequences 
taken from a Micro Unmanned Aerial Vehicle (UAV) presented 
in Section 3. In spite of the lack of information on strongly 
varying position and attitude of the camera we could still orient 
the images and produce a 3D model including textured planes. 
The experiences with the UAV led us to an analysis of different 
imaging configurations, consisting of sequences which can be 
linked at the ends or also in between, in both cases leading to 
networks, as well as more random configurations which can 
form image blocks. In Section 4 we show how the 
different configurations can be adequately treated. We 
end with conclusions. 
2. 3D RECONSTRUCTION FROM WIDE-BASELINE IMAGE SEQUENCES 
Our current approach for 3D reconstruction from wide-baseline 
image sequences extends (Mayer, 2005) to a (weakly) calibrated 
setup. It starts by extracting points (Förstner and Gülch, 1987). 
The eigenvectors of the points are employed to normalize the 
orientation of the image patches (Mayer, 2008), which are subsequently 
used for cross-correlation employing color information. If the 
correlation score is above a low threshold of 0.5, affine least 
squares matching (LSM) is used. Matches are checked a second 
time via the correlation score after matching, this time with a 
more conservative threshold of 0.8. 
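
The two-stage acceptance test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: patches are flat lists of gray values rather than orientation-normalized color patches, and the affine least squares matching step is represented by a hypothetical `refine` callback that returns the resampled candidate patch. Only the thresholds 0.5 and 0.8 are taken from the text.

```python
import math

COARSE_THRESHOLD = 0.5  # low threshold before LSM (from the text)
FINAL_THRESHOLD = 0.8   # conservative threshold after LSM (from the text)


def ncc(a, b):
    """Normalized cross-correlation of two equally sized, flattened patches."""
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # a constant patch carries no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def accept_match(ref, cand, refine=None):
    """Return True if a candidate passes both correlation checks.

    `refine` is a hypothetical stand-in for affine LSM: it takes the
    reference and candidate patches and returns the refined candidate.
    """
    if ncc(ref, cand) <= COARSE_THRESHOLD:
        return False  # rejected cheaply, no refinement attempted
    refined = refine(ref, cand) if refine is not None else cand
    return ncc(ref, refined) > FINAL_THRESHOLD
```

The low first threshold keeps candidates whose appearance is still distorted by the wide baseline, and the stricter second check is applied only after the refinement has had a chance to compensate for that distortion.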
From corresponding points in two or three images, essential 
matrices or calibrated trifocal tensors (Hartley and Zisserman, 
2003) are robustly computed using the five-point algorithm of 
(Nistér, 2004) in conjunction with Random Sample Consensus 
(RANSAC) (Fischler and Bolles, 1981). To obtain a more reliable 
solution we employ the robust geometric information criterion 
(GRIC) of (Torr, 1997). For three images, the five-point 
algorithm is employed twice with the same reference image and the 
same five points in the reference image. Because of the
	        