
In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009 
AUTOMATED SELECTION OF TERRESTRIAL IMAGES FROM SEQUENCES 
FOR THE TEXTURE MAPPING OF 3D CITY MODELS 
Sébastien Bénitez and Caroline Baillard 
SIRADEL, 3 allée Adolphe Bobierre CS 24343, 35043 Rennes, France 
sbenitez@siradel.com
KEY WORDS: Building, Texture, Image, Sequences, Terrestrial, Automation. 
ABSTRACT: 
The final purpose of this study is to texture map existing 3D building models using calibrated images acquired with a terrestrial 
vehicle. This paper focuses on the preliminary step of automated selection of texture images from a sequence. Although not 
complex in itself, this step is particularly important for large-scale facade mapping, where thousands of images might be 
available. Three methods inspired by well-known computer graphics techniques are compared: one is 2D-based and relies on the 
analysis of a 2D map; the other two methods use the information provided by a 3D vector database describing buildings. The 2D 
approach is satisfactory in most cases, but facades located behind low buildings cannot be textured. The 3D approaches provide 
more exhaustive wall textures. In particular, a wall-by-wall analysis based on 3D ray tracing is a good compromise to achieve a 
relevant selection whilst limiting computation. 
1. INTRODUCTION 
With the development of faster computers and more accurate 
sensors (cameras and lasers), the automatic and large-scale 
production of a virtual 3D world very close to ground truth has 
become realistic. Several research laboratories around the world 
have been working on this issue for some years. Früh and 
Zakhor have proposed a method for automatically producing 
3D city models using a land-based mobile mapping system 
equipped with lasers and cameras; the laser points are registered 
with an existing Digital Elevation Model or vector map, then 
merged with aerial LIDAR data (Früh and Zakhor, 2003; Früh 
and Zakhor, 2004). At the French National Geographical 
Institute (IGN), the mobile mapping system Stereopolis has 
been designed for capturing various kinds of information in 
urban areas, including laser points and texture images of 
building facades (Bentrah et al., 2004). The CAOR laboratory 
from ENSMP has also been working on a mobile system named 
LARA-3D for the acquisition of 3D models in urban areas 
(Brun et al., 2007; Goulette et al., 2007), based on laser point 
clouds, a fish-eye camera, and possibly an external Digital 
Elevation Model. Recently, a number of private companies 
have commercialized their own mobile mapping systems for 3D 
city modeling, such as StreetMapper or TITAN 
(Hunter, 2009; Mrstik et al., 2009). The purpose of such 
systems is often both the 3D modeling and the texture 
mapping of the resulting 3D models. 
In this study we are interested in texturing existing 3D building 
models by mapping terrestrial images onto the provided façade 
planes. As part of the mapping strategy, one first needs to 
determine in which images each façade is visible. This is 
particularly important for large-scale facade texture mapping, 
where thousands of images can be available and every single 
image can be relevant for the final texturing stage. There are 
few references on this issue. In (Pénard et al., 2005) a 2D map 
is used to extract the main building facades and the 
corresponding images. All the images viewing at least a part of 
a façade are selected. In (Haala, 2004), a panoramic camera is 
used and a single image is sufficient to provide texture for 
many façades. Given a façade, the best view is the one 
providing the highest resolution. It is selected by analyzing the 
orientations and distances of the building facades in relation to 
the camera stations. In (Allène, 2008), a façade is represented 
by a mesh. Each face of the mesh is associated with one input 
view by minimizing an energy function combining the total 
number of texels representing the mesh in the images and the 
color continuity between two neighbouring faces. 
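As an illustration of the distance and orientation analysis mentioned above for (Haala, 2004), the sketch below ranks candidate views of a façade with a simple resolution proxy (cosine of the incidence angle divided by viewing distance). The names and the formula are illustrative assumptions, not the exact criterion of the cited work.

# Hedged sketch: ranking candidate views of a facade with a simple resolution proxy.
# Illustrative assumption, not the exact criterion of (Haala, 2004).
import numpy as np

def view_score(camera_center, facade_center, facade_normal):
    """Higher score = facade seen more frontally and from closer, i.e. at higher resolution."""
    to_camera = camera_center - facade_center
    distance = np.linalg.norm(to_camera)
    cos_incidence = np.dot(to_camera / distance, facade_normal)  # facade_normal assumed unit length
    if cos_incidence <= 0.0:          # facade faces away from this camera
        return 0.0
    return cos_incidence / distance

def best_view(camera_centers, facade_center, facade_normal):
    """Select the camera station giving the highest resolution proxy for this facade."""
    return max(camera_centers, key=lambda c: view_score(c, facade_center, facade_normal))

Such a score singles out one best view per façade; the methods compared in this paper instead retain every geometrically relevant view.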
In our study, only two triangles per facade are available, and a 
façade texture generally consists of a mixture of 4 to 12 input 
views. The following mapping strategy has been chosen for 
texturing a given façade: 
1. Pre-selecting a set of relevant input images, from which the façade can be seen; 
2. Merging these images into a single texture image; 
3. Registering the texture image with the existing façade 3D model. 
This paper focuses only on the first stage. The purpose of this 
operation is to select a set of potentially useful images based on 
purely geometrical criteria. The generation of a seamless 
texture image without occlusion artifacts will be handled within 
the second stage. Three possible approaches for the image pre-selection 
are presented and discussed. The first approach is 
similar to the one used in (Pénard et al., 2005) and relies on the 
analysis of a 2D map. The two other methods use the 
information provided by a 3D vector database describing 
buildings. All methods are based on standard techniques 
commonly used in computer graphics for visibility 
computations, namely the ray-tracing and z-buffering 
techniques (Strasser, 1974). These two techniques have now 
been used for decades and are very well known in the computer 
graphics community. They can easily be optimized and 
accelerated via a hardware implementation. 
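To make the ray-tracing criterion concrete, the following sketch (illustrative only, not the implementation used in this study) tests whether a façade point is visible from a camera centre by checking the connecting ray against potential occluder triangles, using the standard Möller-Trumbore intersection test.

# Hedged sketch of a ray-traced visibility test: a facade point is visible from a
# camera centre if the connecting ray is not blocked by any other building triangle.
import numpy as np

def ray_triangle_intersection(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore: return the ray parameter t of the hit, or None if the ray misses."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:                      # ray parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det
    return t if t > eps else None

def facade_point_visible(camera_centre, facade_point, occluder_triangles):
    """True if no occluder triangle lies between the camera centre and the facade point."""
    direction = facade_point - camera_centre
    distance = np.linalg.norm(direction)
    direction = direction / distance
    for v0, v1, v2 in occluder_triangles:
        t = ray_triangle_intersection(camera_centre, direction, v0, v1, v2)
        if t is not None and t < distance - 1e-6:
            return False
    return True

In practice, a wall can be sampled at several such points and an image retained as soon as a sufficient fraction of the samples is visible; z-buffering achieves the same visibility decision by rasterizing all triangles into a per-camera depth buffer instead of casting individual rays.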
This paper is organized as follows. Section 2 presents the test 
data set used for this study. Sections 3, 4 and 5 detail the three 
selection methods. The results and perspectives are discussed in 
section 6. 
2. TEST DATASET 
The test area is a part of the historical center of the city of 
Rennes in France. It covers 1 km² and corresponds to the 
densest part of the city. Existing 3D building models were 
provided with an absolute accuracy of around 1 m. The area contains 1475 
buildings consisting of 11408 walls. The texture image database 
associated with the area was simulated via a virtual path created 
through the streets. A point was created every 5 meters along 
this path. Each point is associated with two cameras facing the left 
and the right sides of the path. The camera centers are located 
at 2.3 meters above the ground in order to simulate a vehicle 
height. The internal and external parameters of the cameras are 
approximately known. The path is about 4.9 kilometers long,
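The following sketch shows one way the simulated acquisition described above could be generated: a station every 5 m along a 2D street polyline, each carrying two cameras located 2.3 m above the ground and facing the left and right sides of the path. The function name and the returned layout are illustrative assumptions, not the actual simulation code.

# Hedged sketch of the virtual acquisition: stations every 5 m along a 2D polyline,
# two horizontal viewing directions per station, camera centres at 2.3 m height.
import numpy as np

def camera_stations(path_xy, step=5.0, height=2.3):
    """Return a list of (position, left_view_direction, right_view_direction) tuples."""
    stations = []
    carry = 0.0                                   # distance left over from the previous segment
    for a, b in zip(path_xy[:-1], path_xy[1:]):
        a, b = np.asarray(a, float), np.asarray(b, float)
        seg = b - a
        length = np.linalg.norm(seg)
        if length == 0.0:
            continue
        tangent = seg / length
        left = np.array([-tangent[1], tangent[0]])        # 90 deg counter-clockwise of the tangent
        d = carry
        while d < length:
            xy = a + d * tangent
            position = np.array([xy[0], xy[1], height])
            stations.append((position,
                             np.append(left, 0.0),        # camera facing the left side of the path
                             np.append(-left, 0.0)))      # camera facing the right side of the path
            d += step
        carry = d - length
    return stations

# Example: a straight 4.9 km path yields roughly 980 stations, i.e. about 1960 images.
stations = camera_stations([(0.0, 0.0), (4900.0, 0.0)])

Each station would then be combined with the approximately known internal and external parameters to build the camera models used by the selection methods.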