Full text: Proceedings, XXth congress (Part 4)

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B4. Istanbul 2004 
Figure 2. The manual texture extraction and mapping process involves (a) digital capturing of the building facade and removal of 
lens distortions, (b) selection of quadrilateral image sections, (c) rectification of perspective distortions, (d) retouching of 
occluded areas and (e) texture placement on building model. 
The presented texture extraction approach starts from the
point at which the exterior orientation of the photographs is
known. It describes an efficient way to get from the
correspondences of 3D object and image points to a completely
textured building model that is ready for visualisation. A
naive way to visualise this kind of data would be to define
the corresponding points in the image as texture coordinates
and to use, e.g., a VRML viewer for scene rendering. This is
not a viable approach, however, as the complex geometric image
transformations needed to account for perspective effects or
lens distortions cannot be realised in this way.
By using the functionality of 3D graphics hardware, the
whole process can be realised very efficiently, so that
interactive tools become possible. The user, for example,
defines the input photographs and extraction parameters and
observes the resulting textured building model in real time.
Any self-occlusions of the model are detected automatically,
so that several images can be fused to obtain the final
texture images. The camera distortions are removed in the
hardware, transparently to the operator, so that no extra care
needs to be taken.
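The automatic detection of self-occlusions can be illustrated by a
depth comparison in the manner of a depth buffer test. The following
is a minimal sketch in Python and not the implementation of the
presented system. It assumes a pinhole camera at the origin looking
along +z; the function names, the grid cell size and the tolerance are
hypothetical and chosen for illustration only.

```python
def project(point, focal=1.0):
    """Project a 3D point in camera coordinates (z > 0) to (u, v, depth)."""
    x, y, z = point
    return (focal * x / z, focal * y / z, z)


def cell_of(u, v, cell):
    """Map image coordinates to a discrete depth-map cell."""
    return (round(u / cell), round(v / cell))


def build_depth_map(points, cell=0.1):
    """Keep the smallest depth seen in each cell (a coarse depth buffer)."""
    depth = {}
    for p in points:
        u, v, z = project(p)
        key = cell_of(u, v, cell)
        if key not in depth or z < depth[key]:
            depth[key] = z
    return depth


def is_visible(point, depth, cell=0.1, eps=1e-6):
    """A point is visible if nothing closer was recorded in its cell."""
    u, v, z = project(point)
    nearest = depth.get(cell_of(u, v, cell), float("inf"))
    return z <= nearest + eps
```

A surface point that fails this test in one photograph is occluded
there, so its texture would have to be taken from another image in
which the test succeeds.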
Because 3D APIs and SDKs nowadays provide powerful and
functionally rich interfaces, such a texture extraction system
can be realised with very little programming effort.
The texture extraction approach described in this article
utilises new technologies that can be found in today's
commodity 3D graphics hardware. Three developments in
particular are of importance and are briefly discussed in
the following sections.
2.1 Graphics Processing Units
The extraction of façade textures from digital images mainly
involves the transformation of vertices and the processing of
pixel data, computations that can be highly parallelised for
increased performance. Because the main CPU does not
exploit this parallelism very effectively, a software solution
is generally not an adequate approach. Graphics processing
units (GPUs), which are integrated in today's commodity PC
graphics cards, are, however, optimised for this kind of data
processing. As graphics processors have evolved from a
fixed-function to a programmable pipeline design, they can now
be utilised for a wide range of applications.
2.2 Shader Languages 
Shaders are small programs that are executed on the 3D
graphics card. They can be thought of as functions that are
called within the GPU at specific points during the
generation of the image. Two types of shaders exist: vertex
shaders replace the transformation module in the geometry
stage, and pixel shaders replace the processing of individual
pixels in the rasteriser stage of the graphics rendering
pipeline.
Nowadays, shaders can be developed using the High-Level
Shader Language (HLSL, developed by Microsoft) (Gray,
2003) or C for graphics (Cg, developed by NVIDIA)
(Fernando and Kilgard, 2003). Both are based on the
programming language C and aim to offer the flexibility and
performance of an assembly language together with the
expressiveness and ease of use of a high-level language.
In the presented approach, pixel shaders are used to control
(projective) texture lookups and the depth buffer algorithm,
and to realise the on-the-fly removal of lens distortions for
calibrated cameras.
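A per-pixel lookup of this kind might be sketched as follows. This is
an illustration only, written in Python rather than in a shader
language. It assumes a pinhole camera given by a rotation matrix, a
projection centre and a focal length, together with a radial
distortion model that has a single coefficient. The distortion model,
the coefficient `k1` and all function names are assumptions and are
not taken from the presented system.

```python
def to_camera(point, rotation, centre):
    """Transform a world point into camera coordinates: R * (X - C)."""
    d = [point[i] - centre[i] for i in range(3)]
    return [sum(rotation[r][c] * d[c] for c in range(3)) for r in range(3)]


def project(point_cam, focal):
    """Perspective division giving ideal (distortion-free) image coordinates."""
    x, y, z = point_cam
    if z <= 0:
        return None  # behind the camera, no valid lookup
    return (focal * x / z, focal * y / z)


def distort(xy, k1):
    """Assumed radial model: shift ideal coordinates to where the lens imaged them."""
    x, y = xy
    r2 = x * x + y * y
    s = 1.0 + k1 * r2
    return (x * s, y * s)


def lookup_coords(point, rotation, centre, focal, k1):
    """Position in the original photograph at which to sample for a surface point."""
    ideal = project(to_camera(point, rotation, centre), focal)
    return None if ideal is None else distort(ideal, k1)
```

Under these assumptions, sampling the photograph at the returned
position for every texel yields a distortion-free texture without
resampling the photograph beforehand. The division by z also makes the
lookup projective: the image of the midpoint of a 3D edge is in general
not the midpoint of the images of its end points.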
2.3 Floating-Point Texture Format 
Textures in floating-point format can hold a 32-bit
floating-point value per colour channel. If used in
combination with a pixel shader, a texture need not
necessarily hold colour values; rather, all kinds of per-pixel
floating-point data can be stored in it. The pixel shader
knows how to interpret the data in order to compute the output
colour and depth values.
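The benefit of a floating-point format for data other than colour can
be illustrated with a small comparison. The sketch below uses Python's
`struct` module to emulate storing a depth value in an 8-bit channel
and in a 32-bit floating-point channel. The value range of 0 to 100
and the helper names are assumptions made for illustration.

```python
import struct


def store_8bit(value, lo, hi):
    """Quantise value to one of 256 levels in [lo, hi] and read it back."""
    level = round((value - lo) / (hi - lo) * 255)
    level = max(0, min(255, level))
    return lo + level / 255 * (hi - lo)


def store_float32(value):
    """Round-trip value through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


depth = 37.123
err8 = abs(store_8bit(depth, 0.0, 100.0) - depth)   # quantisation error, 8 bit
err32 = abs(store_float32(depth) - depth)           # rounding error, float32
```

In this sketch the 8-bit error is bounded by half a quantisation step
of the chosen range, whereas the 32-bit error is several orders of
magnitude smaller, which is what makes such textures usable for
per-pixel depth data.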
Through the use of standard 3D APIs like OpenGL or
Direct3D, the extraction algorithm is just a matter of setting
up the rendering pipeline and providing the vector
information of the building geometry. The complexity of the
core algorithm amounts to only a few lines of code.
It is assumed that the building geometry is already available 
as a 3D geo-referenced, polygonal surface model. The input 
photographs were taken with a calibrated camera and their 
exterior orientations are known. Hence, the transformations 
that project the faces of the building model into the images 
