ISPRS Commission III, Vol. 34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
RECONSTRUCTING 3D BUILDING WIREFRAMES FROM MULTIPLE IMAGES
Ahmed F. Elaksher, James S. Bethel, and Edward M. Mikhail
School of Civil Engineering, Purdue University, 1284 Civil Engineering Building, West Lafayette, IN 47906, USA
elaksher@ecn.purdue.edu bethel@ecn.purdue.edu mikhail@ecn.purdue.edu
ABSTRACT
Building extraction in urban areas is one of the difficult problems in image understanding and photogrammetry. Building
delineations are needed in cartographic analysis, urban area planning, and visualization. Although one pair of images is adequate to
find the 3D position of two visibly corresponding image features, it is not sufficient to extract the entire building due to hidden
features that are not projected into the image pair. This paper presents a new technique to detect and delineate buildings with
complex rooftops by extracting roof polygons and matching them using multiple images.
The algorithm discussed in this paper starts by segmenting the images into regions. Regions are then classified into roof regions and
non-roof regions using a two-layered Neural Network. A rule-based system is then used to convert the roof boundaries to polygons.
Polygon correspondence is established geometrically: all possible sets of corresponding polygons are considered and the optimal set is
selected. Polygon vertices are then refined using the known geometric properties of urban buildings to generate the building wire-frames.
The algorithm is tested on a number of buildings and the results are evaluated. The RMS error for the extracted building
vertices is 0.25 m using 1:4000 scale aerial photographs. The results show the completeness and accuracy that this method can
provide for extracting complex urban buildings.
1. INTRODUCTION
Recent research in the area of building extraction covers
building extraction from aerial images, digital elevation
models (DEM), thematic maps, and terrestrial images. Aerial
images and digital elevation models are the primary data sets
used in most building extraction systems. Some systems use
only aerial images; some use DEM only; others use both data
sets. In (Suveg and Vosselman, 2000) thematic maps and
GIS databases are used to help resolve ambiguity in the
extracted buildings or in generating building cues that can be
refined by DEM, aerial images, or both. In (Chein and Hsu,
2000) one pair of images is used to extract buildings. This is
insufficient since parts of the buildings can be either
obscured by other features or not projected into this specific
pair. Although in (Kim and Nevatia, 1999) more than one
pair of images is used to extract the buildings, the matching
was carried out in a pairwise fashion.
In (Brunn and Weidner, 1997) the DEM is used solely to
extract the building models. Researchers using only DEM in
building extraction start by segmenting the DEM. This
process is problematic with both image-derived and LIDAR-
derived DEM. Outliers often disturb the extracted regions.
Some researchers compute the slope of the model surface in
both directions to provide more information that can assist
the building extraction. However, slopes are not always
accurate due to outliers in the digital elevation models. In
(Wang, 2000) a building extraction system from LIDAR data
is presented. Its results suffer from some of the same
problems that occur with image-derived DEM.
In (Zhao and Trinder, 2000) and (Seresht and Azizi, 2000)
aerial images and DEM are used for the building extraction
process. They started with the DEM to provide building
regions or building cues and then they used the images to
refine the extracted building regions.
In (Fischer et al., 1999) and (Förstner, 1999) the building
extraction problem was solved using a semi-automated
approach. The user has to define the building model and find
the building elements in one image by a number of mouse
clicks. Then the algorithm finds the corresponding features in
other images, and matches them to build the 3D wire-frame
of the building. This approach supports the extraction of
more complex buildings; however, it requires the user to
spend a great amount of time interacting with the system.
In this article a new technique to extract urban area building
wire-frames using more than one pair of images is presented.
The input to the algorithm is a number of images of the
building. The minimum number required is two; however,
this number is generally not enough for a complete
extraction. In this research four aerial images per building are
used in the extraction process.
We start by segmenting the images using a split and merge
image segmentation technique. The extracted regions are
then classified into roof regions and non-roof regions using a
two-layered Neural Network. Two attributes are used in the
classification process. The first attribute measures the
linearity of the building borders. The second attribute
measures the average elevation of the region and is derived
from a digital elevation model. The borderlines for the
building regions are then extracted from the border pixels
using a modified version of the Hough transformation. A
rule-based system is then employed to convert the extracted
lines to polygons. The algorithm can extract either triangular or
quadrilateral roof facets. Correspondence between roof
polygons is established using the geometrical properties of
the polygons. A least squares estimation model is
implemented to find corresponding polygons. Geometric
constraints between vertices in one polygon, symmetric
planes, and horizontal planes are utilized in the least squares
model to refine the extracted building vertices.
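The border-line extraction step outlined above can be illustrated with a minimal sketch. It uses the standard (rho, theta) Hough transform, not the modified version employed in this work, whose details are not given in this section. The function name hough_lines and its parameters (n_theta, rho_step, top_k) are assumptions introduced for illustration only.

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_step=1.0, top_k=1):
    """Standard Hough transform on border pixels (hypothetical helper).

    points: iterable of (x, y) border-pixel coordinates.
    Returns the top_k accumulator cells as (rho, theta_deg, votes),
    where each line satisfies x*cos(theta) + y*sin(theta) = rho.
    """
    pts = np.asarray(points, dtype=float)
    theta_deg = np.arange(n_theta) * (180.0 / n_theta)
    theta = np.deg2rad(theta_deg)

    # rho for every (point, angle) pair; shape (n_points, n_theta)
    rho = pts[:, [0]] * np.cos(theta) + pts[:, [1]] * np.sin(theta)

    # Symmetric rho range, rounded up to a whole number of bins
    rho_max = np.ceil(np.abs(rho).max() / rho_step) * rho_step
    n_rho = int(round(2 * rho_max / rho_step)) + 1
    rho_idx = np.round((rho + rho_max) / rho_step).astype(int)

    # Each border pixel casts one vote per angle
    acc = np.zeros((n_rho, n_theta), dtype=int)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    np.add.at(acc, (rho_idx.ravel(), theta_idx.ravel()), 1)

    # Report the strongest accumulator cells
    strongest = np.argsort(acc, axis=None)[::-1][:top_k]
    lines = []
    for flat in strongest:
        r_i, t_i = np.unravel_index(flat, acc.shape)
        lines.append((r_i * rho_step - rho_max,
                      float(theta_deg[t_i]),
                      int(acc[r_i, t_i])))
    return lines

# Usage: border pixels lying on the horizontal line y = 5
border = [(x, 5) for x in range(100)]
print(hough_lines(border))
```

For the synthetic border above, all 100 pixels vote for the same cell, so the strongest line is found at rho = 5 and theta = 90 degrees. On real region borders several peaks would have to be selected and neighbouring cells merged before the lines are passed to a polygon-forming stage; that peak handling is left out of this sketch.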
The algorithm has been tested on a large sample of buildings
selected quasi-randomly from the Purdue University campus.
Four images are used for each building and the automatically
extracted wire-frames for the extracted buildings are