EXPLOITING PHOTOGRAMMETRIC METHODS FOR BUILDING EXTRACTION IN AERIAL IMAGES
Jefferey A. Shufelt
Digital Mapping Laboratory
School of Computer Science, Carnegie Mellon University
5000 Forbes Avenue, Pittsburgh, PA 15213-3891 USA
Email: js@maps.cs.cmu.edu
Commission III, Working Group 2
KEY WORDS: Vision, photogrammetry, recognition, extraction, modeling, image understanding, geometric constraints,
automated building detection
ABSTRACT
Traditional computer vision techniques for automated building extraction have neglected the use of photogrammetric camera
modeling as a source of geometric information. By incorporating knowledge about the image acquisition geometry at every
phase of a building detection process, robust performance can be achieved on a wide variety of scenes. This paper describes
the role of rigorous photogrammetric camera modeling in PIVOT, a fully automated building extraction system that uses only
a single view to generate three-dimensional structure hypotheses. We present both qualitative and quantitative results on a
varied set of complex aerial imagery.
KURZFASSUNG
Traditionelle Techniken aus dem Computer-Vision Bereich zur automatischen Gebäudeextraktion haben die Verwendung
photogrammetrischer Kameramodelle als geometrische Information vernachlässigt. Durch die Einbeziehung von Wissen über
die Geometrie der Bildaufnahme auf jeder Stufe der Gebäudeerkennung können robuste Ergebnisse für eine Reihe von
Szenen gewonnen werden. Dieser Beitrag beschreibt die Rolle der Kameramodellierung in PIVOT, einem vollautomatischen
Gebäudeerkennungssystem, das Einzelbilder zur Ableitung dreidimensionaler Strukturhypothesen verwendet. Wir präsentieren
sowohl qualitative als auch quantitative Ergebnisse für eine Reihe verschiedener, komplexer Luftbilder.
1 INTRODUCTION
Building extraction from aerial images has been a topic of great interest in the computer vision community for several years. The compilation of detailed digital cartographic databases over suburban and urban areas requires accurate modeling of manmade structures, a task currently accomplished by tedious and error-prone manual techniques. Systems capable of partially or fully automating the building extraction process would permit more efficient generation of accurate building models. From a research standpoint, building extraction also presents a challenging test for computer vision techniques. A system which achieves robust performance on aerial imagery must be able to address a wide variety of viewing angles and object shapes, correctly interpret object and shadow occlusions, and distinguish natural and manmade features.
Traditionally, computer vision techniques for building extraction have neglected the use of photogrammetric camera modeling, instead treating the image as the sole source of information. This restrictive view of the problem mandates the use of constraints on the image and the scene, to make existing vision algorithms tractable. Both region-based and feature-based techniques make strict assumptions about image geometry and scene content, and consequently exhibit poor performance on imagery where buildings are not easily segmented by intensity criteria alone, or where complex shapes are prevalent and oblique viewing angles violate assumptions about image acquisition geometry.
The central idea behind the research described in this paper is that rigorous photogrammetric camera modeling not only allows generation of building hypotheses in object space, a necessity for realistic cartographic applications [McKeown and McGlone, 1993], but also serves as a valuable source of geometric constraints for building extraction. A particularly attractive feature of these constraints is that they do not limit the scope of a building extraction system, since the constraints are intrinsic to the imaging acquisition process. Recent preliminary work illustrated the effectiveness of the combination of photogrammetric modeling with computer vision techniques [McGlone and Shufelt, 1994].
In this paper, the effects of photogrammetric modeling are discussed in the context of PIVOT (Perspective Interpretation of Vanishing points for Objects in Three dimensions), a fully automated monocular building extraction system under development at the Digital Mapping Laboratory. PIVOT employs a canonical data-driven approach to building detection, constructing intermediate features from raw edge data, and generating building hypotheses from those intermediate features. A major distinction between PIVOT and the systems preceding it is the thorough integration of photogrammetric modeling in all phases of the building extraction process.
2 VANISHING POINTS AND BUILDING PRIMITIVES
Under a central projection camera model, a set of parallel lines in a scene projects to a set of lines in the image which converge on a single point, known as a vanishing point. Because each vanishing point corresponds to a unique orientation in 3-space, detecting these points leads to a powerful approach for inferring 3D structure from 2D images. The classical technique for detecting vanishing points [Barnard, 1983] utilizes a Gaussian sphere, a unit sphere with origin at the perspective center. The endpoints of each line segment in the image form planes with the perspective center, known as interpretation planes. Using the sphere as an accumula-
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B6. Vienna 1996
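The geometry of interpretation planes described in Section 2 can be illustrated with a minimal sketch. It is not PIVOT's implementation and it does not build the accumulator on the Gaussian sphere; instead it intersects the interpretation planes of just two image segments, which is the simplest case of the same idea. The sketch assumes an idealized camera with the perspective center at the origin, the image plane at z = f, and image coordinates measured from the principal point. All function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a 3-vector to unit length (fails on the zero vector)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def interpretation_plane_normal(p, q, f):
    """Unit normal of the plane through the perspective center (origin)
    and the image segment with endpoints p and q.
    Assumption: image plane at z = f, so endpoint (x, y) is the ray (x, y, f)."""
    return normalize(cross((p[0], p[1], f), (q[0], q[1], f)))

def vanishing_direction(seg_a, seg_b, f):
    """Unit 3D direction shared by two interpretation planes.
    A scene direction lies in every interpretation plane of its parallel
    lines, so it is perpendicular to both plane normals. The sign of the
    result is arbitrary, and identical planes give a zero cross product."""
    na = interpretation_plane_normal(seg_a[0], seg_a[1], f)
    nb = interpretation_plane_normal(seg_b[0], seg_b[1], f)
    return normalize(cross(na, nb))

def project(point, f):
    """Central projection of a 3D point (or direction) onto the plane z = f."""
    x, y, z = point
    return (f * x / z, f * y / z)

# Synthetic check: two scene lines that are both parallel to (1, 2, 2).
f = 1.0
seg_a = (project((0.0, 0.0, 10.0), f), project((3.0, 6.0, 16.0), f))
seg_b = (project((5.0, -1.0, 12.0), f), project((6.0, 1.0, 14.0), f))

d = vanishing_direction(seg_a, seg_b, f)
vp = project(d, f)  # image location of the vanishing point
print("direction on the Gaussian sphere:", d)
print("vanishing point in the image:", vp)
```

In this synthetic case the recovered direction is parallel to (1, 2, 2) and the vanishing point falls near (0.5, 1.0) in the image. With many noisy segments, a single pairwise intersection is unreliable, which is one motivation for accumulating evidence over the whole sphere as in [Barnard, 1983].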