. Istanbul 2004
ng Artificial
on, Vol.
Signal
EE
itelligence,
tion of
Interactive
etry and
:m for
natic
1 Space
g Support
ion. In: IEEE
9 Rico.
etection
shed PhD
Learning
Institute of
8. A General
ational
/ascona/datase
ning, Machine
S:
XVI(95), pp.
y Learning.
emote Sensing,
gnition from
ional Archives
SA,
nt Systems,
asets/fthood/,
ing Theory.
uilding
ional Archives
lam, The
AUTOMATIC BUILDING RECONSTRUCTION FROM AERIAL IMAGES: A GENERIC BAYESIAN
FRAMEWORK
Franck Taillandier ^, Rachid Deriche "
* Institut Géographique National/MATIS, 2-4 avenue Pasteur, 94165 Saint-Mandé - franck.taillandier@ign.fr
^ INRIA/ODYSSEE. 2004 Route des Lucioles - BP 93 - 06902 - Sophia-Antipolis Cédex, France - Rachid.Deriche @ sophia.inria.fr
Commission III, WG III/4
KEY WORDS: Building, Reconstruction, Modeling, Aerial, Image, Automation, Systems, Photogrammetry.
ABSTRACT:
A novel system for automatic building reconstruction from multiple aerial images is presented. Compared to previous works, this
approach uses a very generic modeling of buildings as polyhedral shapes with no overhang, in which external knowledge is introduced
through constraints on primitives. Using planes as base primitives, the algorithm builds up an arrangement of planes from which a 3D
graph of facets is deduced. In a so-called "compatibility graph" where the nodes are the initial facets of the 3D graph and edges between
two nodes state that both facets belong to at least one common hypothesis of building, it is shown that maximal cliques supply all the
hypotheses of buildings that can be deduced from the arrangement of planes. Among these hypotheses the choice is done through a
bayesian formulation that balance data adequacy and caricature needs. Results are provided on real images and show the validity of the
approach that remains very generic on the contrary to model-based methods while bringing external architectural information through
geometric constraints, which generally lacks in data-driven algorithms.
1. INTRODUCTION
1.1 Context
Building reconstruction is of primary importance in many appli-
cations such as virtual tourism, fighting simulations, urban en-
vironment management. Unfortunately, manual reconstruction
through usual photogrammetric techniques is a heavy burden and
a lot of research tend towards automated solutions. Automatic
building reconstruction from multiple aerial images is however
a difficult problem due to the complexity of the roof structures
present in urban or suburban areas and the presence of occlusions
such as vegetation. Most automatic approaches have to deal with
antagonistic aspects: on the one hand handling the tremendous
complexity of roof structures and thus supplying a very generic
modeling of building and on the other hand allowing simplifi-
cations of shapes when needed, which is often done in manual
reconstruction for dormer windows, chimneys but also for small
building recesses. The goal of an automatic algorithm is thus to
provide an “acceptable” caricature of generic building; accept-
able meaning that it meets some specifications. Furthermore,
systems have to face errors of primitives detectors: under or over-
segmentation, geometrical inaccuracy. To overcome these errors
and to deal with mandatory simplifications, external knowledge
must be introduced, either as models of buildings, which reduce
the generality of the approach, either as constraints on primitives,
which is the choice made in this article.
1.2 State of the Art
In the context of building reconstruction from multiple aerial im-
ages, systems can be classified as data-based or model-based. In
the former one, authors (Baillard and al., 1999; Heuel and al.,
2000; Ameri and Fritsch, 2000; Scholze and al., 2002) have
chosen not to restrict the set of available shapes for roof struc-
tures. They often use only one kind of primitives (3D segments
for (Baillard and al., 1999; Scholze and al., 2002), corners in
(Heuel and al., 2000) and planar patches in (Ameri and Fritsch,
2000)) and handle under-segmentation or primitives errors with
difficulty. On the contrary, the latter ones (Fuchs and Le-Men,
1999; Fischer and al., 1998) use some models of buildings to
restrict the set of possible shapes. This external knowledge en-
ables to overcome lack of detection and over detection. Although
these two approaches use different method to search for the best
model, they both try to solve for the lack of generality inherent in
these strategies either through heuristic rules in (Fischer and al.,
1998) or through graph grammars in (Fuchs, 2001). They both
provide promising results but, despite obvious efforts, are still
limited to simple forms and thus can not handle all the shapes
available in urban or suburban areas. Besides, the robustness of
the approaches lies intrinsically in the small amount of models.
Increasing the library of models would result in a increased com-
plexity and a least robustness.
In another context, with cadastral limits, (Jibrini, 2002; Flamanc
and al., 2003) searches, from a set of planes for the best contin-
uous polyhedral surfaces enclosed in the cadastral limits. This
very generic modeling handles over-detection of planes. Our
strategy inherits form this method with extensions to deal with
vertical planes and to integrate external knowledge through ge-
ometric constraints in the choice of model representation and in
the reconstruction step.
1.3 Global Algorithm
We decide to model a building as any polyhedral surface with no
overhang whose external border is constituted of vertical planes.
This definition remains very general and can represent almost any
building seen from aerial images in urban area with exception of
curved building. The algorithm focuses on individual building
reconstruction with a focusing zone manually selected through a
threshold on altitude, although automatic methods could be used
instead. Multiple calibrated images (with a resolution of 25cm in
our case) or products computed from these images such as DEM
(Baillard and Dissard, 2000) are used. Figure 1 summaries the
general scheme of the system that can be subdivided in three main
steps: primitives detection, hypotheses extraction and choice of
the best representation together with geometric refinement. In
the primitives detection step, starting from correlation DEM and
a rough focusing zone, the algorithm extracts planar patches and
oriented portions of facades that represent the base primitives
(Figure 7(a)). In a second step, from the arrangement of planes
deduced from these base primitives, it builds up a 3D graph of
facets and then a so-called “compatibility graph” where the nodes
are the initial facets of the 3D graph and edges between two nodes
state that both facets belong to at least one common hypothesis of