contours is by geometric primitives, such as parallel, corner,
and collinear [Fua and Hanson 1991, Kim and Muller 1995,
Lin et al. 1995, Henricsson 1995]. Relating contours by
defining geometric primitives requires many parameters and
specific object models, for example flat and rectilinear roofs.
To be able to handle arbitrarily complex roof shapes, we in-
stead propose to form a measure that relates contours based
on similarity in position, orientation, and photometric and
chromatic properties [Henricsson and Stricker 1995].
Following [Henricsson and Stricker 1995], for each straight
contour segment we define two directional contours point-
ing in opposite directions. Two directional contours form a
contour relation with a logically defined interior. For each
contour relation we compute four normalized scores based
on similarity in position, orientation, and in photometric and
chromatic attributes. The final similarity score of a contour
relation is the sum of the individual similarity components. A
high similarity score suggests that the two contours belong to
the same object boundary. A few selection procedures, which
are based on local competition on the computed similarity
scores, are subsequently applied to yield a small number of
similarity relations.
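As an illustration only, the scoring step might be sketched as follows in Python. The attribute names, the exponential and cosine score functions, and the sigma scale parameters are assumptions made for this sketch; the actual definitions are those of [Henricsson and Stricker 1995] and are not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectedContour:
    """A straight contour with a direction (hypothetical representation)."""
    start: tuple      # (x, y) of the first end point
    end: tuple        # (x, y) of the second end point
    luminance: float  # assumed photometric attribute of the flanking region
    chroma: tuple     # assumed chromatic attribute, e.g. an (a, b) pair


def _midpoint(c):
    return ((c.start[0] + c.end[0]) / 2.0, (c.start[1] + c.end[1]) / 2.0)


def _angle(c):
    return math.atan2(c.end[1] - c.start[1], c.end[0] - c.start[0])


def similarity_components(a, b, sigma_pos=50.0, sigma_lum=20.0, sigma_chroma=10.0):
    """Return four scores, each normalized to [0, 1] (assumed score functions)."""
    d_pos = math.dist(_midpoint(a), _midpoint(b))
    d_ang = _angle(a) - _angle(b)
    d_lum = abs(a.luminance - b.luminance)
    d_chr = math.dist(a.chroma, b.chroma)
    return (
        math.exp(-d_pos / sigma_pos),      # position
        0.5 * (1.0 + math.cos(d_ang)),     # orientation
        math.exp(-d_lum / sigma_lum),      # photometric
        math.exp(-d_chr / sigma_chroma),   # chromatic
    )


def similarity_score(a, b, **scales):
    """Final score of a contour relation: the sum of the four components."""
    return sum(similarity_components(a, b, **scales))
```

With these assumed score functions the final score lies between 0 and 4, and a selection step based on local competition would then keep only the highest-scoring relations for each contour.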
By relaxing the geometrical arrangement of two straight con-
tours, we can handle arbitrarily complex polygonal shapes.
These similarity relations are extensively used in coplanar
grouping (section 6.2) and to hypothesize the roof bound-
ary (section 6.3).
6 AUTOMATIC HOUSE RECONSTRUCTION
We present a novel approach to reconstruct complex residen-
tial houses from sets of aerial images. To solve this prob-
lem, we have developed a procedure that relies on hierarchi-
cal hypothesis generation, see Fig. 7. The procedure starts
with a multi-image coverage of a site, extracts 2-D edges
from a source image, computes corresponding photometric
and chromatic attributes, and their similarity relationships.
Using both geometry and photometry, it then computes the
3-D location of these edges and groups them into planes. In
addition, 2-D enclosures are extracted and combined with
the 3-D planes to form instances of our roof primitive, the 3-D
patch. All extracted hypotheses of 3-D patches are ranked
according to their geometric quality. Finally, the best set of
3-D patches that are mutually consistent is retained, thus
defining the reconstructed house. This procedure has proven
powerful enough so that, in contrast to other approaches to
generic roof extraction, e.g. [Fua and Hanson 1991, Roux
and McKeown 1994, Lin et al. 1995, Haala and Hahn 1995,
Kim and Muller 1995], we need not assume the roofs to be
flat or rectilinear or use a parameterized building model.
Note that, even though geometric regularity is the key to
the recognition of man-made structures, imposing constraints
that are too tight, such as requiring that edges on a roof form
ninety-degree angles, would prevent the detection of many
structures that do not satisfy them perfectly. Conversely,
constraints that are too loose will lead to combinatorial ex-
plosion. Here we avoid both problems by working in 2-D and
3-D, grouping only edges that satisfy loose coplanarity con-
straints, and weak 2-D geometric and similarity constraints
on their photometric and chromatic attributes. None of these
constraints is very tight but, because we pool a lot of infor-
mation from multiple images, we are able to retain only valid
object candidates.
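A loose coplanarity test between two 3-D segments could look like the following Python sketch. It is not the grouping procedure of section 6.2; it only assumes that two segments are accepted as coplanar when the shortest distance between their supporting lines stays below a tolerance. The function names and the tolerance value are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def line_gap(seg_a, seg_b):
    """Shortest distance between the supporting lines of two 3-D segments."""
    (p0, p1), (q0, q1) = seg_a, seg_b
    u, v = _sub(p1, p0), _sub(q1, q0)
    n = _cross(u, v)
    n_len = _norm(n)
    if n_len <= 1e-9 * _norm(u) * _norm(v):
        return 0.0  # parallel (or degenerate) lines always share a plane
    return abs(_dot(_sub(q0, p0), n)) / n_len


def loosely_coplanar(seg_a, seg_b, tol=0.3):
    """Accept the pair if the gap is within `tol` (assumed units: metres)."""
    return line_gap(seg_a, seg_b) <= tol
```

A larger tolerance admits more candidate pairs and therefore more hypotheses; the additional 2-D and attribute constraints are what keep the number of surviving candidates small.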
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
[Figure 7 diagram: 2-D Framework and 3-D Framework columns; recoverable box labels: Edge Detection, Edgel Aggregation, Attributes 2D, Segment Matching, Coplanar Grouping]
Figure 7: A hierarchical framework, a feed-forward scheme,
where several components in the 2-D scheme mutually ex-
change data and aggregates with the 3-D modules.
We view the contribution of this approach as the ability to
robustly combine information derived from edges, photomet-
ric and chromatic area properties, geometry and stereo, to
generate well organized 3-D data structures describing com-
plex objects while keeping the combinatorics under control.
Of particular importance is the tight coupling of 2-D and 3-D
analysis. In section 5 we described the 2-D framework, and
in the following sections we present the 3-D framework and
the combination of 2-D and 3-D processing.
6.1 Segment Stereo Matching
Many methods for edge-based stereo matching rely on ex-
tracting straight 2-D edges from images and then matching
them. Although fast, these methods have one drawback:
if an edge extracted from one image is occluded or
only partially defined in one of the other images, it may
not be matched. In outdoor scenes, this happens often, for
example when shadows cut edges. Another class of meth-
ods [Baltsavias 1991] consists of moving a template along the
epipolar line to find correspondences. This can be extended
through the introduction of camera models and geometri-
cal constraints to a multi-image (feature/template based)
matching technique. Very promising results have been obtained
with this approach in close-range applications [Grün
and Stallmann 1991]. It is much closer to correlation-based
stereo and reduces the problem described above. We pro-
pose a variant of the latter approach for segment matching
[Bignone 1995]. Edges are extracted with the methods in
section 5 from only one image (the source image) and are
matched in the other three images by maximizing an "edginess
measure" along the epipolar line. The edginess measure
is a function of the gradient in the other images. Geometric
and photometric constraints are also used to reduce the num-
ber of mismatches. Each matched 3-D segment has a virtual
link to its generating 2-D contour, and vice versa.
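The search along the epipolar line can be sketched in Python as follows. This is a simplified illustration, not the method of [Bignone 1995]: it assumes a single second image given as a gradient-magnitude array, takes the edginess to be the mean gradient magnitude sampled along a candidate segment, and assumes horizontal epipolar lines so that candidates are generated by a plain disparity shift. All function names are hypothetical.

```python
def edginess(grad_mag, p, q, n_samples=20):
    """Mean gradient magnitude along the segment p-q, points given as (x, y).

    Samples use nearest-pixel lookup; samples outside the image count as 0.
    Requires n_samples >= 2.
    """
    rows, cols = len(grad_mag), len(grad_mag[0])
    total = 0.0
    for i in range(n_samples):
        t = i / (n_samples - 1)
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        r, c = round(y), round(x)
        if 0 <= r < rows and 0 <= c < cols:
            total += grad_mag[r][c]
    return total / n_samples


def candidates_by_disparity(p, q, disparities):
    """Shift both end points along x (assumes horizontal epipolar lines)."""
    return [((p[0] + d, p[1]), (q[0] + d, q[1])) for d in disparities]


def match_along_epipolar(grad_mag, candidates):
    """Return the candidate segment with the highest edginess."""
    return max(candidates, key=lambda seg: edginess(grad_mag, seg[0], seg[1]))
```

With several images, one option would be to sum the edginess of each candidate over all images before taking the maximum, and to reject candidates that violate the geometric or photometric constraints.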
The photometric constraint consists of computing the pho-
tometric region attributes as defined in section 5.2 after a
photometric equalization of the images. The photometric
consistency means that the photometry in areas that pertain
to at least one side of the correspondences should be
similar across images. Figure 8A shows all 78 computed 3-D
segmen:
are desc
A more
velopme
extracte
edges a
fined tc
to redu
geomet
number
stereo r
(A)
Figure
matche
simultz
that th
6.2 (
To grc
metho
The pi
from s
tage tl
similar
of mis
ceeds
Leonat
Expl
f
F
r
f
r
For th
and a
tours
consis
rectly
plane
conto