Figure 3: Automatic attachment of two adjacent triangular
prisms in the FLAT_L image, before and after
whose endpoints lie in close proximity. Each corner can then
be tested geometrically by comparing its labels with the prim-
itives in Figure 2. For example, if both arms of a corner are
labeled v, that interpretation can be discarded since a v-v
corner does not occur in either primitive. If a corner has no
legal labeling (recall that each line segment can have multiple
labelings) then it can be discarded. This method efficiently
prunes geometrically inconsistent features, an important
aspect of mid-level feature generation [Förstner, 1995].
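The corner test can be sketched as follows. This is an illustration only: the set of legal label pairs depends on the primitives in Figure 2, which is not reproduced here, so the pairs below are assumed, and the names LEGAL_PAIRS and legal_corner_labelings are hypothetical rather than part of PIVOT.

```python
from itertools import product

# Assumed set of legal corner labelings (unordered pairs). The real set
# comes from the primitives in Figure 2; this one is illustrative only.
LEGAL_PAIRS = {
    frozenset(("v", "h1")),
    frozenset(("v", "h2")),
    frozenset(("h1", "h2")),
}


def legal_corner_labelings(labels_a, labels_b, legal_pairs=LEGAL_PAIRS):
    """Return the label pairs for a corner that match some primitive.

    labels_a and labels_b are the candidate vanishing-point labels of the
    two arms. An empty result means the corner has no legal labeling and
    can be discarded.
    """
    return [
        (a, b)
        for a, b in product(labels_a, labels_b)
        if frozenset((a, b)) in legal_pairs
    ]
```

Under these assumed pairs, a corner whose arms can only be labeled v yields an empty list and is pruned, while a corner with arm labels {v, h1} and {v} keeps the single interpretation h1-v.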
The basic idea is not new [McGlone and Shufelt, 1994]. How-
ever, PIVOT extends the idea to a new intermediate represen-
tation: the 2-corner, which is formed by two corners which
share a common line segment, and corresponds to a portion
of the boundary of a primitive facet. For example, consider a
2-corner with the labeling v-h2-v. Such a 2-corner is legal,
since it occurs as part of the wall boundary of a rectangular
primitive (although at this stage, it has yet to be determined
whether the h2 line segment is the roof or ground segment).
2-corners are useful intermediate features because each building
face can be partially represented by a 2-corner. Another
intermediate feature which could have been employed
in PIVOT is the trihedral vertex, in which three line segments
meet at a point. The difficulty with trihedral vertices is that
they are not visible from certain viewing angles; for example,
trihedrals are often not present in conventional nadir map-
ping photography of rectangular structures. 2-corners can
be found in both nadir and oblique imagery, allowing PIVOT
to operate over a wide range of viewing angles. The combina-
tion of the 2-corner representation and vanishing point infor-
mation derived from photogrammetric modeling gives PIVOT
a useful intermediate representation for hypothesis construc-
tion.
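A minimal sketch of 2-corner formation is given below, assuming each corner is stored as a pair of segment identifiers. The function name find_two_corners and this data layout are hypothetical. The sketch only checks that two corners share exactly one segment; it does not verify that the corners lie at opposite ends of that segment, nor that the resulting labeling is legal.

```python
from itertools import combinations


def find_two_corners(corners):
    """Pair corners that share exactly one line segment.

    Each corner is a tuple (seg_a, seg_b) of two distinct segment
    identifiers. The result lists (outer_1, shared, outer_2) triples,
    one per candidate 2-corner.
    """
    two_corners = []
    for c1, c2 in combinations(corners, 2):
        shared = set(c1) & set(c2)
        if len(shared) != 1:
            continue  # no common segment, or the same corner twice
        (mid,) = shared
        (first,) = set(c1) - shared
        (last,) = set(c2) - shared
        two_corners.append((first, mid, last))
    return two_corners
```

For example, corners (s1, s2) and (s2, s3) combine into the 2-corner s1-s2-s3, which would carry a labeling such as v-h2-v.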
4 CONSTRUCTING 3D BUILDING HYPOTHESES
Since each 2-corner corresponds to a portion of the boundary
of a primitive facet, PIVOT can use the 2-corner as a starting
point for locating the remainder of the primitive edges. First,
PIVOT resolves ambiguities in the 2-corner interpretation.
Recall that a 2-corner with the labeling v-h2-v is ambiguous;
the h2 segment could be on the roof or ground. This am-
biguity is resolved by determining which ends of the vertical
segments of the 2-corner are closer to the vertical vanishing
point in image space; slanted peak roof lines can be resolved
by a similar method. Once ambiguities are resolved, PIVOT
then executes another search to find line segments with the
correct vanishing point labelings at each of the points in the
2-corner. At the conclusion of this process, several of the
edge and point slots in a primitive have been filled in with
edge and point measurements from the image.
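The roof/ground test for a v-h2-v 2-corner might look like the sketch below. It assumes a camera looking down on the scene, so that the end of a vertical segment nearer the vertical vanishing point is the ground end; the text does not state this convention explicitly, and the function names are hypothetical.

```python
import math


def split_vertical(segment, vanishing_point):
    """Return (ground_end, roof_end) of a vertical segment.

    Assumption: the end nearer the vertical vanishing point is the
    ground end, as holds for a camera looking down on the scene.
    """
    p, q = segment
    if math.dist(p, vanishing_point) <= math.dist(q, vanishing_point):
        return p, q
    return q, p


def h2_is_roof(h2_endpoint, vertical_segment, vanishing_point):
    """True if the horizontal segment meets the vertical at its roof end."""
    ground_end, roof_end = split_vertical(vertical_segment, vanishing_point)
    return math.dist(h2_endpoint, roof_end) < math.dist(h2_endpoint, ground_end)
```

Applying h2_is_roof to one endpoint of the h2 segment and the vertical it joins labels the segment as a roof or ground edge; checking both verticals of the 2-corner gives a consistency test.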
For a rectangular primitive, only one vertical and two or-
thogonal horizontal line segments need to be present for the
positions of all eight points of the primitive to be computed
in image space by intersecting vanishing lines; for a triangular
prism, only the long horizontal and one of the triangular facet
edges need to be present. PIVOT tries all possible combina-
tions of the edges in the primitive slots to generate complete
2D primitives, discards any completions which do not obey
the vanishing line geometry, and selects the best one with
respect to the underlying edge data for the image, using a
chamfer distance metric.
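The completion of a facet by intersecting vanishing lines can be illustrated with homogeneous image coordinates. This is a sketch under stated assumptions, not PIVOT's implementation: points and vanishing points are 3-tuples (x, y, w), and the helper names cross and fourth_corner are hypothetical.

```python
def cross(a, b):
    """Cross product of two homogeneous 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def fourth_corner(b, c, vp_ab, vp_ac):
    """Fourth corner d of a facet a-b-d-c, in image coordinates.

    Edge a-b points toward vp_ab and edge a-c toward vp_ac, so d lies on
    the line through c and vp_ab and on the line through b and vp_ac.
    """
    line_cd = cross(c, vp_ab)  # line through c toward vp_ab
    line_bd = cross(b, vp_ac)  # line through b toward vp_ac
    x, y, w = cross(line_cd, line_bd)  # intersection of the two lines
    if w == 0:
        raise ValueError("lines are parallel in the image")
    return (x / w, y / w)
```

Repeating this step facet by facet recovers the remaining corners of a rectangular primitive from one vertical and two orthogonal horizontal segments. Each completed candidate would then be scored against the image edge data.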
After this process, PIVOT has a set of fully-specified primi-
tives, measured in image space. PIVOT then uses the camera
model and a DEM (digital elevation model) to compute the
object space positions of the floor points; the lengths of ver-
ticals and horizontals can then be measured in object space
to obtain the 3D positions of the remaining points in each
primitive. This process yields a set of 3D object space
primitives, derived automatically from monocular cues, a
central projection camera model, and a DEM.
However, edge fragmentation can cause a single building in
a scene to be modeled by several primitives. Further, de-
pending on the viewpoint, primitives may not be found for
components of the building. These problems require the abil-
ity to join primitives to form composite building structures,
and the ability to extrapolate from existing primitives, re-
spectively. PIVOT solves the first problem by joining prim-
itives which have similarly shaped faces in close proximity;
the second problem is solved for peaked roof buildings by us-
ing vertical edges and shadow analysis to estimate the height
displacement of triangular prisms from the ground.
Figure 3 illustrates an example of primitive attachment on the
FLAT_L scene, an image distributed as part of a test on image
understanding techniques [Fritsch et al., 1994]. PIVOT
initially generates two triangular prisms for a single build-
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B6. Vienna 1996