A good survey of the approaches to stereo matching, covering
methods up to the mid-80s, may be found in [5]. Intensity
correlation of small neighborhoods has been used for stereo
matching since early times and continues to be popular even
in current systems [9]. Such systems work reasonably well in
the presence of random texture and smoothly varying terrain,
but are less effective in cultural environments with abrupt
depth changes and large homogeneous areas (such as in scenes
with many buildings).
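As a rough illustration of neighborhood correlation, the sketch below
scores small windows with normalized cross-correlation and searches
along one image row for the best match. It is a minimal example under
stated assumptions, not the method of [9]: the images are taken to be
rectified lists of rows, and the names ncc, window, best_disparity, half
and max_disp are hypothetical. A homogeneous window has zero variance,
so its correlation is undefined; this is one way to see why such areas
defeat the approach.

```python
from math import sqrt

def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        # Homogeneous window: correlation is undefined, so report no support.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom

def window(img, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) neighborhood centered at (row, col)."""
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def best_disparity(left, right, row, col, half=1, max_disp=4):
    """Search the same row of the right image for the best-correlated window.

    Assumes a rectified pair, so a left pixel at column col appears at
    column col - d in the right image for some disparity d >= 0.
    """
    ref = window(left, row, col, half)
    best_d, best_score = 0, -2.0  # NCC is never below -1
    for d in range(max_disp + 1):
        c = col - d
        if c - half < 0:
            break  # window would leave the image
        score = ncc(ref, window(right, row, c, half))
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score
```

The caller is responsible for keeping (row, col) at least half pixels
away from the image border.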
Figure 5 Registered Model
Feature matching techniques have been applied to scenes
containing man-made objects, as these objects tend to provide
boundary features that are largely invariant to the viewing
geometry (although they may, of course, be occluded).
Individual features, such as a straight line, remain quite
ambiguous to match. If more than two images are available, the
ambiguity can be resolved by employing the epipolar
constraints between the different pairs [1]. If only two images
are available, matches of groups of features need to be
considered to resolve this ambiguity; several techniques for
this are described in the literature ([3], [4], [5], [8], [15],
[18]).
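The epipolar test for a single image pair can be sketched as follows.
This is an illustrative example rather than the procedure of [1]: it
assumes a known 3x3 fundamental matrix F for the pair, and the names
epipolar_line, epipolar_distance, consistent_candidates and tol are
hypothetical. With more than two images, the same test would be applied
to each pair, and a candidate kept only if it passes all of them.

```python
from math import hypot

def epipolar_line(F, p):
    """Return (a, b, c) = F @ (x, y, 1): the epipolar line of p in image 2."""
    x, y = p
    h = (x, y, 1.0)
    return tuple(sum(F[i][j] * h[j] for j in range(3)) for i in range(3))

def epipolar_distance(F, p, q):
    """Perpendicular distance from q (image 2) to the epipolar line of p (image 1)."""
    a, b, c = epipolar_line(F, p)
    norm = hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * q[0] + b * q[1] + c) / norm

def consistent_candidates(F, p, candidates, tol=1.0):
    """Keep only the candidates within tol pixels of the epipolar line of p."""
    return [q for q in candidates if epipolar_distance(F, p, q) <= tol]

# Fundamental matrix of an ideal rectified pair (an assumption for this demo):
# corresponding points then lie on the same image row.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]
```

For line features, the same test can be applied to segment endpoints,
although endpoints are often unreliable; this is part of why a single
pair leaves line matches ambiguous.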
Note that matching features such as lines necessarily gives
only sparse depth information about the scene, and detection
of objects, such as buildings, requires a further step of
grouping lower level features. This suggests constructing the
desired groups first and then matching the groups themselves.
A system following this approach for the detection of
rectilinear buildings is described in [16]. In this system,
parallelograms that may correspond to roof boundaries are
hypothesized by grouping in each of the two images
individually. These groups are then matched to select among
them and to determine heights.
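A first grouping step of this kind, pairing near-parallel line
segments within one image, might be sketched as below. This is only an
assumption about what such a step could look like, not the algorithm of
[16]: the segment representation, the function names and the 5 degree
tolerance are all hypothetical, and a practical system would also test
the overlap and separation of the segments before forming
parallelograms.

```python
from math import atan2, pi

def orientation(seg):
    """Undirected orientation of a segment ((x1, y1), (x2, y2)) in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return atan2(y2 - y1, x2 - x1) % pi

def angle_gap(a, b):
    """Smallest difference between two undirected orientations."""
    d = abs(a - b) % pi
    return min(d, pi - d)

def parallel_pairs(segments, max_gap_deg=5.0):
    """Return index pairs (i, j), i < j, of near-parallel segments."""
    max_gap = max_gap_deg * pi / 180.0
    angles = [orientation(s) for s in segments]
    return [(i, j)
            for i in range(len(segments))
            for j in range(i + 1, len(segments))
            if angle_gap(angles[i], angles[j]) <= max_gap]
```

Orientations are taken modulo pi so that a segment and its reversal
count as parallel.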
As observed before, there is a clear trade-off among the
levels at which stereo matching is performed. Higher level
features can be matched with much less ambiguity; however,
errors may be made in computing them in the first place. We
believe that, in general, it is not possible to determine a
correct level of matching in advance. Rather,
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
Figure 6 Block Diagram of Multi-View system (stages, top to bottom: grouping and matching of line segments and junctions; grouping and matching of parallels; detection of parallelogram matches; wall and shadow verification; combination of buildings; 3D building)
matching should occur at various levels, with the results of
one level influencing the others, and the choice among
matches made only when sufficient context becomes available
to resolve them confidently. An early system advocating this
approach is described in [12]. We have recently constructed a
system using such an approach [17] for the detection of
rectilinear buildings; its block diagram is shown in Figure 6.
A characteristic of this system is that the images to be
matched need not be taken at the same time; changes due to
illumination and imaging conditions can be accommodated.
An example is shown in Figures 7 through 10. Figure 7 shows
two views of a scene containing two buildings. Note that the
two images are taken at different times and are of different
resolutions. Figure 8 shows the lines that are detected in each
image independently. The lines are then matched among the
available images and sets of possible matches are computed.
The matched lines are then grouped into higher level features
(parallels and parallelograms) and the groups are in turn
matched across the images (using the information about the
matches of their constituents). Figure 9 shows the roof
hypotheses. The hypotheses are verified by looking for
supporting evidence from the visible walls and the shadows.
Figure 10 shows the two buildings selected and verified from
the available hypotheses. At this point, 3D models of the
buildings have also been computed.