an image pyramid. This method is based on a Hough transform
and relaxes the required quality of orientation angles to the
order of 10 degrees. The method presented in this paper is
feature-based, but differs from the approaches of Habib & Kelly
and Wang in the more extensive use of generic object
knowledge. Choices for parameterisation required by the Hough
transform are avoided by the use of rigorous statistical testing
of constraints on the observations. Furthermore, the proposed
method does not rely on image segmentation required by
structural matching.
In the use of image sequences without external measurement of
camera orientation, approximate orientation values of an image
are derived from the determined values of previous images in
the sequence. In these short-baseline applications area-based
matching is a commonly used technique to establish image
correspondence at sub-pixel level (Pollefeys et al., 2000). With
increasing baseline length, feature-based matching techniques
are expected to be more successful, especially in applications
where occlusions are frequent. The general approach of a
feature-based matching procedure is described by Matas et al.
(2001) and can be summarised as follows:
- Features that have viewpoint invariant characteristics are
extracted in both images.
- Based on their characteristics, the features of two images
are compared and a list of possible matches is established.
- A geometrically consistent set of matches is searched for.
In this step the RANSAC algorithm is often applied
(Fischler and Bolles, 1981).
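The third step can be illustrated with a minimal RANSAC sketch. This is not the method of the paper, only the generic algorithm of Fischler and Bolles; the function name, the parameters, and the consistency model (a pure image translation, chosen for brevity) are all hypothetical:

```python
import random

def ransac_translation(matches, n_iter=200, tol=1.0, seed=0):
    """Minimal RANSAC sketch: hypothesise a model from a random minimal
    sample and keep the model supported by the most tentative matches.
    Each match is a pair of 2-D points ((x1, y1), (x2, y2))."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        # Minimal sample for a translation model: a single match.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Count the matches consistent with this hypothesis.
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) < tol
                   and abs(m[1][1] - m[0][1] - dy) < tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers
```

Given three matches related by the translation (5, 0) and two gross outliers, the sketch returns the three consistent matches as the inlier set.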
Examples of this general approach are found in Pritchett and
Zisserman (1998), Tuytelaars and Van Gool (2000), and
Baumberg (2000). The method presented in this paper differs
in the following aspects:
- Object information, as described in the previous section, is
exploited at several stages. This makes the method robust,
but limits its applicability to images of buildings or man-
made structures with similar characteristics.
- The features are straight image lines (projections of the
building edges), not the (invariant) image regions used in
the approaches above. The intersection of two straight
lines and a number of viewpoint-invariant characteristics
are used in the matching.
- Vanishing point detection is applied for the initial
estimation of the orientation of the images relative to the
object. Apart from a remaining ambiguity in the rotation
matrix of each image, the vanishing point detection
reduces the relative orientation problem to a relative
position problem.
- The emphasis is on the use of geometric constraints for the
selection of the set of correct matches, and not on
photometric information. In the current application, the
photometric content of a region around, for instance, a
window corner depends strongly on the viewpoint of the
image as a result of depth discontinuities in the facade of the
building. Furthermore, buildings often show repetitive
patterns, such as identical windows. As a result,
photometry is not a strong clue for detecting correct
matches.
- Clustering is based on the results of statistical testing of
geometric constraints. All possible combinations of
tentative matches are tested; the number of tests is of the
order m² (with m the number of tentative matches). This is
in contrast to the RANSAC procedure, in which the
selection of the final solution is based on a randomly
chosen subset of possible matches. In the method
presented here the computational burden is reduced to an
acceptable level by keeping the number of possible
matches low.
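The contrast with RANSAC can be made concrete with a small sketch of the exhaustive pairwise testing. The names are hypothetical, and the paper's statistical tests are replaced by a placeholder predicate; the point is only that every pair of tentative matches is examined, giving m(m-1)/2, i.e. order m², tests:

```python
from itertools import combinations

def exhaustive_pair_test(matches, constraint_ok):
    """Test every pair of tentative matches against a geometric
    constraint (constraint_ok is a stand-in for the statistical tests).
    Returns, per match, how many pairs it passed, and the test count."""
    support = {i: 0 for i in range(len(matches))}
    n_tests = 0
    for i, j in combinations(range(len(matches)), 2):
        n_tests += 1
        if constraint_ok(matches[i], matches[j]):
            support[i] += 1
            support[j] += 1
    return support, n_tests
```

For m = 5 tentative matches this performs exactly 10 tests, which stays tractable only because the number of tentative matches is kept low beforehand.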
3. PROCEDURE FOR RELATIVE ORIENTATION
As stated in the introduction, the procedure consists of three
steps. An overview is depicted in Figure 1. If an uncalibrated
camera is used, an additional step that follows the vanishing
point detection is required (van den Heuvel, 1999). The
detection of the three main vanishing points is more reliable in
case of a calibrated camera. Then, after the detection of the first
vanishing point, the search space for the other two is reduced
considerably (van den Heuvel, 1998). In order not to complicate
the description of the procedure, use of a calibrated camera is
assumed.
[Figure 1 flowchart: line extraction → lines → vanishing point
detection → labelled lines, rotation matrix → epipole detection +
correspondence → model coordinates, image points → relative
orientation]
Figure 1. Overview of the procedure
3.1 Line feature extraction
Edges are automatically extracted by applying a line-growing
algorithm (van den Heuvel, 2001). The coordinates of the
endpoints represent the image lines. The interpretation plane is
the plane in which both the image line and object edge recede
(Figure 2). The image coordinates (x,y) are assumed to be
corrected for lens and image plane distortions. Then the spatial
vector (x) related to an endpoint can be written as:
x = (x, y, −f)ᵀ,  f: focal length  (1)
The normal to an interpretation plane (n) is computed as the
cross product of the rays (x₁, x₂) to the endpoints of the line:
n = x₁ × x₂  (2)
This normal vector plays a major role in the vanishing point
detection procedure that is summarised in the next section.
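Equations (1) and (2) can be sketched in a few lines of Python. The function names are hypothetical; the computation follows the text, assuming distortion-corrected image coordinates:

```python
def endpoint_ray(x, y, f):
    """Eq. (1): direction vector of the ray through image point (x, y)
    for a camera with focal length f (distortion-corrected coordinates)."""
    return (x, y, -f)

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def interpretation_plane_normal(p1, p2, f):
    """Eq. (2): normal of the interpretation plane spanned by the rays
    to the two endpoints p1, p2 of an extracted image line."""
    return cross(endpoint_ray(*p1, f), endpoint_ray(*p2, f))
```

For a horizontal image line through the principal point, e.g. endpoints (−1, 0) and (1, 0) with f = 1, both rays lie in the y = 0 plane, so the normal points along the y-axis, as expected.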