methods here. Interested readers may refer to the paper (Guillou et
al., 2000) for more information.
1.2.2 Model-Based Methods: We regard model-based methods
(e.g. Debevec et al., 1996; Wang and Ferrie, 2008) as an evolution
of line-based methods (e.g. Liu et al., 1990; Kumar and Hanson,
1994). Model-based methods exploit structural information
inherent in the objects that is ignored in the line-based
approaches. An early attempt (Liu et al., 1990) solved for the
camera rotation first and then for the camera translation, using
both line and point correspondences. They treated the three
camera rotation angles as small perturbations of a nominal
orientation (e.g., 0 degrees). Under this assumption, their
algorithm works only if all three camera Euler rotation angles
are less than 30 degrees. Kumar and Hanson (1994) solved for
the rotation and translation simultaneously by adapting an
iterative technique formulated by Horn (1990). They also
reported that, for some data sets, the initial rotation estimates
must be within 40 degrees for all three Euler angles representing
the rotation. When initial estimates for rotation and translation
were not available, they sampled the rotation space, and each
sample was used as an initial estimate for rotation estimation by
a method akin to that of Liu et al. (1990). The rotations and
translations estimated from these samples were then used as
initial estimates for solving for the camera rotation and
translation simultaneously.
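Neither paper publishes an implementation; the sketch below (in
Python, with the function name, grid spacing, and angle convention
being our own assumptions) only illustrates the rotation-space
sampling idea:

import itertools
import numpy as np
from scipy.spatial.transform import Rotation

def sample_rotations(step_deg=30.0):
    # Coarse uniform grid over the three Euler angles; each sample
    # seeds an iterative rotation estimate in the spirit of
    # Liu et al. (1990), and the best result is refined further.
    angles = np.arange(-180.0, 180.0, step_deg)
    for rx, ry, rz in itertools.product(angles, repeat=3):
        yield Rotation.from_euler('xyz', (rx, ry, rz),
                                  degrees=True).as_matrix()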
Taylor and Kriegman (1995) estimated both the camera positions
and the structure of the scene from multiple images. Starting from
a random initial estimate of the rotation, the translation and
model parameters were computed as initial inputs for a subsequent
model-to-image fitting procedure. If the disparity between the
predicted edges and the observed edges was smaller than a preset
threshold, the minimum was accepted as a feasible estimate.
Debevec et al. (1996) argued that if the algorithm begins at a
random location in the parameter space, it stands little chance of
converging to the correct solution. They developed a method to
compute a good initial estimate for the camera positions and
model parameters directly, and then used those estimates as
initial inputs for the subsequent model-to-image fitting process.
Our approach builds on this line of work. We describe a two-step
iterative scheme for recovering the camera orientation that,
unlike existing methods, does not require a good initial guess
for the rotation. Instead, a good initial estimate of the
rotation is computed directly from coplanarity constraints. The
camera translation and predefined model parameters are then
determined from the calculated rotation through a linear
least-squares minimization. The 3D reconstruction of buildings
is based on the recovered camera pose and the assumption of
flat terrain. Unlike existing methods, our method does not
require a model-to-image projection process, and it is
particularly suitable for oblique images with large shooting
angles in urban environments.
2. THE METHOD
2.1 Notation
Figure 1 shows how a straight line segment, model edge 67, in a
cube model (building 1) projects onto the image plane of a
camera. The coordinates of the two endpoints of the projected
image edge 67 in the camera coordinate system can be
represented as {(x1, y1, -f), (x2, y2, -f)}. The camera position
relative to the object coordinate system is represented in terms
of a rotation matrix R and a translation vector t. The straight
line 67 is defined by a pair of vectors (v, u) in the object
coordinate system, where v represents the direction of the line
and u represents a point on the line. m is the normal vector of
the projection plane defined by the two lines (C6, C7) and the
camera centre C in the camera coordinate system. The coplanarity
constraints derived in (Taylor and Kriegman, 1995) are outlined
in the following. The fundamental relation of the imaging
geometry is given by equation (1),
m = R(v × (u - t))        (1)
Equation (1) is based on the fact that the 3D model lines (e.g.
line 67) in the camera coordinate system must lie on the
projection plane formed by lines (C6, C7) and camera centre C.
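As a quick numerical illustration of equation (1) (not from the
paper; the values and the camera convention X_cam = R (X_obj - t)
with image plane z = -f are our assumptions), the plane normal
computed from the two image endpoints agrees with R(v × (u - t))
up to scale:

import numpy as np

f = 1.0                            # focal length
R = np.array([[0.0, -1.0, 0.0],    # an arbitrary rotation (90° about Z)
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([2.0, 1.0, 5.0])      # camera position in the object frame
u = np.array([0.0, 0.0, 0.0])      # point on the model edge
v = np.array([0.0, 1.0, 0.0])      # edge direction (parallel to Y axis)

def project(X_obj):
    # Object point -> image point (x, y, -f), assuming
    # X_cam = R (X_obj - t) and image plane z = -f.
    X = R @ (X_obj - t)
    return np.array([-f * X[0] / X[2], -f * X[1] / X[2], -f])

p1, p2 = project(u), project(u + v)   # two image points on edge 67
m = np.cross(p1, p2)                  # normal of the projection plane
m_rhs = R @ np.cross(v, u - t)        # right-hand side of equation (1)

# m and R(v × (u - t)) are parallel: the cross product of the
# normalized vectors vanishes.
mu = m / np.linalg.norm(m)
nu = m_rhs / np.linalg.norm(m_rhs)
assert np.allclose(np.cross(mu, nu), 0.0, atol=1e-9)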
m^T R v = 0        (2)
m^T R (u - t) = 0        (3)
Equations (2) and (3) are deduced from equation (1): since the
cross product v × (u - t) is orthogonal to both v and (u - t),
and a rotation preserves angles, m = R(v × (u - t)) is orthogonal
to both Rv and R(u - t). This shows that the determination of the
camera rotation R can be made independent of the estimation of
the camera position t and the model parameters. Note that v is a
known vector in the object-centred coordinate system, parallel to
the Y axis, while u can be represented in terms of the model
parameters.
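The paper computes R directly from these coplanarity constraints,
as developed in the remainder of this section. Purely as an
illustrative alternative, and not the authors' method, the sketch
below fits equation (2) by generic least squares over Euler angles
and then solves equation (3) linearly for t; the function names
and the SciPy-based formulation are our own:

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def estimate_rotation(ms, vs, x0=(0.0, 0.0, 0.0)):
    # Fit equation (2), m_i^T R v_i = 0, over the three Euler angles.
    # ms: projection-plane normals; vs: known model-line directions.
    # The v_i must span at least two distinct directions, otherwise
    # R is under-determined by equation (2) alone.
    def residuals(angles):
        R = Rotation.from_euler('xyz', angles).as_matrix()
        return np.array([m @ R @ v for m, v in zip(ms, vs)])
    return Rotation.from_euler('xyz',
                               least_squares(residuals, x0).x).as_matrix()

def estimate_translation(ms, us, R):
    # Linear least squares on equation (3): (m_i^T R) t = m_i^T R u_i,
    # with each u_i treated as known here (in the paper, u is expressed
    # in terms of the model parameters, which enter the same linear
    # system as additional unknowns).
    A = np.array([m @ R for m in ms])
    b = np.array([m @ R @ u for m, u in zip(ms, us)])
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t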
Figure 1. Projection onto a camera's image plane and spatial
relationship of buildings