The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B5. Beijing 2008
The normal vector m can be defined by the intersection of the
projection plane C67 with the image plane, as shown in Figure 1
and expressed in Equation (4),
$$m_x x + m_y y - m_z f = 0 \qquad (4)$$
where $m_x$, $m_y$, $m_z$ are the components of the normal vector in the
camera coordinate system, $f$ is the focal length of the camera, and
$(x, y)$ is a point on the image edge. Given two observed image points
$(x_1, y_1)$ and $(x_2, y_2)$ on edge 67, the observed normal vector m
can be obtained from Equation (5),
$$\mathbf{m} = (x_1, y_1, -f)^T \times (x_2, y_2, -f)^T \qquad (5)$$
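The cross product in Equation (5) can be illustrated with a minimal sketch. It assumes the two image points and the focal length are given in the same units; the function name `edge_normal` and the sample coordinates are hypothetical and chosen only for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def edge_normal(p1, p2, f):
    """Normal of the projection plane spanned by two image points on an edge.

    p1, p2: (x, y) image coordinates of two points on the observed edge.
    f: focal length, in the same units as the image coordinates.
    """
    return cross((p1[0], p1[1], -f), (p2[0], p2[1], -f))


def plane_residual(m, p, f):
    """Left-hand side of Equation (4) for image point p = (x, y)."""
    return m[0] * p[0] + m[1] * p[1] - m[2] * f


# Hypothetical sample values.
p1, p2, f = (10.0, 5.0), (40.0, 25.0), 50.0
m = edge_normal(p1, p2, f)
print(m, plane_residual(m, p1, f), plane_residual(m, p2, f))
```

Both image points satisfy Equation (4) for the resulting normal, which is the property the later objective functions rely on.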
The location and orientation of building 2 can be
represented by a building vertex (e.g. vertex 3 $(X_3, Y_3)$ in
Figure 1), a building orientation relative to the X axis (e.g. α in
Figure 1), and the building's length, width,
and height (e.g. L, W, H in Figure 1). These unknown
parameters are solved in the metric reconstruction stage.
2.2 Recovery Algorithm
The recovery algorithm takes as input a set of correspondences
between edges in the models and edges in the image. The
correspondences are established manually. The algorithm then
automatically recovers the camera pose and model dimensions in
two steps: self-calibration and metric reconstruction. In the
first step, the focal length is first read from the image EXIF
tags; the camera pose and the model parameters are then recovered
with respect to an object-centred coordinate system. In the
second step, the spatial relationship of buildings is represented
by three intrinsic parameters (building length, width, and height)
and three extrinsic parameters (a building vertex location and
building orientation). These parameters can be determined by
using the model-to-image correspondences and the recovered
camera pose.
problem. The direct calculation of the Jacobian matrix of the
objective function $O_1$ is complex. To simplify the linearization
of $O_1$, we rewrite the rotation matrix R as a product of
three sequential rotations and compute the first derivative with
respect to each rotation angle. The Jacobian matrix of $O_1$ can
then be formed as,
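$$J_{\omega\varphi\kappa} = \begin{bmatrix}
\mathbf{m}_1^T R_\omega \mathbf{v}_1 & \mathbf{m}_1^T R_\varphi \mathbf{v}_1 & \mathbf{m}_1^T R_\kappa \mathbf{v}_1 \\
\mathbf{m}_2^T R_\omega \mathbf{v}_2 & \mathbf{m}_2^T R_\varphi \mathbf{v}_2 & \mathbf{m}_2^T R_\kappa \mathbf{v}_2 \\
\vdots & \vdots & \vdots \\
\mathbf{m}_n^T R_\omega \mathbf{v}_n & \mathbf{m}_n^T R_\varphi \mathbf{v}_n & \mathbf{m}_n^T R_\kappa \mathbf{v}_n
\end{bmatrix}_{n \times 3}$$
where
$$R_\omega = R \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}, \quad
R_\varphi = \begin{bmatrix} 0 & 0 & -\cos\kappa \\ 0 & 0 & \sin\kappa \\ \cos\kappa & -\sin\kappa & 0 \end{bmatrix} R, \quad
R_\kappa = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} R$$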
Given the three initial camera rotations obtained from the
previous step, the Gauss-Newton algorithm computes accurate
estimates of the camera rotations within 2-3 iterations.
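A Gauss-Newton refinement of this kind can be sketched as follows. This is only an illustration under stated assumptions: the rotation is composed as R = R_κ R_φ R_ω with the usual photogrammetric sign convention (the text only says "three sequential rotations"), the residual for each edge is m_i^T R v_i, and all function names and the synthetic test data are hypothetical.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(M, v):
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(omega, phi, kappa):
    """Assumed composition R = R_kappa * R_phi * R_omega."""
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    R_o = [[1, 0, 0], [0, co, so], [0, -so, co]]
    R_p = [[cp, 0, -sp], [0, 1, 0], [sp, 0, cp]]
    R_k = [[ck, sk, 0], [-sk, ck, 0], [0, 0, 1]]
    return matmul(R_k, matmul(R_p, R_o))


def rotation_derivatives(R, kappa):
    """Partial derivatives of R with respect to omega, phi, kappa."""
    ck, sk = math.cos(kappa), math.sin(kappa)
    S_o = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]
    S_p = [[0, 0, -ck], [0, 0, sk], [ck, -sk, 0]]
    S_k = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
    return matmul(R, S_o), matmul(S_p, R), matmul(S_k, R)


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def solve3(A, b):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(A)
    return [det3([[b[r] if k == c else A[r][k] for k in range(3)]
                  for r in range(3)]) / d for c in range(3)]


def refine_rotation(angles, normals, directions, iterations=6):
    """Gauss-Newton on the residuals r_i = m_i^T R v_i."""
    omega, phi, kappa = angles
    for _ in range(iterations):
        R = rotation(omega, phi, kappa)
        dRs = rotation_derivatives(R, kappa)
        pairs = list(zip(normals, directions))
        r = [dot(m, mat_vec(R, v)) for m, v in pairs]
        J = [[dot(m, mat_vec(dR, v)) for dR in dRs] for m, v in pairs]
        JTJ = [[sum(row[a] * row[b] for row in J) for b in range(3)]
               for a in range(3)]
        JTr = [sum(row[a] * ri for row, ri in zip(J, r)) for a in range(3)]
        step = solve3(JTJ, [-g for g in JTr])  # normal equations
        omega, phi, kappa = omega + step[0], phi + step[1], kappa + step[2]
    return omega, phi, kappa


# Synthetic, noise-free correspondences (hypothetical values).
true_angles = (0.10, -0.20, 0.30)
R_true = rotation(*true_angles)
directions, normals = [], []
for v in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]:
    d = mat_vec(R_true, v)
    for helper in [(1.0, 2.0, 3.0), (-2.0, 1.0, 0.5)]:
        directions.append(v)
        normals.append(cross(d, helper))  # orthogonal to d by construction
estimate = refine_rotation((0.0, -0.1, 0.2), normals, directions)
print(estimate)
```

The normal equations are solved directly here because there are only three unknowns; with noisy real correspondences a damped or robust variant may be preferable.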
Determination of Camera Translation and Model
Dimensions
The objective function for determining camera translation and
model dimensions is formulated according to Equation (3) as
shown in Equation (7),
$$O_2 = \sum_{i=1}^{n} \left( \mathbf{m}_i^T R (\mathbf{w}_i - \mathbf{t}) \right)^2 \qquad (7)$$
2.2.1 Self-Calibration
Self-calibration requires more than three line
correspondences between the pre-defined model edges and the
image edges. It consists of three steps: an initial estimate of
the camera rotation, refinement of the camera rotation, and
determination of the camera translation and model dimensions.
Initial Estimate of Camera Rotation
where i is the index of the model edge, n is the total number
of model edges used, and $\mathbf{m}_i$ and $\mathbf{w}_i$ are the
corresponding normal vector and a point on the model edge. In the case of
rectilinear buildings, the minimization of the objective function
$O_2$ is a constrained quadratic-form minimization problem and
can be solved through a set of linear equations. Note that the
resulting scene dimensions and camera translation are determined
only up to a scale factor.
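The linear character of this step can be shown with a simplified sketch. It assumes that the rotation R and the model points are already known, so only the camera translation t is solved for and the scale is fixed by the model; the method described in the text also recovers the model dimensions. Each edge then contributes one linear equation (R^T m_i) · t = (R^T m_i) · w_i. The function names and the sample data are hypothetical.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(M, v):
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))


def mat_t_vec(M, v):
    """Return M^T v."""
    return tuple(sum(M[r][c] * v[r] for r in range(3)) for c in range(3))


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def solve3(A, b):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(A)
    return [det3([[b[r] if k == c else A[r][k] for k in range(3)]
                  for r in range(3)]) / d for c in range(3)]


def solve_translation(R, normals, points):
    """Least-squares t from m_i^T R (w_i - t) = 0 via the normal equations."""
    rows = [mat_t_vec(R, m) for m in normals]          # a_i = R^T m_i
    rhs = [dot(a, w) for a, w in zip(rows, points)]    # a_i . w_i
    ata = [[sum(a[i] * a[j] for a in rows) for j in range(3)]
           for i in range(3)]
    atb = [sum(a[i] * b for a, b in zip(rows, rhs)) for i in range(3)]
    return solve3(ata, atb)


# Synthetic example: a 90-degree rotation about Z and four model edges,
# each given as (point on edge, edge direction). Values are hypothetical.
R = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t_true = (2.0, 3.0, 5.0)
edges = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
         ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
         ((1.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
         ((0.0, 1.0, 1.0), (1.0, 0.0, 0.0))]
points = [w for w, _ in edges]
normals = [cross(mat_vec(R, tuple(wi - ti for wi, ti in zip(w, t_true))),
                 mat_vec(R, v)) for w, v in edges]
t_est = solve_translation(R, normals, points)
print(t_est)
```

Edges that all pass through a single point constrain t only up to scale, which is why the sample uses edges through different model points.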
The objective function for obtaining initial estimates of the camera
rotation is formulated according to Equation (2), as shown in
Equation (6),
$$O_1 = \sum_{i=1}^{n} \left( \mathbf{m}_i^T R \mathbf{v}_i \right)^2 \qquad (6)$$
where i is the index of the model edge, n is the total number
of model edges used, $\mathbf{m}_i$ and $\mathbf{v}_i$ are the corresponding
normal vector and direction of the model edge, and R is the 3x3 camera
rotation matrix. The objective function sums the extents to which the
rotation R violates the constraints arising from Equation (2);
minimizing it yields initial values for
the camera rotation.
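As a small illustration of the objective in Equation (6), the sketch below evaluates the sum of squared residuals m_i^T R v_i for a candidate rotation. The function name `objective_o1` and the sample correspondences are hypothetical; how the minimum is searched for is not shown here.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def mat_vec(M, v):
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))


def objective_o1(R, normals, directions):
    """Sum of squared residuals m_i^T R v_i over all edge correspondences."""
    return sum(dot(m, mat_vec(R, v)) ** 2
               for m, v in zip(normals, directions))


# Hypothetical correspondences that are exactly satisfied by R = identity.
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
normals = [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
quarter_turn_z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

o1_identity = objective_o1(identity, normals, directions)
o1_rotated = objective_o1(quarter_turn_z, normals, directions)
print(o1_identity, o1_rotated)
```

A rotation that satisfies every constraint gives a value of zero, while a rotation that violates them gives a positive value.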
2.2.2 Metric Reconstruction
Metric reconstruction also requires more than three line
correspondences between the pre-defined model edges and the
image edges. It consists of three steps: an initial estimate of
the building orientation, refinement of the building orientation,
and determination of the building dimensions and location.
Initial Estimate of Building Orientation
The three directions of model edges, $\mathbf{v}_1$ (e.g. model edge 67 of
building 2 in Figure 1), $\mathbf{v}_2$ (e.g. model edge 78), and $\mathbf{v}_3$ (e.g.
model edge 27), can be represented as shown in Equation (8).
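$$\mathbf{v}_1 = \begin{pmatrix} \sin\alpha \\ \cos\alpha \\ 0 \end{pmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \mathbf{v}_1, \quad
\mathbf{v}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \qquad (8)$$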
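The three edge directions can be generated from a single orientation angle, as in the sketch below. It assumes that the first direction is (sin α, cos α, 0), that the second is obtained by a 90-degree rotation about the Z axis, and that the third is the vertical axis; the function name `edge_directions` and the sample angle are hypothetical.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def mat_vec(M, v):
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))


def edge_directions(alpha):
    """Three mutually orthogonal edge directions of a rectilinear building.

    alpha: building orientation in radians (assumed parameterization).
    """
    v1 = (math.sin(alpha), math.cos(alpha), 0.0)
    quarter_turn_z = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
    v2 = mat_vec(quarter_turn_z, v1)
    v3 = (0.0, 0.0, 1.0)
    return v1, v2, v3


v1, v2, v3 = edge_directions(math.radians(30.0))
print(v1, v2, v3)
```

Because the directions depend on α alone, the building orientation can be estimated from direction constraints before the dimensions and location are solved.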
Refinement of Camera Rotation
Once the initial camera rotation is obtained, a non-linear technique
based on the Gauss-Newton method is applied to the minimization