8 AUTOCALIBRATION OF FOCAL LENGTH
To run the bundle adjustment it is necessary to know the
camera calibration. However, it was not necessary to know
the camera calibration to compute these correspondences.
As a side effect of computing these correspondences we
have a set of fundamental matrices between input images.
It is possible to autocalibrate the camera parameters from
these fundamental matrices.
The goal of autocalibration is to find the intrinsic camera
parameters directly from an image sequence without re-
sorting to a formal calibration process.
The standard linear camera calibration matrix $K$ has the
following entries (Hartley, 1997b):
$$K = \begin{pmatrix} f k_u & 0 & u_0 \\ 0 & f k_v & v_0 \\ 0 & 0 & 1 \end{pmatrix} \tag{1}$$
This assumes that the camera skew angle is $\pi/2$, that is, the
image axes are perpendicular. Here $f$ is the focal length in
millimeters, and $k_u$, $k_v$ are the number of pixels per millimeter
along each image axis. The terms $f k_u$, $f k_v$ can be written as
$\alpha_u$, $\alpha_v$, the focal length in pixels on each image axis.
The ratio $\alpha_u / \alpha_v$ is the aspect ratio. It is often the
case that all the camera parameters are known except the focal length
$f$. The reason is that many digital cameras have a zoom lens, and
thus can change their focal length. The other camera parameters are
specified by the camera manufacturer.
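As a minimal sketch of Equation (1), the calibration matrix can be assembled from its entries as follows. The function name `calibration_matrix` and all numeric values are hypothetical, chosen only for illustration, and the third column is assumed to hold the principal point coordinates in pixels.

```python
import numpy as np

def calibration_matrix(f_mm, k_u, k_v, u0, v0):
    """Assemble the 3x3 calibration matrix of Eq. (1), assuming perpendicular image axes."""
    alpha_u = f_mm * k_u  # focal length in pixels along the horizontal image axis
    alpha_v = f_mm * k_v  # focal length in pixels along the vertical image axis
    return np.array([[alpha_u, 0.0, u0],
                     [0.0, alpha_v, v0],
                     [0.0, 0.0, 1.0]])

# Hypothetical camera: 8 mm lens, 100 pixels per mm on both axes,
# third-column entries (assumed principal point) at (320, 240).
K = calibration_matrix(f_mm=8.0, k_u=100.0, k_v=100.0, u0=320.0, v0=240.0)
aspect_ratio = K[0, 0] / K[1, 1]
```

With equal pixel densities on both axes, as assumed here, the aspect ratio is one and only the focal length remains unknown for a zoom lens.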
Thus a reasonable goal of the autocalibration process is simply
to find the focal length. This can be done reliably from the
fundamental matrices that have been computed as part of
the procedure to find the correspondences between image
pairs (Roth, 2002).
8.1 Autocalibration by Equal Singular Values
If we know the camera calibration matrix $K$, then the essential
matrix $E$ is related to the fundamental matrix $F$ by
$E = K^T F K$. The matrix $E$ is the calibrated version of
$F$; from it we can find the camera positions in Euclidean
space. Since $F$ is a rank two matrix, $E$ also has rank
two. However, $E$ has the extra condition that its two nonzero
singular values must be equal. This fact can be used
for autocalibration by finding the calibration matrix $K$ that
makes the two singular values of $E$ as close to equal as
possible (Mendonca and Cipolla, 1999). Given the two nonzero
singular values of $E$, $\sigma_1$ and $\sigma_2$ ($\sigma_1 > \sigma_2$),
in the ideal case $(\sigma_1 - \sigma_2)$ should be zero. Consider the
difference $(1 - \sigma_2/\sigma_1)$. If the singular values are equal
this quantity is zero. As they become more different, the
quantity approaches one. Given a fundamental matrix,
autocalibration proceeds by finding the calibration matrix $K$
which minimizes $(1 - \sigma_2/\sigma_1)$.
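The quantity described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names, the synthetic camera motion, and the numeric values are all assumptions. A fundamental matrix is synthesized from a known calibration matrix so that the cost can be checked against the true and a wrong focal length.

```python
import numpy as np

def singular_value_cost(F, K):
    """Return 1 - s2/s1 for E = K^T F K, where s1 >= s2 are its two largest singular values."""
    E = K.T @ F @ K
    s = np.linalg.svd(E, compute_uv=False)  # singular values in descending order
    return 1.0 - s[1] / s[0]

def skew(t):
    """Cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def make_K(f, u0=320.0, v0=240.0):
    """Calibration matrix with square pixels; f is in pixels (assumed values)."""
    return np.array([[f, 0.0, u0], [0.0, f, v0], [0.0, 0.0, 1.0]])

# Hypothetical relative motion: rotation of 0.3 rad about the y axis plus a translation.
c, s_ = np.cos(0.3), np.sin(0.3)
R = np.array([[c, 0.0, s_], [0.0, 1.0, 0.0], [-s_, 0.0, c]])
t = np.array([1.0, 0.2, 0.1])
E_true = skew(t) @ R

# Fundamental matrix implied by the true calibration: F = K^-T E K^-1.
K_true = make_K(800.0)
K_inv = np.linalg.inv(K_true)
F = K_inv.T @ E_true @ K_inv

cost_true = singular_value_cost(F, K_true)         # close to zero
cost_wrong = singular_value_cost(F, make_K(400.0)) # clearly larger
```

For a correct calibration the two singular values coincide and the cost vanishes up to rounding; a wrong focal length pulls them apart.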
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B5. Istanbul 2004
Assume we are given a sequence of $n$ images, along with
their fundamental matrices. Then $F_i$, the fundamental matrix
relating images $i$ and $i + 1$, has nonzero singular values
$\sigma_{i1}$ and $\sigma_{i2}$. To autocalibrate from these $n$ images
using the equal singular values method we must find the
$K$ which minimizes $\sum_{i=1}^{n-1} w_i (1 - \sigma_{i2}/\sigma_{i1})$. Here $w_i$ is
a weight factor, which defines the confidence in a given
fundamental matrix. The weight $w_i$ is set in proportion to
the number of matching 2D feature points that support the
fundamental matrix $F_i$. The larger this number, the more
confidence we have in that fundamental matrix. In the case
where only the focal length needs to be autocalibrated, the
minimization of this quantity is a simple one-dimensional
optimization process.
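The one-dimensional search over focal length can be sketched as follows. The text does not say which optimizer is used, so this sketch assumes a plain grid search. It also assumes square pixels, a known principal point, and singular values taken from the calibrated matrix $K^T F_i K$. All names, camera motions and match counts are hypothetical.

```python
import numpy as np

def make_K(f, u0, v0):
    """Calibration matrix with square pixels; only f (in pixels) is unknown."""
    return np.array([[f, 0.0, u0], [0.0, f, v0], [0.0, 0.0, 1.0]])

def weighted_cost(f, Fs, weights, u0, v0):
    """Weighted sum of 1 - s2/s1 over all fundamental matrices, for a candidate f."""
    K = make_K(f, u0, v0)
    total = 0.0
    for F, w in zip(Fs, weights):
        s = np.linalg.svd(K.T @ F @ K, compute_uv=False)  # descending order
        total += w * (1.0 - s[1] / s[0])
    return total

def autocalibrate_focal(Fs, weights, u0, v0, f_candidates):
    """Grid search: return the candidate focal length with the lowest weighted cost."""
    costs = [weighted_cost(f, Fs, weights, u0, v0) for f in f_candidates]
    return float(f_candidates[int(np.argmin(costs))])

def skew(t):
    return np.array([[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Synthetic test data: three image pairs seen by a camera with f = 800 pixels.
u0, v0, f_true = 320.0, 240.0, 800.0
K_inv = np.linalg.inv(make_K(f_true, u0, v0))
motions = [(0.3, [1.0, 0.2, 0.1]), (-0.2, [0.8, -0.3, 0.2]), (0.5, [1.0, 0.1, -0.4])]
Fs = [K_inv.T @ (skew(np.array(t)) @ rot_y(a)) @ K_inv for a, t in motions]

# Weights proportional to the (hypothetical) number of supporting matches.
matches = np.array([120.0, 80.0, 45.0])
weights = matches / matches.sum()

f_est = autocalibrate_focal(Fs, weights, u0, v0, np.linspace(400.0, 1600.0, 1201))
```

A grid search is used here because it needs no assumption about the shape of the cost curve; a bracketing method such as golden-section search could replace it when the cost is known to be unimodal over the search interval.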
9 EXPERIMENTS
There are as yet no standardized data sets for testing wide
baseline matching algorithms. However, there is one data
set that has been used in a number of wide baseline matching
papers (Schaffalitzky and Zisserman, 2002, Ferrari et
al., 2003, Martinec and Pajdla, 2002), which is the Valbonne
church sequence as shown in Figure 2.
Figure 2: Twelve pictures of the Valbonne Sequence
This sequence has a number of views of the church at Valbonne,
in France. These views are typical of what would be
used in a photogrammetric model building process. This
sequence was processed by our software. There were
approximately 350 feature points over these twelve images,
and each feature point appears in at least five images.
There are twelve images, so one would expect about $12^2$
fundamental matrices, and about $12^3$ trilinear tensors to be
calculated. However, only about fifty percent of the maximum
number of fundamental matrices is calculated, and
likewise, only thirty percent of the maximum number of
trilinear tensors. A rendering of the camera positions and
feature points is shown in Figure 3. The RMS residual error
of each feature point when it is reprojected into the 2D
image is at most 0.8 pixels, and at least 0.1 pixels. Thus