TRIFOCAL
)ENCES
ck adjustment .
iization of algebraic
object points. It is
ws the object points
that the orientation
adjustment.
requirement to pro-
ior (and sometimes
vhich are inevitable
ment.
ioned above, a rea-
o steps: a) perform
ernative representa-
arameters obtained
subsequent bundle
result by consider-
| and modelling the
that the
il. The main reason
s essential,
lanar object. points.
not only happen for
points, but already
ired deviation from
by the other draw-
nal constraints, the
unknown non-linear
te the effects of the
early obtained ori-
rly coplanar object
om a common plane
if the internal con-
id if algebraic error
or this investigation
ich is made up of 27
escribes the relative
es. Compared with
o (the fundamental
,uong and Faugeras
] tensor with 81 el-
the trifocal tensor
matrix, due to the
he quadfocal tensor
This article is structured in the following way. Section 2
gives a short overview of the properties of the trifocal
tensor. In section 3 the results of synthetic experiments
are presented, followed by an example using real data in
section 4. The findings are summarized in section 5.
2 THE TRIFOCAL TENSOR
The trifocal tensor is made up of 27 homogeneous elements
and can thus be visualized as a 3 x 3 x 3 cube of numbers. Slices
in every direction of this cube return 3 x 3 matrices with
special properties; e.g. [Ressl 2003]. These slices allow the
determination of the six epipoles in the three images and
the determination of the three respective fundamental ma-
trices. In case of unknown interior orientation, the latter
can be further used to derive a common interior orientation
for the three images using the so-called Kruppa equations;
e.g. [Hartley and Zisserman 2001]. Finally the fundamen-
tal matrices and the interior orientation can be used to
obtain the projection centers and rotation matrices of the
relative orientation of the three images; [Ressl 2003].
The trifocal tensor can be computed from corresponding
points and/or lines across the three images. Each triple
of points gives 4 independent homogeneous equations, so-called
trilinearities, and each triple of lines gives 2 independent
equations, all being linear in the tensor's elements.
Consequently, at least 7 point- or 13 line-correspondences,
or a proper combination, are needed for the direct linear
solution of the trifocal tensor minimizing algebraic error.
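The direct linear solution described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used for the experiments. It assumes the point relation in the form given by Hartley and Zisserman, [x']_x (sum_i x^i T_i) [x'']_x = 0, with the tensor stored as an array T[i, j, k]; the function names and the helper `tensor_from_cameras` are hypothetical and only serve the example.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def point_trilinearities(x1, x2, x3):
    """9 x 27 coefficient block of one point triple (only 4 rows are independent)."""
    # Entry (r, s) of skew(x2) @ (sum_i x1[i] * T[i]) @ skew(x3), written linearly in T[i, j, k].
    coeff = np.einsum('i,rj,ks->rsijk', x1, skew(x2), skew(x3))
    return coeff.reshape(9, 27)

def tensor_dlt(X1, X2, X3):
    """Unconstrained tensor (3 x 3 x 3, unit norm) minimizing algebraic error."""
    A = np.vstack([point_trilinearities(a, b, c) for a, b, c in zip(X1, X2, X3)])
    _, _, Vt = np.linalg.svd(A)
    # Right singular vector belonging to the smallest singular value.
    return Vt[-1].reshape(3, 3, 3)

def tensor_from_cameras(P2, P3):
    """Hypothetical test helper: tensor of the cameras P1 = [I | 0], P2, P3."""
    return np.array([np.outer(P2[:, i], P3[:, 3]) - np.outer(P2[:, 3], P3[:, i])
                     for i in range(3)])

# Noise-free usage example with the minimum of 7 point correspondences.
rng = np.random.default_rng(0)
P2, P3 = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
X = rng.normal(size=(7, 4))                      # homogeneous object points
T = tensor_dlt(X[:, :3], X @ P2.T, X @ P3.T)     # images of X in P1, P2, P3
```

Stacking all nine entries of the matrix relation is redundant, since only four are independent, but it keeps the sketch short; the singular value decomposition handles the redundancy.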
The relative orientation of three uncalibrated images has
only 18 DOF. Consequently, 9 constraints must be satisfied
by the 27 tensor elements; one of these constraints is the
fixing of the tensor's homogeneous scale. Various sets of
constraints were proposed in the past; see [Ressl 2003] for
an overview.
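The counting behind these numbers can be written out as follows. This is the standard argument (each uncalibrated camera has 11 parameters, and the projective frame of object space absorbs 15), added here as a reading aid rather than taken from the text above.

```latex
\begin{align*}
\text{DOF of the relative orientation} &= 3 \cdot 11 - 15 = 18, \\
\text{number of constraints} &= 27 - 18 = 9
  = \underbrace{1}_{\text{homogeneous scale}} + \underbrace{8}_{\text{internal constraints}} .
\end{align*}
```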
For computing a valid trifocal tensor, which satisfies the
constraints, preferably by minimizing reprojection error in-
stead of algebraic error, we have to use the so-called Gauss-
Helmert model, [Koch 1999], also called general case of
least squares adjustment. This non-linear iterative method
requires approximate values for the tensor elements, which
could be obtained from the direct linear solution.
Note: In projective geometry every entity is represented as
a homogeneous vector, e.g. a 2D point x as x = (x, y, 1)^T.
Now suppose the point x is measured in a digital image
with 2000 x 3000 pixels and is located far away from the
origin of the coordinate frame. In this case the coordinates
x and y will be in the order of 1000, whereas the
homogeneous extension still is 1. This difference in order
of magnitude between the Euclidean and the homogeneous part will cause
enormous numerical problems if such projective points are
used to compute other quantities; e.g. the trifocal tensor
from several point correspondences. These problems can
be avoided easily if the projective entities are shifted and
scaled prior to the computations. This procedure is due
to Hartley, who used this for computing the fundamental
matrix; [Hartley 1995]. He proposes to translate the set of
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B3. Istanbul 2004
image points in such a way that their centroid x_c is moved to
the origin and then to scale the translated points isotropically
by m = √2/s, where s is the average distance of the
points from x_c.
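The shift and scale described in this note can be sketched as below. It is a minimal illustration under the stated definition (centroid to the origin, isotropic scale m = √2/s); the function name `hartley_normalization` and the returned 3 x 3 matrix H are choices made for this example.

```python
import numpy as np

def hartley_normalization(pts):
    """Shift the centroid to the origin and scale so the mean distance is sqrt(2).

    pts: (n, 2) array of image points.
    Returns the normalized homogeneous points (n, 3) and the 3 x 3 transformation H.
    """
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    s = np.linalg.norm(pts - centroid, axis=1).mean()   # average distance from the centroid
    m = np.sqrt(2.0) / s                                # isotropic scale factor
    H = np.array([[m, 0.0, -m * centroid[0]],
                  [0.0, m, -m * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])   # homogeneous extension 1
    return (H @ pts_h.T).T, H
```

After this step the Euclidean and homogeneous parts of each point are of comparable size. A tensor estimated from such points refers to the normalized image frames, so it has to be transformed back with the three matrices H before it is interpreted in pixel coordinates.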
3 EXPERIMENTAL RESULTS FROM SYNTHETIC DATA
In [Ressl 2003] the trifocal tensor is computed by different
methods for different image configurations and for a vary-
ing number of point correspondences; all with regard to
nearly coplanar object points. These examples are based
on synthetic data and shall demonstrate
• the differences between minimizing algebraic and
reprojection error,
• the effects of considering or neglecting the internal
constraints of the tensor, and
• the impact of critical configurations; i.e. how close
must points lie to the same plane so that the computation
fails? To answer this question the object points
are placed inside a cuboid, which is then incrementally
compressed in one direction till the computation
fails; the compression for which the computation is
still possible will be referred to as minimum thickness
of the cuboid.
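The compression experiment in the last item can be sketched as a simple loop. The sketch below makes several assumptions: the cuboid is compressed along the z axis by a constant factor per step, and the success test is passed in as a callable `computation_succeeds`, which stands in for the actual tensor computation. The planarity measure used in the example is only a placeholder criterion, not the one used in the experiments.

```python
import numpy as np

def planarity(points):
    """Ratio of smallest to largest singular value of the centred points (0 = coplanar)."""
    centred = points - points.mean(axis=0)
    sv = np.linalg.svd(centred, compute_uv=False)
    return sv[-1] / sv[0]

def minimum_thickness(points, computation_succeeds, factor=0.5, max_steps=60):
    """Compress the cuboid along z until the computation fails.

    Returns the last compression ratio for which `computation_succeeds`
    still returned True, or None if it failed for the uncompressed cuboid.
    """
    ratio, last_ok = 1.0, None
    for _ in range(max_steps):
        squeezed = points * np.array([1.0, 1.0, ratio])
        if not computation_succeeds(squeezed):
            break
        last_ok = ratio
        ratio *= factor
    return last_ok

# Usage example with a placeholder success criterion (hypothetical threshold).
rng = np.random.default_rng(0)
cuboid_points = rng.uniform(-0.5, 0.5, size=(512, 3))
threshold = 1e-3
min_ratio = minimum_thickness(cuboid_points, lambda p: planarity(p) > threshold)
```

In a real experiment the callable would run the tensor estimation on the projected, noisy image points and report whether it converged to a usable result.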
Five different image configurations, 'Tetra', 'Air1' and
'Air2' (with strong image geometry) and 'Street1' and
'Street2' (with weak image geometry), see figure 1, are
summarized below. For each image configuration the trifocal
tensor was computed in five different ways:
'UCA': The direct linear solution, or in other words the
unconstrained solution (with 26 DOF) minimizing
algebraic error.
'UCR': The unconstrained solution (with 26 DOF) minimizing
reprojection error, realized in the Gauss-Helmert
model. This iterative estimation is initialized by the
'UCA' solution.
'CR': The constrained solution (with 18 DOF) minimizing
reprojection error. This iterative estimation is
initialized by the 'UCA' solution.
"CR: This iterative estimation is identical to 'CR? but
it is initialized by the known true trifocal tensor.
'CA': This is a projection method, which returns the
valid trifocal tensor TFT (with 18 DOF), represented
by the vector q, that lies closest to the 'UCA' solution
t; i.e. |TFT(t) − TFT(q)| → min.
For each image configuration the object cuboid is filled
with n_o = 512 points, which are then projected into the
images, and Gaussian noise with 1 pixel standard deviation
is added. From these image points a small sample of k
correspondences is selected, starting from k = 7 (the
minimum number) up to k = 15. For these samples the tensor