The computed epipolar geometry is then used to refine the
matching process, which is now performed as guided matching
along the epipolar lines. A maximum distance from the epipolar
line is set as the threshold for accepting a point as a potential
match or rejecting it as an outlier. Then the filtering process and the relative orientation
are performed again to get rid of other possible blunders.
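The distance test used in guided matching can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function names, the 1.5 pixel threshold and the example fundamental matrix (a pure horizontal translation, so that epipolar lines are image rows) are all assumptions chosen to keep the numbers easy to check.

```python
import numpy as np

def epipolar_distance(F, p1, p2):
    """Distance of p2 (image 2) from the epipolar line F @ p1 of point p1 (image 1)."""
    line = F @ np.append(p1, 1.0)            # line (a, b, c): a*x + b*y + c = 0
    return abs(line @ np.append(p2, 1.0)) / np.hypot(line[0], line[1])

def guided_candidates(F, p1, points2, max_dist=1.5):
    """Indices of the points of image 2 lying within max_dist of the epipolar line."""
    return [i for i, q in enumerate(points2)
            if epipolar_distance(F, p1, q) <= max_dist]

# Hypothetical data: F = [t]_x with t = (1, 0, 0), i.e. a horizontal baseline.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
p1 = np.array([120.0, 80.0])
points2 = np.array([[100.0, 80.4],    # 0.4 px from the line y = 80
                    [130.0, 95.0],    # 15 px away: rejected
                    [60.0, 79.0]])    # 1 px away
print(guided_candidates(F, p1, points2))
```

Only the candidates returned by such a test would then be compared by their descriptors or by correlation.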
However, while the computed epipolar geometry can be correct,
not every correspondence that supports the relative orientation
is necessarily valid. This is because we are considering just the
epipolar geometry between a pair of images, and a pair of
correspondences can support the epipolar geometry by chance
(e.g. a repeated pattern aligned with the epipolar line). These
kinds of ambiguities and blunders can be reduced by considering
the epipolar geometry between three consecutive images. A
linear representation of the relative orientation of three views
is given by the trifocal tensor T [Shashua, 1994]; it
consists of a set of three 3x3 matrices and is computed only
from image correspondences, without knowledge of the motion
or calibration of the cameras. For every triplet of views (Figure
3), if $p_1$, $p_2$ and $p_3$ are corresponding points in the images, then
for every line $l_2$ through $p_2$ in image 2 and for every line $l_3$
through $p_3$ in image 3, the fundamental trifocal constraint
states:
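$$ l_2^T \, [T p_1] \, l_3 = 0 \qquad [3] $$
where $[T p_1]$ is a 3x3 matrix whose (i,j) entry is
$$ [T p_1]_{ij} = (T_1)_{ij}\, x_1 + (T_2)_{ij}\, y_1 + (T_3)_{ij} \qquad [4] $$
with $T_1$, $T_2$, $T_3$ the three 3x3 matrices of the tensor.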
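If we consider only the corresponding points, each triplet $p_1$, $p_2$
and $p_3$ must satisfy the matrix equation:
$$ [p_2]_\times \, [T p_1] \, [p_3]_\times = 0 \qquad [5] $$
with $[p]_\times$ the skew-symmetric matrix of a homogeneous
vector, built as
$$ [a]_\times = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix} \qquad [6] $$
where $a = (a_1, a_2, a_3)$.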
If a triplet of points $p_1$, $p_2$ and $p_3$ satisfies equation (5), it means
that the corresponding points support the tensor $T_{123}$ (Figure 4).
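The check expressed by equation (5) can be illustrated numerically. The sketch below is an assumption-laden example rather than the procedure used here: the tensor is built from three known synthetic cameras (first camera fixed to [I | 0], normalised image coordinates) only so that a true triplet is available, whereas in practice T is estimated from the correspondences themselves. All function names and camera values are hypothetical.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def rot_y(angle):
    """Rotation about the y axis (only used to build the synthetic cameras)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def tensor_from_cameras(P2, P3):
    """Trifocal tensor for cameras [I | 0], P2, P3; T[k] is the k-th 3x3 matrix."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, k], b4) - np.outer(a4, B[:, k])
                     for k in range(3)])

def trifocal_residual(T, p1, p2, p3):
    """Largest absolute entry of [p2]_x [T p1] [p3]_x; zero for a true triplet."""
    Tp1 = np.einsum('k,kij->ij', p1, T)      # x1*T[0] + y1*T[1] + 1*T[2]
    return np.abs(skew(p2) @ Tp1 @ skew(p3)).max()

def project(P, X):
    """Project a homogeneous 3D point and scale the result to (x, y, 1)."""
    x = P @ X
    return x / x[2]

P1 = np.eye(3, 4)
P2 = np.column_stack([rot_y(0.10), [-1.0, 0.2, 0.1]])
P3 = np.column_stack([rot_y(-0.15), [-0.5, -1.0, 0.2]])
T = tensor_from_cameras(P2, P3)

X = np.array([0.3, -0.2, 5.0, 1.0])          # synthetic object point
p1, p2, p3 = (project(P, X) for P in (P1, P2, P3))

residual_good = trifocal_residual(T, p1, p2, p3)
residual_bad = trifocal_residual(T, p1, p2, p3 + np.array([0.1, 0.0, 0.0]))
print(residual_good, residual_bad)
```

With noisy measurements the residual of a correct triplet is small rather than zero, so a threshold on it (or on a reprojection distance) would be needed to decide whether a triplet supports the tensor.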
Figure 3: Three-view geometry: correspondences $p_i$ and $l_i$
corresponding to point P and line L
Relation (5) can be used to verify whether image points (or
lines) are correct corresponding features between different
views. Moreover, with constraint (5), it is possible to transfer
points, i.e. to compute the image coordinates of a point in the
third view, given the corresponding image positions in the first
two images. The exterior orientation of the cameras is not
required (unlike with the collinearity equations) and only image
measurements are needed. This transfer is very useful when
few correspondences are found in one view; calling $p_1$ and
$p_2$ the point correspondences in the first two images, the image
coordinates of the corresponding point $p_3$ in the third view are
given (up to a non-zero scale factor) by:
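$$ \mu\, p_3 = [T p_1]_{1\bullet}^T - x_2\, [T p_1]_{3\bullet}^T \quad \text{or} $$
$$ \lambda\, p_3 = [T p_1]_{2\bullet}^T - y_2\, [T p_1]_{3\bullet}^T \qquad [7] $$
where:
$[T p_1]_{i\bullet}$ denotes the i-th row of $[T p_1]$;
$p_i = [x_i, y_i, 1]^T$;
$\mu$, $\lambda$ are non-zero scale factors.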
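A small numerical sketch of the point transfer of equation (7) is given below. It assumes that each form of (7) combines two rows of [T p1], weighted by the x or y coordinate of the point in the second image. The tensor is built from known synthetic cameras (first camera [I | 0]) purely to have a ground truth to compare against; function names and camera values are hypothetical.

```python
import numpy as np

def rot_y(angle):
    """Rotation about the y axis (only used to build the synthetic cameras)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def tensor_from_cameras(P2, P3):
    """Trifocal tensor for cameras [I | 0], P2, P3; T[k] is the k-th 3x3 matrix."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, k], b4) - np.outer(a4, B[:, k])
                     for k in range(3)])

def transfer_point(T, p1, p2):
    """Two estimates of p3 (scaled to third coordinate 1) from p1 and p2."""
    Tp1 = np.einsum('k,kij->ij', p1, T)       # the 3x3 matrix [T p1]
    x2, y2 = p2[0] / p2[2], p2[1] / p2[2]
    from_x = Tp1[0] - x2 * Tp1[2]             # row 1 minus x2 times row 3
    from_y = Tp1[1] - y2 * Tp1[2]             # row 2 minus y2 times row 3
    return from_x / from_x[2], from_y / from_y[2]

def project(P, X):
    """Project a homogeneous 3D point and scale the result to (x, y, 1)."""
    x = P @ X
    return x / x[2]

P1 = np.eye(3, 4)
P2 = np.column_stack([rot_y(0.10), [-1.0, 0.2, 0.1]])
P3 = np.column_stack([rot_y(-0.15), [-0.5, -1.0, 0.2]])
T = tensor_from_cameras(P2, P3)

X = np.array([0.3, -0.2, 5.0, 1.0])           # synthetic object point
p1, p2, p3_true = (project(P, X) for P in (P1, P2, P3))
p3_from_x, p3_from_y = transfer_point(T, p1, p2)
print(p3_true, p3_from_x, p3_from_y)
```

Each form degenerates when the line it uses in the second image passes through the epipole, so with real data one would pick the better conditioned of the two or combine them.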
In the case of noise-free image measurements, both equations are
equivalent. The same transfer can be done with lines. The point
transfer can also be solved using the fundamental matrix, but
the trifocal constraint can avoid ambiguities and remove
blunders. Moreover, from the tensor it is possible to derive the
fundamental matrices between the views; e.g.,
given 3 images, $M_{13}$ between images 1 and 3 is given by:
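$$ M_{13} = [e_3]_\times \, [T_1^T,\; T_2^T,\; T_3^T] \, e_2 \qquad [8] $$
where:
$T_1$, $T_2$, $T_3$ are the three 3x3 matrices of the tensor;
$e_i$ is the epipole of image i;
$[e_i]_\times$ is the skew-symmetric matrix (6) formed with $e_i$.
Therefore, the transfer of $p_3$ is expressed as:
$$ p_3 = (M_{13}\, p_1) \times (M_{23}\, p_2) \qquad [9] $$
i.e. the intersection of two epipolar lines in the third view.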
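The epipolar transfer of equation (9) can be checked with the sketch below. It rests on several assumptions: equation (8) is read as M13 = [e3]x [T1^T, T2^T, T3^T] e2, with e2 and e3 the images of the first camera centre; the first camera is [I | 0]; and M23 is computed directly from the known synthetic cameras, since it is used here only to verify the intersection. All names are hypothetical.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def rot_y(angle):
    """Rotation about the y axis (only used to build the synthetic cameras)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def tensor_from_cameras(P2, P3):
    """Trifocal tensor for cameras [I | 0], P2, P3; T[k] is the k-th 3x3 matrix."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, k], b4) - np.outer(a4, B[:, k])
                     for k in range(3)])

def m13_from_tensor(T, e2, e3):
    """[e3]_x times the matrix whose k-th column is T[k]^T e2 (assumed form of (8))."""
    return skew(e3) @ np.column_stack([T[k].T @ e2 for k in range(3)])

def fundamental_from_cameras(Pa, Pb):
    """Matrix mapping points of image a to their epipolar lines in image b."""
    _, _, Vt = np.linalg.svd(Pa)
    centre_a = Vt[-1]                          # null vector of Pa = camera centre
    return skew(Pb @ centre_a) @ Pb @ np.linalg.pinv(Pa)

def project(P, X):
    """Project a homogeneous 3D point and scale the result to (x, y, 1)."""
    x = P @ X
    return x / x[2]

P1 = np.eye(3, 4)
P2 = np.column_stack([rot_y(0.10), [-1.0, 0.2, 0.1]])
P3 = np.column_stack([rot_y(-0.15), [-0.5, -1.0, 0.2]])
T = tensor_from_cameras(P2, P3)

M13 = m13_from_tensor(T, P2[:, 3], P3[:, 3])   # epipoles = last camera columns
M23 = fundamental_from_cameras(P2, P3)

X = np.array([0.3, -0.2, 5.0, 1.0])            # synthetic object point
p1, p2, p3_true = (project(P, X) for P in (P1, P2, P3))
p3 = np.cross(M13 @ p1, M23 @ p2)              # intersection of two epipolar lines
p3 = p3 / p3[2]
print(p3_true, p3)
```

The intersection is undefined when the two epipolar lines coincide, which happens for object points on the plane through the three projection centres; this is one of the ambiguities that the tensor-based transfer avoids.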
The 27 unknowns of the tensor T, defined up to a scale factor,
can be computed from at least 7 correspondences: using
equation (5), each correspondence gives 9 equations, 4 of them
linearly independent. In our process, for each triplet of images,
the tensor T is computed with a RANSAC algorithm [Fischler
and Bolles, 1981] using the correspondences that support two
adjacent pairs of images and their epipolar geometry.
RANSAC is a robust estimator, which fits a model (the tensor T) to
a data set (triplets of corresponding points) starting from a minimal
subset of the data. As a result, for each triplet of images, a set of
corresponding points, supporting a trilinear tensor, is available.
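The model-fitting step that a RANSAC loop would call on each sample can be sketched as a linear least-squares problem: every correspondence contributes the 9 equations of (5), and the 27 entries of T are taken from the null vector of the stacked system. The sketch below is an illustration under stated assumptions (noise-free synthetic points, normalised coordinates, first camera [I | 0], hypothetical function names); it omits the random sampling, the inlier counting and any enforcement of the internal constraints of the tensor.

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x, so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def rot_x(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def tensor_from_cameras(P2, P3):
    """Ground-truth tensor for cameras [I | 0], P2, P3 (used only for checking)."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, k], b4) - np.outer(a4, B[:, k])
                     for k in range(3)])

def constraint_rows(p1, p2, p3):
    """9 x 27 coefficients of [p2]_x [T p1] [p3]_x = 0, unknowns ordered as T[k, i, j]."""
    return np.einsum('k,ri,js->rskij', p1, skew(p2), skew(p3)).reshape(9, 27)

def estimate_tensor(pts1, pts2, pts3):
    """Linear estimate of T (up to scale) from at least 7 point triplets."""
    A = np.vstack([constraint_rows(a, b, c) for a, b, c in zip(pts1, pts2, pts3)])
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3, 3)             # null vector of the stacked system

def project_all(P, X):
    x = (P @ X.T).T
    return x / x[:, 2:3]

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-2, 2, 12), rng.uniform(-2, 2, 12),
                     rng.uniform(4, 8, 12), np.ones(12)])
P1 = np.eye(3, 4)
P2 = np.column_stack([rot_y(0.10), [-1.0, 0.2, 0.1]])
P3 = np.column_stack([rot_x(-0.15), [-0.5, -1.0, 0.2]])
pts1, pts2, pts3 = (project_all(P, X) for P in (P1, P2, P3))

T_est = estimate_tensor(pts1, pts2, pts3)
T_true = tensor_from_cameras(P2, P3)
alignment = abs(np.vdot(T_est, T_true)) / (np.linalg.norm(T_est) * np.linalg.norm(T_true))
rank_one_triplet = np.linalg.matrix_rank(constraint_rows(pts1[0], pts2[0], pts3[0]))
print(alignment, rank_one_triplet)
```

The rank of the block contributed by a single triplet is 4, which matches the count of linearly independent equations stated above; an alignment close to 1 means that the estimate equals the true tensor up to scale and sign.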
After the computation of a tensor T for every consecutive
triplet of images, we consider all the overlapping tensors ($T_{123}$,
$T_{234}$, $T_{345}$, ...) and we look for those correspondences which are
present in consecutive tensors. That is, given two adjacent
tensors $T_{abc}$ and $T_{bcd}$ with supporting points $(x_a, y_a,\; x_b, y_b,\; x_c, y_c)$
and $(x'_b, y'_b,\; x'_c, y'_c,\; x'_d, y'_d)$, if $(x_b, y_b,\; x_c, y_c)$ in the first tensor is
equal to $(x'_b, y'_b,\; x'_c, y'_c)$ in the successive tensor, this means that
the point in images a, b, c and d is the same and therefore this
point must have the same identifier. Each point is tracked as
long as possible in the sequence. The obtained correspondences
are used as tie points for the successive bundle adjustment.
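The linking of correspondences across overlapping tensors can be sketched as follows. This is a minimal illustration under assumptions: the supporting points of each tensor are given as triplets of (x, y) tuples, images are indexed from 0, and two observations are considered the same only if their coordinates are exactly equal, as in the description above. The function name and the data layout are hypothetical.

```python
def link_tracks(tensor_supports):
    """Assign one identifier to every point tracked through consecutive tensors.

    tensor_supports[i] lists the triplets (obs_a, obs_b, obs_c) supporting the
    tensor of images i, i+1, i+2. Returns {track_id: {image_index: (x, y)}}.
    """
    tracks = {}
    open_tracks = {}          # (obs in image i+1, obs in image i+2) -> track id
    next_id = 0
    for first_image, triplets in enumerate(tensor_supports):
        still_open = {}
        for obs_a, obs_b, obs_c in triplets:
            track_id = open_tracks.get((obs_a, obs_b))
            if track_id is None:                      # point not seen before
                track_id = next_id
                next_id += 1
                tracks[track_id] = {first_image: obs_a, first_image + 1: obs_b}
            tracks[track_id][first_image + 2] = obs_c  # extend by one image
            still_open[(obs_b, obs_c)] = track_id
        open_tracks = still_open
    return tracks

# Hypothetical supporting points of two adjacent tensors (images 0-1-2 and 1-2-3).
t_first = [((10.0, 20.0), (12.0, 20.5), (14.0, 21.0)),
           ((50.0, 60.0), (51.0, 60.0), (52.0, 60.5))]
t_second = [((12.0, 20.5), (14.0, 21.0), (16.0, 21.5)),
            ((70.0, 30.0), (71.0, 30.0), (72.0, 30.5))]

tracks = link_tracks([t_first, t_second])
for track_id, observations in tracks.items():
    print(track_id, sorted(observations))
```

In this example the first point is followed through four images, while the other two are each seen in three images only.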
CAUSE: Character Animation and Understanding from SEquence of images
Figure 4: The relative geometry between a triplet of images