segments we use the Mahalanobis distance to take account
of uncertainty measurements. We will give more details
later in this article. To reduce the research area for neigh-
boring segments we bucketize the pair of images at ta.
The distance of a projected segment to its nearest neigh-
bors is one of the criteria we use to choose the best matches.
Another criterion is what we call the confidence factor,
which is defined as a function of the relative distance be-
tween the two nearest neighbors and represents the rate of
matching disambiguity.
Using these three criteria: distance to the cameras, dis-
tance of the reprojected segment to its nearest neighbor
in the next image, and the confidence factor of the corre-
sponding match, we find the best temporal matches. At
this point we use again the fact that we are tracking pairs
of stereo segments and verify if the candidates for temporal
matching in two cameras verify also the epipolar constraint.
D(t)
ra
iN
cameral camera2
Fig.4.
In this way, even when the estimated motion is un-
certain, we usually find a few pairs of correct temporal
matches, which are sufficient to compute a better estima-
tion of motion parameters.
To update the motion estimation we use a modified ver-
sion of the algorithm presented in [19] and [9]. The input of
this algorithm, other than reconstructed 3D lines, consists
of the 2D velocities of their images on the cameras. Here we
estimate these velocities using two consecutive images (see
section 4). Assuming that we have several (at least two)
non parallel 3D lines undergoing same three-dimensional
motion, this algorithm allows us to recover the full kine-
matic screw of the rigid object they are attached to. Since
we keep track of uncertainty at all levels, a weighted least-
squares minimization which can be implemented with the
Kalman filter is used. Therefore even if there exist a few
incorrect matches we can still obtain a reasonable motion
estimation.
Now we begin the next iteration. At this time we ap-
ply the estimated motion to each reconstructed segment,
reproject it on to the cameras and find their nearest neigh-
bor distances and confidence factors. Based on these values
we decide if we have two good temporal matches and if so,
we then verify the epipolar constraint between the two can-
didates. Finally if the epipolar constraint is also verified
we use this new matches to update the motion estimation.
Experimental results shows that even if we track a few
pair of segments (3 or 4 pairs) in the first iteration, in
the second and the third one we are able to track almost
all stereo pairs of segments . If in the first set of tracked
segments there are some incorrect matches they will be
corrected in the following iterations.
6.2 "Temporal matching of stereo pairs
To track a pair of stereo matching line segments in each
camera:
1) We reconstruct 3D lines. We apply the estimated mo-
tion to them and we get 3D lines predicted for the next
instant which are projected on the cameras.
2) We find the nearest segment to the projected segment
in the next image and choose it as a candidate for our
temporal matching process. To do that :
o Images are bucketized and we search for the neigh-
boring segments only in the buckets which intersect
the projected line segment. In this way many useless
computations are avoided and the algorithm is more
efficient.
o Mahalanobis distance is used to measure the distance
of two line segments. It is very important, because the
projected line segments are obtained from the stereo
reconstruction process and the application of an esti-
mated motion. Through the error analysis of these two
process we are able to determine a covariance matrix
associated to the projected line segments..
For two vectors $4 and S5 with the respective covari-
ance matrices A; and Aj, their Mahalanobis distance
is defined as:
DS, Soy se (8, SA ASS)
We can simply interpret D($;, S5) as the square of the
Euclidean distance between $4, and 55 weighted by the
sum of their covariances.
e We then define a confidence factor for each temporal
match. It is defined as a function of the relative distance
of the projected segment to its first and second nearest
neighbors. Suppose that S; and Ss are respectively the
first and second nearest neighbors of S;. The confidence
factor c is defined as:
,.—. D(5, $9) — D(S1, $3)
D($1, $3)
If the confidence factor of a temporal matching is small
(in our implementation less than 0.8), its means that
there is an ambiguity and we reject the candidate even
if its distance to its nearest neighbor is very small.
e Finally, for the pairs of the candidate segments obtained
through temporal matching, we check if they verify the
epipolar constraint. It means that we compute the epipo-
lar line of the midpoint of one segment (for example in
the first camera) and verify if it intersects the other seg-
ment (in the second camera). If the epipolar constraint
is also verified, we confirm these temporal matches.
6.3 Second Step: Stereo matching
In the initialization phase of the algorithm we use a trinocu-
lar hypothesis-verification stereovision algorithm [3] to find
the stereo matches.