1. INTRODUCTION

Three-dimensional models of human bodies are required in many applications, e.g. for animation, and are usually generated with specialized hardware. This work presents a procedure for the reconstruction of static human bodies from uncalibrated image sequences; the calibration and orientation of the images are recovered with a photogrammetric least squares bundle adjustment. The procedure consists of three parts: the calibration and orientation of the images (section 2 and section 3), the matching process on the body and the generation of the 3D point cloud and surface, and the modelling of the results for animation and visualization. The goal is the reconstruction of characters using only image information, acquired with a still camera or with a camcorder; the imaged human body has to stay still for the time the sequence is acquired. The procedure requires the acquisition of a full view (Fig. 1 shows three images out of 6) of the person. The image resolution and the camera constant are kept constant during the acquisition. If a video has to be used, interlacing effects must be taken into account.
2. CALIBRATION AND ORIENTATION

Camera calibration and image orientation are prerequisites for all applications that require precise 3D information from images; both problems have been investigated since the end of the 19th century. A basic distinction between the calibration and orientation algorithms is the camera model used for the orientation, i.e. the perspective or the projective one.
Camera models based on perspective collineation have high stability and require a minimum of three corresponding points per image and stable optics. On the other hand, projective approaches can deal with variable focal length, but they need more parameters and a minimum of six corresponding points, and they are quite unstable (the equations need normalization).
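For reference, the two model classes can be summarized with their standard textbook formulations; the parameterizations below (rotation matrix $R = (r_{ij})$ mapping object into image coordinates, DLT parameters $L_1 \dots L_{11}$) are generic forms and not necessarily the exact models used in this work.

% Perspective model: collinearity equations with interior orientation (x_0, y_0, c)
% and exterior orientation (projection centre X_0, Y_0, Z_0 and rotation R)
\[
x - x_0 = -c\,\frac{r_{11}(X - X_0) + r_{12}(Y - Y_0) + r_{13}(Z - Z_0)}{r_{31}(X - X_0) + r_{32}(Y - Y_0) + r_{33}(Z - Z_0)},
\qquad
y - y_0 = -c\,\frac{r_{21}(X - X_0) + r_{22}(Y - Y_0) + r_{23}(Z - Z_0)}{r_{31}(X - X_0) + r_{32}(Y - Y_0) + r_{33}(Z - Z_0)}
\]
% Projective model: direct linear transformation (DLT) with 11 parameters
\[
x = \frac{L_1 X + L_2 Y + L_3 Z + L_4}{L_9 X + L_{10} Y + L_{11} Z + 1},
\qquad
y = \frac{L_5 X + L_6 Y + L_7 Z + L_8}{L_9 X + L_{10} Y + L_{11} Z + 1}
\]

With known interior orientation, each point contributes two collinearity equations, so three points suffice for a spatial resection, while the eleven DLT parameters require at least six object-image correspondences.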
The calibration and orientation process is the core of the presented work and is based on a photogrammetric bundle adjustment (section 3.3); the required tie points (image correspondences) are found automatically with the following steps:
• extraction of interest points from each image;
• matching of potential feature pairs between adjacent images;
• removal of false matches using a local filtering;
• computation of the epipolar geometry to refine the matching process and remove remaining outliers;
• tracking of the correspondences through the whole image sequence.
These steps are described in the following sections. The process is completely automated; it is similar to [Fitzgibbon et al., 1998] and [Roth et al., 2000], but additional changes and extensions to these algorithms are presented and discussed.
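As an illustration, a compact sketch of such a pipeline is given below, written in Python with OpenCV. It only mirrors the sequence of steps and is not the implementation of this work: Lucas-Kanade tracking stands in for the cross-correlation/ALSM matching of section 3.1, the file names are placeholders, and all parameter values are illustrative.

# Illustrative tie-point pipeline sketch (not the implementation of this paper):
# Harris corner extraction, candidate correspondences, robust epipolar filtering.
import cv2
import numpy as np

image_files = ["frame0.png", "frame1.png", "frame2.png"]      # placeholder names
images = [cv2.imread(f, cv2.IMREAD_GRAYSCALE) for f in image_files]

pairwise_matches = []
for img_a, img_b in zip(images[:-1], images[1:]):
    # interest point extraction (Harris response, see section 3.1)
    pts_a = cv2.goodFeaturesToTrack(img_a, 1000, 0.01, 5,
                                    useHarrisDetector=True, k=0.04)
    # candidate correspondences; Lucas-Kanade tracking is used here as a
    # stand-in for the cross-correlation / ALSM matching of section 3.1
    pts_b, status, _ = cv2.calcOpticalFlowPyrLK(img_a, img_b, pts_a, None)
    ok = status.ravel() == 1
    cand_a, cand_b = pts_a[ok], pts_b[ok]
    # robust epipolar geometry: LMedS estimation of the fundamental matrix
    F, inlier_mask = cv2.findFundamentalMat(cand_a, cand_b, cv2.FM_LMEDS)
    inliers = inlier_mask.ravel().astype(bool)
    pairwise_matches.append((cand_a[inliers], cand_b[inliers]))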
Fig.1: Three images (out of six) used for the reconstruction
3.1 Finding image correspondences
The first step is to find a set of interest points or corners in each image of the sequence. The Harris corner detector is used. The threshold on the number of corners extracted is based on the image size. A good point distribution is ensured by subdividing the images into small patches and keeping only the points with the highest interest value within each patch.
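A possible implementation of this distribution-preserving corner extraction is sketched below; the patch size and the number of points kept per patch are illustrative values, not the thresholds of the paper.

# Sketch of interest point extraction with the Harris detector and a grid-based
# selection that enforces an even point distribution over the image.
import cv2
import numpy as np

def distributed_harris_corners(gray, patch=64, per_patch=5, k=0.04):
    """Return an (N, 2) array of (x, y) corners, at most `per_patch` per patch."""
    response = cv2.cornerHarris(np.float32(gray), blockSize=3, ksize=3, k=k)
    h, w = gray.shape
    corners = []
    for y0 in range(0, h, patch):
        for x0 in range(0, w, patch):
            block = response[y0:y0 + patch, x0:x0 + patch]
            # indices of the strongest responses inside this patch
            flat = np.argsort(block, axis=None)[::-1][:per_patch]
            ys, xs = np.unravel_index(flat, block.shape)
            for x, y in zip(xs, ys):
                if block[y, x] > 0:          # keep only positive Harris responses
                    corners.append((x0 + x, y0 + y))
    return np.array(corners, dtype=np.float32)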
The next step is to match points between adjacent images. At first, cross-correlation is used and then the results are refined using adaptive least squares matching (ALSM) [Grün, 1985]. The cross-correlation process uses a small window around each point in the first image and tries to correlate it against all points that lie inside a bigger search window in the adjacent image. The point with the highest correlation coefficient is used as approximation for the subsequent matching process. The process returns the best match in the second image for each interest point in the first image. The final number of possible matches depends on the threshold parameters of the matching and on the disparity between the image pairs; usually it is around 40% of the extracted points.
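The unguided cross-correlation step can be sketched as follows; the template and search window sizes and the acceptance threshold are assumed values, and the ALSM refinement [Grün, 1985] is not shown.

# Sketch of the unguided cross-correlation matching between two adjacent images.
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation coefficient of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else -1.0

def match_by_correlation(img1, img2, pts1, pts2, win=7, search=50, min_cc=0.7):
    """For each corner in img1, return the img2 corner with the highest NCC
    inside a `search`-pixel window; pts1, pts2 are (N, 2) integer (x, y) arrays."""
    matches = []
    for x1, y1 in pts1:
        t = img1[y1 - win:y1 + win + 1, x1 - win:x1 + win + 1].astype(np.float64)
        if t.shape != (2 * win + 1, 2 * win + 1):
            continue                                  # too close to the border
        best_cc, best_pt = min_cc, None
        for x2, y2 in pts2:
            if abs(x2 - x1) > search or abs(y2 - y1) > search:
                continue                              # outside the search window
            c = img2[y2 - win:y2 + win + 1, x2 - win:x2 + win + 1].astype(np.float64)
            if c.shape != t.shape:
                continue
            cc = ncc(t, c)
            if cc > best_cc:
                best_cc, best_pt = cc, (x2, y2)
        if best_pt is not None:
            matches.append(((x1, y1), best_pt))
    return matches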
The found matched pairs always contain outliers, due to the unguided matching process. Therefore a filtering of false correspondences has to be performed. A process based on the disparity gradient concept is used [Klette et al., 1998]. If $P_{LEFT}$ and $P_{RIGHT}$ as well as $Q_{LEFT}$ and $Q_{RIGHT}$ are corresponding points in the left and right image, the disparity gradient G of the two points (P, Q) is defined as:

$G = \dfrac{\left| D(P) - D(Q) \right|}{D_{CS}(P, Q)}$   [1]

where:
$D(P) = (P_{LEFT,x} - P_{RIGHT,x}, \; P_{LEFT,y} - P_{RIGHT,y})$ is the parallax of P, i.e. the pixel distance of P between the two images;
$D(Q) = (Q_{LEFT,x} - Q_{RIGHT,x}, \; Q_{LEFT,y} - Q_{RIGHT,y})$ is the parallax of Q, i.e. the pixel distance of Q between the two images;
$D_{CS}(P, Q) = \left| \dfrac{P_{LEFT} + P_{RIGHT}}{2} - \dfrac{Q_{LEFT} + Q_{RIGHT}}{2} \right|$ is the cyclopean separation, i.e. the distance between the midpoints of the straight line segments connecting each point in the left image to its corresponding point in the right image.
Figure 2: The disparity gradient between two correspondences (P and Q) in the left and right images.
If P and Q are close together in both images, they should have a similar parallax (i.e. a small numerator in equation 1). Therefore, the smaller the disparity gradient G is, the more the two correspondences are in agreement. This filtering process is performed locally and not on the whole image, because the algorithm can produce incorrect results in the case of very different disparity values and in the presence of translation, rotation, shearing and scale changes between consecutive images. For each matched point, the sum of the disparity gradients G with respect to all other matches in its neighbourhood is computed. Those matches whose disparity gradient sum is greater than the median of the sums of G are removed. The process removes ca. 80% of the false correspondences. Other possible approaches to remove false matches are described in [Pilu, 1997] and [Zhang et al., 1994].
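A direct implementation of this local filtering could look as follows; the neighbourhood radius is an assumed value, and the (P, Q) pairs are taken from the matches themselves, as described above.

# Sketch of the local disparity-gradient filter (equation 1). For every match the
# disparity gradient G is accumulated over the other matches in a local
# neighbourhood, and matches whose sum exceeds the median of all sums are removed.
import numpy as np

def disparity_gradient(pL, pR, qL, qR):
    """Disparity gradient G of two correspondences (P, Q), equation (1)."""
    d_p = pL - pR                                  # parallax of P
    d_q = qL - qR                                  # parallax of Q
    cyclopean = np.linalg.norm((pL + pR) / 2.0 - (qL + qR) / 2.0)
    return np.linalg.norm(d_p - d_q) / cyclopean if cyclopean > 0 else np.inf

def filter_matches(left_pts, right_pts, radius=100.0):
    """left_pts, right_pts: (N, 2) arrays of corresponding points.
    Returns a boolean mask of the matches kept after the filtering."""
    n = len(left_pts)
    sums = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # only matches in the local neighbourhood (in the left image) contribute
            if np.linalg.norm(left_pts[i] - left_pts[j]) > radius:
                continue
            sums[i] += disparity_gradient(left_pts[i], right_pts[i],
                                          left_pts[j], right_pts[j])
    return sums <= np.median(sums)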
The next step performs a pairwise relative orientation and an outlier rejection using those matches that pass the previous filtering step. Based on the coplanarity condition, the process computes the projective singular correlation between two images [Niini, 1994], also called epipolar transformation (because it maps an image point of the first image onto an epipolar line in the second image) or fundamental matrix (in case the interior orientation parameters of both images are the same) [Faugeras et al., 1992]. The fundamental matrix $M_{12}$ is defined by the equation:

$p_2^T \, M_{12} \, p_1 = 0$, with $p_i = (x_i, y_i, 1)^T$   [2]

for every pair of matching points $p_1$, $p_2$ (homogeneous vectors) in image 1 and 2. The epipoles of the two images are defined as the right and left null-space of $M_{12}$ and can be computed with the singular value decomposition of $M_{12}$. A point $p_1$ in the first image is mapped by $M_{12}$ to the epipolar line $l_2 = M_{12} \, p_1$ in the second image; the corresponding point $p_2$ must lie on this line, i.e. satisfy the relation $p_2^T l_2 = 0$. Similarly, $l_1 = M_{12}^T \, p_2$ represents the epipolar line in the first image corresponding to $p_2$ in the second image. The 3x3 singular matrix $M_{12}$ can be computed from image points alone; at least 8 correspondences are needed to compute it. Many solutions have been published to compute $M_{12}$, but to cope with possible blunders a robust estimation method is required. Least median estimators are in general very powerful in the presence of outliers, so the Least Median of Squares (LMedS) method is used to achieve a robust computation of the epipolar geometry and to reject possible outliers [Scaioni, 2001; Zhang et al., 1994]. LMedS estimators solve non-linear minimization problems and yield the smallest value for the median of the squared residuals computed over the whole data set. Therefore they are very robust against false matches and against outliers due to bad localisation.
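A compact sketch of such a LMedS estimation is given below: random 8-point samples, a normalized 8-point solution and selection of the candidate with the smallest median of squared Sampson residuals, followed by outlier rejection with a robust threshold. The number of samples, the use of the Sampson distance and the threshold factor are assumptions; OpenCV's cv2.findFundamentalMat(p1, p2, cv2.FM_LMEDS) offers the same functionality.

# LMedS estimation of the fundamental matrix from point correspondences.
import numpy as np

def _normalize(pts):
    """Hartley normalization: translate to centroid, scale to mean distance sqrt(2)."""
    c = pts.mean(axis=0)
    d = np.linalg.norm(pts - c, axis=1).mean()
    s = np.sqrt(2.0) / d
    T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1.0]])
    ph = np.column_stack([pts, np.ones(len(pts))])
    return (T @ ph.T).T, T

def eight_point(p1, p2):
    """Fundamental matrix from >= 8 correspondences (normalized 8-point algorithm)."""
    x1, T1 = _normalize(p1)
    x2, T2 = _normalize(p2)
    A = np.column_stack([x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
                         x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
                         x1[:, 0], x1[:, 1], np.ones(len(x1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)            # enforce rank 2 (singular matrix)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T2.T @ F @ T1

def sampson_sq(F, p1, p2):
    """Squared Sampson distance of each correspondence to the epipolar geometry."""
    x1 = np.column_stack([p1, np.ones(len(p1))])
    x2 = np.column_stack([p2, np.ones(len(p2))])
    Fx1, Ftx2 = x1 @ F.T, x2 @ F
    num = np.sum(x2 * Fx1, axis=1) ** 2
    return num / (Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2)

def lmeds_fundamental(p1, p2, n_samples=500, rng=np.random.default_rng(0)):
    """p1, p2: (N, 2) corresponding points. Returns (F, inlier mask)."""
    best_F, best_med = None, np.inf
    for _ in range(n_samples):
        idx = rng.choice(len(p1), 8, replace=False)
        F = eight_point(p1[idx], p2[idx])
        med = np.median(sampson_sq(F, p1, p2))
        if med < best_med:
            best_F, best_med = F, med
    # reject outliers using a robust scale estimate derived from the best median
    sigma = 1.4826 * np.sqrt(best_med)
    inliers = sampson_sq(best_F, p1, p2) < (2.5 * sigma) ** 2
    return best_F, inliers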