Close-range imaging, long-range vision

  
  
preceding iterations and compute for it the mean μ and standard deviation σ. Empirically, we found that 1 is a good threshold for μ and 0.1 for σ.
Finally, sub-pixel estimation of the surface is based on a parabola fitted through the matching scores of the voxels with a disparity one lower and one higher than the given disparity of a pixel. We found that sub-pixel interpolation improves the smoothness of the result. Yet, evaluation criteria such as the percentage of bad pixels mostly deteriorate, while the RMS error is only very slightly improved.
6 VIEW SYNTHESIS WITH THE TRIFOCAL TENSOR 
We use the view synthesis scheme proposed in (Avidan and Shashua, 
1998). The basic idea is to use calibrated imagery together with a 
disparity map. With the latter, points corresponding to given points 
in the first image are obtained for the second image. At least a 
weak calibration is necessary to make navigation through the image meaningful for the user, as only then are rotation matrices and translation vectors defined in a Euclidean sense.
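For a rectified image pair, this point transfer is a direct lookup in the disparity map. A small sketch, assuming rectified images and the sign convention x' = x − d(y, x) (both assumptions, not fixed by the text):

```python
import numpy as np

def transfer_points(points, disp):
    """Map integer pixel coordinates (x, y) in the first image to the
    second image of a rectified pair via a dense disparity map disp."""
    points = np.asarray(points)
    x, y = points[:, 0], points[:, 1]
    d = disp[y, x]                     # disparity sampled at the given pixels
    return np.stack([x - d, y], axis=1)

disp = np.full((4, 6), 2.0)            # toy map: constant disparity of 2
mapped = transfer_points([[3, 1], [5, 2]], disp)   # -> [[1, 1], [3, 2]]
```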
The trifocal tensor is initially instantiated from the fundamental matrix as

T_i^{jk} = ε^{jkn} F_{ni},

where ε^{jkn} is the cross-product tensor and F_{ni} is F in tensor notation.
Then, the view synthesis is accomplished by modifying the trifocal tensor by rotation matrices R (R_l^k in tensor notation) and translation vectors t given by the user. The modified tensor is

G_i^{jk} = R_l^k T_i^{jl} + t^k a_i^j,

where a_i^j is the left 3×3 part A of the calibrated projection matrix of the second camera.
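The tensor construction described above can be sketched numerically with NumPy's einsum; the exact index contractions, and the reading of a_i^j as element (j, i) of A, are assumptions based on the notation in the text:

```python
import numpy as np

# Cross-product (Levi-Civita) tensor eps[j, k, n].
eps = np.zeros((3, 3, 3))
for j, k, n in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[j, k, n], eps[j, n, k] = 1.0, -1.0

def seed_tensor(F):
    """T_i^{jk} = eps^{jkn} F_{ni}: tensor instantiated from F."""
    return np.einsum('jkn,ni->ijk', eps, F)

def modified_tensor(T, R, t, A):
    """G_i^{jk} = R_l^k T_i^{jl} + t^k a_i^j for a user-given R and t;
    A is the left 3x3 part of the second camera's projection matrix."""
    return np.einsum('kl,ijl->ijk', R, T) + np.einsum('k,ji->ijk', t, A)

F = np.random.default_rng(0).standard_normal((3, 3))       # stand-in for a real F
T = seed_tensor(F)
G = modified_tensor(T, np.eye(3), np.zeros(3), np.eye(3))  # identity motion: G == T
```

The seed tensor is antisymmetric in its upper indices by construction, and an identity rotation with zero translation leaves it unchanged, which serves as a quick sanity check on the contractions.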
The actual projection is done based on the optimized scheme pro- 
posed in Section 3. The synthesized image is produced indirectly 
by mapping the pixels via the affine transformations obtained from the known coordinates of the triangle meshes in the given and the synthesized image.
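The affine transformation for each triangle follows from its three vertex correspondences by solving a small linear system; in practice one would rasterize each triangle and apply its transform per pixel. A sketch (the function name and the 3×2 vertex layout are assumptions):

```python
import numpy as np

def triangle_affine(src, dst):
    """2x3 affine transform mapping the three vertices of a triangle in
    the given image onto its vertices in the synthesized image."""
    src_h = np.hstack([np.asarray(src, float), np.ones((3, 1))])  # homogeneous coords
    return np.linalg.solve(src_h, np.asarray(dst, float)).T       # solve src_h @ M.T = dst

src = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
dst = [[2.0, 1.0], [4.0, 1.0], [2.0, 3.0]]   # scale by 2, translate by (2, 1)
A = triangle_affine(src, dst)
p = A @ np.array([0.5, 0.5, 1.0])            # interior point maps to (3, 2)
```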
We have obtained results based on calibrated test data of ISPRS 
Working Group V/2 and images courtesy of the Robotvis group at 
INRIA. Figure 5 from the V/2 data set shows the result for the dis- 
parity estimation and view synthesis for the first two views of Figure 
2. One can clearly see the corner structure. Figure 6 is also from the V/2 data set. Finally, in Figure 7, based on images from INRIA,
one can see how the chair, which is relatively close to the camera, 
occludes the background when moving the camera. 
Figure 2: Image triplet with points and corresponding epipolar lines. a) Epipolar lines for b) and c). b) and c) Epipolar lines for a) only.
7 CONCLUSIONS 
In this paper we have presented the estimation of the fundamental 
matrix as well as of the trifocal tensor, an improved novel approach 
for disparity estimation, and the use of the trifocal tensor for view 
synthesis. All results presented have been obtained totally automat- 
ically, without any user interaction. The same parameters have been 
used for all examples. While the estimation of the fundamental matrix and the trifocal tensor based on pyramids, least squares matching, and RANSAC works reliably for a wide range of imagery, the end-to-end automation of view synthesis is still an intricate problem. In particular, we still need to improve the disparity estimation.
In contrast to the determination of the orientation, which is defined by very few parameters and is therefore a highly redundant problem, disparity estimation aims at determining many parameters. Although the approach we are using is relatively sophisticated, the results are in many instances unstable and not really good. One way to improve would be to utilize recent, more sophisticated approaches based, e.g., on graph cuts (Kolmogorov and Zabih, 2002) or on Markov random fields and belief propagation (Sun et al., 2002). Another way
would be to use more images. This makes it computationally more 
expensive as the simple epipolar geometry cannot be used any more. 
As the most important problem is the determination of approximate values, a combination with direct sensors of possibly lower resolution, such as the cheap laser scanners planned, e.g., for airbag inflation control, might be considered for the application domain of video communication.
Finally, new results show that an image can be synthesized from a large number of views by matching the gray-value profiles on corresponding lines of view (epipolar lines) (Irani et al., 2002).
ACKNOWLEDGMENTS 
We thank Peter Krzystek for making his code for least squares matching available to us.
REFERENCES 
Avidan, S. and Shashua, A., 1998. Novel View Synthesis by Cas- 
cading Trilinear Tensors. IEEE Transactions on Visualization and 
Computer Graphics 4(4), pp. 293-306. 
Carlsson, S., 1995. Duality of Reconstruction and Positioning from 
Projective Views. In: IEEE Workshop on Representation of Visual 
Scenes, Boston, USA. 