Photogrammetric computer vision: Papers accepted on the basis of peer-review full manuscripts

Kalliany, R.; Leberl, Franz W.
ISPRS Commission III, Vol.34, Part 3A „Photogrammetric Computer Vision“, Graz, 2002 
PARALLEL APPROACH TO BINOCULAR STEREO MATCHING 
Herbert Jahn
DLR, Institute of Space Sensor Technology and Planetary Exploration, Berlin, Germany 
Herbert.Jahn@dlr.de 
Commission III, WG III/8 
KEY WORDS: Vision Sciences, Stereoscopic Matching, Real-time Processing, Dynamic Networks 
ABSTRACT: 
An approach to parallel-sequential binocular stereo matching is presented. It is based on discrete dynamical models which can be
implemented in neural multi-layer networks. The underlying idea is that some features (edges) in the left image exert forces on
similar features in the right image in order to attract them. Each feature point (i,j) of the right image is described by a coordinate
x(i,j). The coordinates obey a system of time-discrete Newtonian equations of motion, which allow the recursive updating of the
coordinates until they match the corresponding points in the left image. The model is very flexible: it allows shift, expansion and
compression of image regions of the right image, and it takes occlusion into account to a certain extent. To obtain good results a
robust and efficient edge detection filter is necessary. It relies on a non-linear averaging algorithm which can also be implemented
using discrete dynamical models. Both networks use processing elements (neurons) of different kinds, i.e. the processing function is
not given a priori but derived from the models. This is justified by the fact that in the visual system of mammals (humans) a variety
of different neurons adapted to specific tasks exists. A few examples show that the problem of edge-preserving smoothing can be
solved with a quality which is sufficient for many applications (various images not shown here have been processed with good
success). Some success was also achieved in the main problem of stereo matching, but further improvements are necessary.
1. INTRODUCTION 
Real-time stereo processing, which is necessary in many
applications, needs very fast algorithms and processing
hardware. The stereo processing capability of the human visual
system together with the parallel-sequential neural network
structures of the brain (Hubel, 1995) leads to the conjecture that
there exist parallel-sequential algorithms which do the job very
efficiently. Therefore, it seems natural to concentrate
effort on the development of such algorithms.
In prior attempts to develop parallel-sequential matching
algorithms (Jahn, 2000a; Jahn, 2000b) some promising results
have been obtained. In some image regions, however, serious errors
occurred, which has led to the new attempt presented here.
If one misaligns the eyes by pressing on one eye with the
thumb, one has the impression that one of the images is
pulled towards the other until matching is achieved.
This has led to the idea that prominent features (especially edge
elements) of one image exert forces on corresponding features in
the other image in order to attract them. A (homogeneous)
region between such features is shifted together with the
bounding features of the region, and it can be compressed or stretched
because corresponding regions may have different extents.
Therefore, an adequate model for the matching process seems to
be a system of Newtonian equations of motion governing the
shift of the pixels of one image. Assuming epipolar geometry, a
pixel (i,j) of the left image corresponds to a pixel (i',j) of the
right image in the same image row. If a mass point with
coordinate x(i',j) and mass m is assigned to that pixel, then with
appropriate forces of various origins acting on that point it can
be shifted to match the corresponding point (i,j). To match
points inside homogeneous regions, the idea is to couple
neighbouring points by springs in order to shift these points
together with the edge points. The model then somewhat resembles
the old model which Julesz proposed for stereo matching
(Julesz, 1971).
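The mass-and-spring picture described above can be illustrated for a single image row with a minimal sketch. It is not the model of the paper: the linear attraction force, the springs of rest length 1, the velocity damping, all constants, and the function name relax_row are assumptions chosen for illustration, and the pairing of edge points with their left-image partners is taken as already known.

```python
from typing import Dict, List


def relax_row(n_points: int,
              edge_targets: Dict[int, float],
              steps: int = 300,
              mass: float = 1.0,
              k_attract: float = 0.2,
              k_spring: float = 0.2,
              damping: float = 0.3) -> List[float]:
    """Shift the points of one image row with damped, time-discrete
    Newtonian dynamics. All force laws and constants are illustrative."""
    x = [float(i) for i in range(n_points)]  # start on the pixel grid
    v = [0.0] * n_points                     # all points start at rest
    for _ in range(steps):
        force = [0.0] * n_points
        # Edge points are pulled towards their (assumed known) partners
        # in the left image.
        for i, target in edge_targets.items():
            force[i] += k_attract * (target - x[i])
        # Springs of rest length 1 couple neighbouring points, so that
        # points inside a region follow the bounding edge points.
        for i in range(n_points - 1):
            stretch = (x[i + 1] - x[i]) - 1.0
            force[i] += k_spring * stretch
            force[i + 1] -= k_spring * stretch
        # Discrete equation of motion with velocity damping.
        for i in range(n_points):
            v[i] = (1.0 - damping) * (v[i] + force[i] / mass)
            x[i] += v[i]
    return x


# Two edge points (indices 0 and 4) whose partners lie 2 pixels to the right.
shifted = relax_row(5, {0: 2.0, 4: 6.0})
```

In this example the interior points have no attraction force of their own; they are carried along by the springs, and the row settles near the coordinates 2, 3, 4, 5, 6. If the two targets are further apart than the original edge points, the springs stretch and the region expands.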
To obtain good results a robust and efficient edge detection 
filter is necessary. The filter used here is based on a non-linear 
edge preserving smoothing algorithm which can be 
implemented with the same type of parallel-sequential 
networks, the so-called discrete dynamical networks (Serra, 
Zanarini, 1990) which can be described (in 2D notation) by 
$$ z_{i,j}(t+1) = f_{i,j}\big(z(t), P, K(t)\big) \qquad (1) $$
$(i = 1, \dots, N_i;\; j = 1, \dots, N_j)$
Here, $z_{i,j}$ is a state vector defined in each image point (i,j) ($z(t)$
denotes the matrix of the $z_{i,j}(t)$), K is an external force vector,
and P is a parameter vector. The initial state $z_{i,j}(0)$ is given by a
feature vector which is derived from the given image data.
Then, according to (1), the feature vector is updated recursively,
leading to a final state (hopefully a fixed point) at $t \to \infty$ (or
approximately at $t = t_{\max}$). That final state is the result of the
image processing task.
The algorithm (1) is of complexity O(N) ($N = N_i \cdot N_j$) if the
number of iterations is limited ($t \le t_{\max}$). In each iteration step it
needs a constant number n of calculations for every image point
(i,j). Then the total number of operations is $N \cdot n \cdot t_{\max}$.
Therefore, it is very fast if it is implemented in a multi-layer
network structure. Here, each neural layer is assigned to a
discrete time t of (1), and the state of neuron (i,j) in layer t is
given by $z_{i,j}(t)$. Via the (nonlinear) function $f_{i,j}$ each neuron (i,j)
of layer t+1 is coupled with neurons (k,l) of layer t.
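The recursion (1) and its operation count can be sketched as follows. This is a minimal illustration under simplifying assumptions: the state is a scalar per point instead of a vector, P and K are folded into the update function, and the names run_network and local_mean are hypothetical. The update rule used here is a plain linear mean over the 4-neighbourhood, chosen only as a placeholder; it is not the non-linear edge-preserving filter of the paper.

```python
from typing import Callable, List, Tuple

Grid = List[List[float]]


def run_network(z0: Grid,
                f: Callable[[Grid, int, int], float],
                t_max: int,
                tol: float = 1e-9) -> Tuple[Grid, int, int]:
    """Iterate z[i][j](t+1) = f(z(t), i, j) for at most t_max steps.

    Returns the final state, the number of steps taken, and the number
    of evaluations of f (N per step, N = number of grid points).
    """
    z = [row[:] for row in z0]
    n_points = len(z) * len(z[0])
    evaluations = 0
    for step in range(1, t_max + 1):
        # Synchronous update: layer t+1 is computed only from layer t.
        z_next = [[f(z, i, j) for j in range(len(z[0]))]
                  for i in range(len(z))]
        evaluations += n_points
        change = max(abs(a - b)
                     for row_new, row_old in zip(z_next, z)
                     for a, b in zip(row_new, row_old))
        z = z_next
        if change < tol:  # state no longer changes: approximate fixed point
            return z, step, evaluations
    return z, t_max, evaluations


def local_mean(z: Grid, i: int, j: int) -> float:
    """Hypothetical f: mean of the point and its existing 4-neighbours."""
    candidates = [(i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    values = [z[k][l] for k, l in candidates
              if 0 <= k < len(z) and 0 <= l < len(z[0])]
    return sum(values) / len(values)


image = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
state, steps, evaluations = run_network(image, local_mean, t_max=20)
```

Each pass of the loop corresponds to one layer of the network, and every point is updated from the previous layer only, so all N updates of a layer could run in parallel. The evaluation counter grows by N per step, which mirrors the linear dependence on N of the operation count for a bounded number of iterations.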
In chapter 2, algorithm (1) is specialized to edge-preserving
smoothing. Chapter 3 is then dedicated to stereo matching
within the same framework, and some results are shown. Finally,
the conclusions present some ideas for future research.