image sequence and the known camera parameters
of the stereoscopic camera pair. In a first step the
images are rectified and corresponding points are
searched for in each image pair of the sequence. For
each image pixel an estimate of image disparity is
calculated and stored in a disparity map Dk, together
with a confidence measure, stored in Ck, that describes
the quality of the estimate. The local disparity
measurements are merged into physical objects during scene
segmentation, and the physical object boundaries are
recorded in a segmentation map Sk. Prior knowledge
of the observed scene as well as human interaction
that guides the segmentation process can be included
to improve the modeling quality. In the scene
segmentation and interpolation stage, all measurements
of one object are interpolated to smooth the object
surfaces and to fill gaps in the depth map. All information
obtained so far from image pair analysis is fused in
a 3D scene model. The disparity map is converted
into a depth map and a 3D surface description is
derived from the depth measurements. The surface
geometry is represented as a triangular surface mesh
spanned by control points in space. These control
points can be shifted to adapt the surface geometry
throughout the sequence. Not only the scene geome-
try but also the scene surface texture is stored within
the model. It is therefore possible to synthesize
realistic-looking image sequences (L'k, R'k) from the model
scene using 3D computer graphics methods [Koch,
1990]. A 3D motion estimation algorithm is included
that calculates the camera motion and the object
motion throughout the scene, which allows measurements
from multiple viewpoints to be fused. From the model
scene, predictions of the measurements (D'k, S'k) can
be calculated together with the synthesized sequence
(L'k, R'k) and used in a feedback loop to further
enhance the reliability of the measurements. Following
the analysis-by-synthesis principle, this feedback
loop improves the 3D scene analysis by comparing the
synthesized 2D sequence with the real image sequence.
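The conversion of a disparity map into a depth map mentioned above can be sketched as follows. This is only an illustration under assumptions not stated in the text: a rectified pair with parallel camera axes, so that depth Z = f * B / d, with the focal length f given in pixels and the baseline B in metres. The function name and the numeric values are hypothetical.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline):
    """Convert a disparity map (in pixels) to a depth map.

    Assumes a rectified, parallel-axis camera pair: Z = focal_px * baseline / d.
    Pixels with non-positive disparity carry no measurement and become NaN.
    """
    disparity = np.asarray(disparity, dtype=float)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > 0
    depth[valid] = focal_px * baseline / disparity[valid]
    return depth

# Hypothetical camera: focal length 1000 px, baseline 0.5 m.
d = np.array([[10.0, 50.0], [0.0, 100.0]])
z = disparity_to_depth(d, focal_px=1000.0, baseline=0.5)
```

Because depth is inversely proportional to disparity, a fixed disparity error causes a depth error that grows with distance, which is one reason a per-pixel confidence measure is useful.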
STEREOSCOPIC IMAGE PAIR ANALYSIS
The analysis of a stereoscopic image pair is split into
correspondence analysis and scene segmentation.
The correspondence analysis tries to locally estimate
image plane correspondences, whereas scene segmentation
identifies image areas that belong to physically
connected regions through similarity measures
and merges them into scene objects. In a preprocessing
step the image pair is rectified to give an
image pair in which the camera axes are parallel and
the cameras are displaced along the horizontal image
plane coordinate only. This image rectification greatly
simplifies correspondence analysis, since the search
space is reduced to parallel horizontal epipolar lines.
Correspondence analysis
The correspondence analysis is split into three parts.
First a candidate for a corresponding point must be
identified in one image; second, the corresponding candidate
in the other image is searched for along the epipolar
line; and third, the most probable candidate match
between both images is selected based on a quality
criterion. This search is repeated for each candidate,
that is for each pixel. To select candidates the image
grey level gradient G is evaluated. The image gradient
is a vector field that points in the direction of changing
image texture, such as grey level edges. Only areas
exceeding a minimum image gradient value, |G| > Gmin,
can be candidates for correspondence. The
quality of a candidate can be estimated by comparing
the gradient direction with the search direction.
Edges perpendicular to the search direction can be
located best, while edges parallel to the search direction
cannot be located at all. This quality measure Cg
is given by Eq. (1). Candidates with Cg = 0
cannot be estimated, whereas candidates with Cg = 1
have the highest estimation confidence.
\[
C_g =
\begin{cases}
0 & \text{for } |G| < G_{\min} \\
\dfrac{G \cdot E}{|G|} & \text{else}
\end{cases}
\qquad (1)
\]
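The gradient quality measure of Eq. (1) can be sketched with NumPy as follows. This is a minimal sketch under assumptions not stated in the text: E is taken to be the unit vector along the horizontal search direction, the gradient is approximated by central differences, and the absolute value of G . E is used so that both edge polarities give a positive measure. The function and parameter names are illustrative.

```python
import numpy as np

def gradient_confidence(image, g_min, search_dir=(1.0, 0.0)):
    """Per-pixel gradient quality measure in the spirit of Eq. (1).

    image      : 2D grey level array
    g_min      : minimum gradient magnitude (Gmin in the text)
    search_dir : unit vector (x, y) of the search direction; horizontal
                 for a rectified image pair (assumption)
    """
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)        # derivatives along rows (y) and columns (x)
    mag = np.hypot(gx, gy)             # |G|
    ex, ey = search_dir
    proj = np.abs(gx * ex + gy * ey)   # |G . E|; taking abs is an assumption
    conf = np.zeros_like(mag)
    strong = (mag >= g_min) & (mag > 0)  # guard against division by zero
    conf[strong] = proj[strong] / mag[strong]
    return conf

# A vertical edge is perpendicular to the horizontal search direction.
vertical_edge = np.tile(np.array([0, 0, 0, 10, 10, 10.0]), (6, 1))
c_vertical = gradient_confidence(vertical_edge, g_min=1.0)
# The transposed image holds a horizontal edge, parallel to the search direction.
c_horizontal = gradient_confidence(vertical_edge.T, g_min=1.0)
```

In this toy case the pixels on the vertical edge receive a measure of 1, while the horizontal edge receives 0 everywhere, matching the statement that edges parallel to the search direction cannot be located.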
The estimation of Cg is carried out for each image
pixel. Each pixel with a gradient quality measure of
Cg > 0 is selected as a candidate. For each candidate
a small measurement window (typically 11 × 11
pixels) around the candidate position in one grey level
image is chosen, and the corresponding grey level
distribution is searched for in the other image. The
search space is reduced to a one-dimensional search
along the epipolar line between the minimum and maximum
disparity values derived from the known minimum
and maximum scene distance. To select the
most probable corresponding candidate along the
search line, the normalized cross correlation (NCC) is
calculated between the candidates. The most prob-
able candidate pair is the pair with maximum cross
correlation. The disparity value obtained for this can-
didate pair is recorded in a disparity map. The NCC
is additionally used to define the correspondence
quality. Selected corresponding pairs with low NCC
are corresponding points with low confidence. Therefore
a second quality measure Cc, given in Eq. (2), can be
defined that reflects the confidence of the correspondence
measurement. Experiments have shown that candidates
below a minimum threshold NCCmin (approximately 0.7)
are most often false
matches that should be discarded. The confidence
measure is therefore defined to be zero below NCCmin
and equal to NCC elsewhere.
\[
C_c =
\begin{cases}
0 & \text{for } NCC < NCC_{\min} \\
NCC & \text{else}
\end{cases}
\qquad (2)
\]
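The one-dimensional search along the epipolar line and the thresholding of Eq. (2) can be sketched as follows. This is an illustrative sketch, not the authors' implementation. It assumes rectified images so that the epipolar line is the image row, a disparity convention in which the match lies at column col - d in the second image, and a mean-subtracted form of the NCC. All function and parameter names are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized windows (mean removed)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def match_along_row(left, right, row, col, d_min, d_max, half=5, ncc_min=0.7):
    """Find the best match for left[row, col] along the same row of `right`.

    Returns (disparity, confidence). The confidence is the NCC of the best
    candidate, or 0 if it falls below ncc_min, as in Eq. (2).
    The caller must ensure the window around (row, col) fits inside `left`.
    half=5 gives the 11 x 11 measurement window mentioned in the text.
    """
    win = left[row - half:row + half + 1, col - half:col + half + 1]
    best_d, best_score = None, -1.0
    for d in range(d_min, d_max + 1):
        c = col - d
        if c - half < 0 or c + half >= right.shape[1]:
            continue  # candidate window would leave the image
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        score = ncc(win, cand)
        if score > best_score:
            best_d, best_score = d, score
    conf = best_score if best_score >= ncc_min else 0.0
    return best_d, conf

# Synthetic check: the left image is the right image shifted by 4 pixels.
rng = np.random.default_rng(0)
right_img = rng.random((21, 60))
left_img = np.roll(right_img, 4, axis=1)
disparity, confidence = match_along_row(left_img, right_img, 10, 30, 0, 10)
```

Restricting d to the interval between d_min and d_max mirrors the use of the known minimum and maximum scene distance to bound the search.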
Both quality measures can be merged into one measure,
C = Cg · Cc, that contains the combined quality
measure for each candidate. From the correspondence
analysis two candidate maps are created: a
disparity map contains the most probable disparity
value for each candidate and a confidence map con-