2.3. The Object Acquisition Procedure
An image triplet acquired with the above camera system
allows us to retrieve enough information for a 3D
reconstruction of the front side of the scene. In order to
obtain a more complete (full-3D) description of an object,
however, it is necessary to acquire image triplets from
many different viewpoints, so that the whole visible
surface of the object will be imaged and reconstructed.
The acquisition procedure will thus consist of a series of
image triplets (trinocular views), each of which is taken
from a different viewpoint. The viewpoint can be changed
by moving either the camera system or the object.
In order to perform the estimation of the camera motion
between different viewpoints in world-coordinates, the
presence of some reference points (fiducial marks)
becomes necessary. These targets are placed in the
scene in such a way that the number of fiducial points
that are visible in all images of two consecutive trinocular
views exceeds a specified minimum. This allows us to
compute the 3D position of all visible targets with respect
to the world reference frame, and to merge the 3D
information extracted from the individual trinocular views.
3. ESTIMATION OF CAMERA MOTION
Several techniques for estimating camera motion from
point-feature correspondences are available in the literature.
Most such methods perform motion estimation from
two-dimensional data by applying a rigidity constraint to a
set of matched points in monocular views [1,2,3]. Vectors
from optical centers and corresponding points on the
image planes are, in fact, bound to be coplanar (essential
constraint), which results in a scalar equation for each
pair of corresponding points in different views.
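This coplanarity condition is commonly written with the essential matrix E = [t]×R: corresponding normalized image rays x1 and x2 satisfy x2ᵀE x1 = 0. A minimal numerical check, with a made-up rotation and translation (the specific values are illustrative only):

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Hypothetical camera motion: rotation R (about the z axis) and translation t.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.0])

E = skew(t) @ R                  # essential matrix

# A 3D point seen from both views (normalized image coordinates).
X = np.array([0.5, -0.3, 4.0])
x1 = X / X[2]                    # viewing ray in camera 1
X2 = R @ X + t                   # same point in camera 2's frame
x2 = X2 / X2[2]                  # viewing ray in camera 2

residual = x2 @ E @ x1           # coplanarity (essential) constraint
print(abs(residual) < 1e-9)      # True for a correct correspondence
```

Each correspondence thus contributes one scalar equation in the unknown motion parameters, which is the basis of the monocular methods cited above.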
As mentioned in the Introduction, since the
acquisition system consists of a calibrated set of three
cameras, camera motion estimation can be performed
directly in the three-dimensional space. In fact, for each
trinocular view we can accurately determine the 3D co-
ordinates of the fiducial points, relative to the camera
frame.
As a first step, fiducial marks are located with subpixel
accuracy on the image plane. Point correspondence
between them is then computed by using a stereo-
matching algorithm that exploits epipolar constraints for
reducing the search space of correspondences and
guaranteeing the absence of ambiguities. Finally, the
fiducial points can be re-projected in the 3D space by
using the camera calibration parameters.
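Back-projecting a fiducial point from its matched image positions amounts to finding the 3D point closest to the viewing rays of the three cameras. A minimal sketch of this least-squares triangulation (the optical centers and the test point are made-up values):

```python
import numpy as np

def closest_point_to_rays(centers, directions):
    """Least-squares 3D point minimizing the sum of squared distances
    to a set of rays (optical center c_i, unit direction d_i)."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(centers, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ c
    return np.linalg.solve(A, b)

# Hypothetical three-camera setup observing a fiducial at (0.2, 0.1, 3.0).
X_true = np.array([0.2, 0.1, 3.0])
centers = [np.array([0.0, 0.0, 0.0]),
           np.array([0.5, 0.0, 0.0]),
           np.array([0.0, 0.5, 0.0])]
dirs = [X_true - c for c in centers]     # noise-free viewing rays
X_hat = closest_point_to_rays(centers, dirs)
print(np.allclose(X_hat, X_true))        # True with noise-free rays
```

With noisy subpixel detections the rays no longer intersect exactly, and the same formula returns the point of minimum total squared distance to the three rays.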
Once the 3D co-ordinates of the fiducial points, relative to
the camera system, are retrieved for each image triplet,
we can recover the camera motion as the rigid motion
that best aligns the two sets of 3D fiducial points
under consideration. This can be done through a
minimization process that uses the sum of the distances
between corresponding points as a cost function.
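The cost function just described can be sketched as follows (a minimal illustration, not the authors' implementation; the point sets are made up):

```python
import numpy as np

def alignment_cost(R, t, P, Q):
    """Sum of distances between the transformed points R @ p + t and
    their correspondences q: the cost minimized over the rigid motion."""
    return sum(np.linalg.norm(R @ p + t - q) for p, q in zip(P, Q))

# Two hypothetical fiducial sets related by a pure translation.
P = [np.array([0.0, 0.0, 0.0]),
     np.array([1.0, 0.0, 0.0]),
     np.array([0.0, 1.0, 0.0])]
t = np.array([1.0, 2.0, 3.0])
Q = [p + t for p in P]

print(alignment_cost(np.eye(3), t, P, Q))   # 0.0 at the true motion
```

The nonlinear minimization described next searches for the rotation (via Euler angles) and translation driving this cost to its minimum.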
The minimization algorithm that determines the motion
parameters is nonlinear as, besides estimating the
translation vector, it computes the Euler angles that best
describe the rotation of the camera system. As a
consequence, in order to prevent the algorithm from
finding undesired local minima, it is of crucial importance
operating a careful selection of the starting point. A
sufficiently accurate estimate of the camera motion can
be obtained through a linear least square algorithm,
provided we adopt an affine representation of the rigid
motion itself (translation vector and rotation matrix). One
should keep in mind, however, that a rotation matrix
is an over-parametrization of a rigid rotation (3×3
matrices are used for describing elements of the three-
dimensional rotation manifold SO(3)); therefore, the linear
minimization process generally returns matrices that do
not satisfy the orthogonality constraint. By projecting the
estimated matrix onto SO(3), however, we obtain a
rotation matrix that is accurate enough to be safely used
as a starting point for the non-linear minimization
process.
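The initialization just described can be sketched as follows. The SVD-based projection onto SO(3) (nearest rotation in the Frobenius norm) is an assumption here, as the paper does not specify how the projection is carried out:

```python
import numpy as np

def affine_init(P, Q):
    """Linear least-squares fit Q ~ A @ P + t, followed by projection of A
    onto SO(3) to obtain a valid rotation as a starting point."""
    P, Q = np.asarray(P), np.asarray(Q)
    # Homogeneous formulation: each row [p_x p_y p_z 1] maps to q.
    H = np.hstack([P, np.ones((len(P), 1))])
    M, *_ = np.linalg.lstsq(H, Q, rcond=None)    # M is 4x3: [A^T; t^T]
    A, t = M[:3].T, M[3]
    # Nearest rotation to A in the Frobenius norm, via SVD.
    U, _, Vt = np.linalg.svd(A)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # guard against reflections
    R0 = U @ D @ Vt
    return R0, t

# Hypothetical fiducial sets related by a known rigid motion.
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))
Q = P @ R_true.T + t_true
R0, t0 = affine_init(P, Q)
print(np.allclose(R0 @ R0.T, np.eye(3)))   # True: a valid rotation
```

With noise-free data the fit recovers the true motion; with noisy fiducials it still lands close enough to serve as the starting point for the nonlinear minimization over Euler angles.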
4. SCENE RECONSTRUCTION
The scene reconstruction procedure is divided into the
following steps:
a) Camera setup and calibration;
b) Estimation of 3D edges for each triplet;
c) 3D localization of the fiducial points for each triplet;
d) Camera motion estimation and conversion of all 3D
edges into world-coordinates;
e) 3D surface interpolation.
After camera setup and calibration, several trinocular
views of the scene are acquired from different viewing
directions (see, for example, Figs. 2 and 5). In the
examples presented in this paper, the change of
viewpoint is obtained by moving the object together with its support.
Reconstruction of 3-D edges: For each trinocular view,
a 3D reconstruction of luminance edges is performed.
This is done through detection, matching and back-
projection of all visible edges of the scene. Luminance
edges are detected by using an optimized version of
Canny's edge detector [7]. The detected edges are then
passed to an edge selector, which keeps only those
that carry significant information (e.g. edges that are too
short are discarded) and labels them. For each labeled edge,
the stereo-corresponding (homologous) ones in the other
two images are searched for in the epipolar space. Notice
that, as the radial distortion is taken into account, the
epipolar lines are actually represented by curves.
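The effect can be illustrated with a simple one-parameter radial distortion model (an assumption for illustration; the paper does not state its distortion model): points that are collinear in undistorted coordinates map to a curve.

```python
import numpy as np

def distort(x, k1=-0.25):
    """Toy radial distortion model x_d = x * (1 + k1 * r^2), with r
    measured from the principal point (k1 is a made-up coefficient)."""
    r2 = np.sum(x**2, axis=1, keepdims=True)
    return x * (1 + k1 * r2)

# Points on a straight (undistorted) epipolar line y = 0.2 x + 0.1.
xs = np.linspace(-1, 1, 5)
line = np.column_stack([xs, 0.2 * xs + 0.1])
curve = distort(line)

# Collinearity check: cross products of consecutive difference vectors.
v1 = curve[1:-1] - curve[:-2]
v2 = curve[2:] - curve[:-2]
areas = np.abs(v1[:, 0] * v2[:, 1] - v1[:, 1] * v2[:, 0])
print(areas.max() > 1e-6)   # True: the distorted "line" is a curve
```

Searching along these curves, rather than straight lines, is what allows the matcher to account for the calibrated radial distortion.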
Using more than two views dramatically speeds up the
search of homologous edges. Moreover, matching
ambiguities, typical of binocular systems, are overcome
with a proper placement of the third camera.
Due to a different fragmentation of the same luminance
edge in different images, it may happen that a single
edge in one image needs to be matched to several edges
in the others. For this reason, not only is the proposed
edge matching algorithm capable of finding “one-to-one”
correspondences, but it can also handle
correspondences between subsets of edges that are
portions of the same fragmented one.
Once the trinocular edge matching is completed, each
edge triplet is back-projected onto the 3D scene space.
In order to do so, each edge is first approximated by a
chain of line segments (the desired level of accuracy can
be decided by adjusting the average segment length).
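One simple way to build such a segment-chain approximation is uniform arc-length resampling, controlled by the desired average segment length (an assumption for illustration; the paper does not specify its approximation scheme):

```python
import numpy as np

def resample_polyline(points, target_len):
    """Approximate an edge (dense point chain) by a coarser polyline whose
    segments have roughly the requested average length."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])         # cumulative arc length
    n_seg = max(1, int(round(s[-1] / target_len)))
    samples = np.linspace(0.0, s[-1], n_seg + 1)
    return np.column_stack([np.interp(samples, s, points[:, k])
                            for k in range(points.shape[1])])

# A dense quarter-circle edge, resampled to about 5 segments.
t = np.linspace(0, np.pi / 2, 200)
edge = np.column_stack([np.cos(t), np.sin(t)])
coarse = resample_polyline(edge, target_len=np.pi / 2 / 5)
print(len(coarse))   # 6 points, i.e. 5 segments
```

Shortening the target segment length yields a finer chain and hence a denser set of representative 3D points after back-projection.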
For each triplet of corresponding edge points, the back-projected
point in 3D space is determined as the point that is
closest to the three lines passing through the optical
center and the edge point of each camera. The
procedure returns a list of 3D edges described by their
representative 3D points. All 3D edges, of course, are
relative to the reference frame of their corresponding trinocular view.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B5. Vienna 1996
3-D localization of the fiducial points: […]

3-D surface interpolation: […]