SEQUENTIAL ESTIMATION IN ROBOT VISION
Armin Gruen, Thomas Kersten
Institute of Geodesy and Photogrammetry
Swiss Federal Institute of Technology
ETH-Hoenggerberg, CH-8093 Zurich, Switzerland
Tel.: +41-1-377 3038, Fax.: +41-1-372 0438
e-mail: Armin@p.igp.ethz.ch
Commission V
ABSTRACT
Highly time-constrained robot vision applications require careful tuning and optimized interaction of a system's
components: hardware, algorithmic complexity, software engineering, and task performance. The high-accuracy
processing of full-frame image sequences for image analysis and object space feature positioning is very time
consuming. In both of these processes, sequential estimation algorithms offer valuable alternatives to
simultaneous approaches. This paper introduces an efficient estimation algorithm based on Givens transformations
for use in point positioning and updating camera orientation data. In a test, an easy-to-use standard video camera
was used for image frame generation. The results of camera calibration and an accuracy test using a 3-D
testfield are presented. The computing times of sequential point positioning and camera orientation are given and
in part compared to the values for the simultaneous adjustment. The comparison clearly indicates the superior performance of
the sequential procedure.
KEY WORDS: Sequential Estimation, Real-Time, Robot Vision
1. INTRODUCTION
Image sequences play an important role in photogramme-
try, machine vision and robot vision. While in classical
photogrammetry, especially in aerial applications, data acquisition
and processing are largely separated, this is not
the case any more in modern applications where non-pho-
tographic sensor technology and digital processing tech-
niques are employed. Fast methods for data reduction are
required, in particular, in highly time-constrained robotics
applications, but are also very often of advantage in less
time-critical machine vision and digital photogrammetric
projects. The classical data reduction process consists of
two major stages: image measurement and 3-D point
positioning. These process components are in general separated
from each other. In each case, simultaneous algorithms
can be reformulated into sequential form for better
time performance.
In image processing well-known sequential formulations
exist for incremental convolution operations (used in line-
ar filtering, resampling, image pyramid generation, etc.);
in image analysis they are applied in the pixel location
transformations in orthophoto production (Baltsavias et
al., 1991) and in the form of the Kalman filter in the tracking
of line segments in image space (Deriche and Faugeras,
1990).
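As a minimal illustration of what a sequential formulation of a convolution can look like, the sketch below computes a moving-average (box) filter by updating a running window sum instead of re-summing every window. This is only one simple instance chosen for illustration; the function name `running_box_filter` and the 1-D list input are assumptions and are not taken from the cited work.

```python
def running_box_filter(signal, width):
    """Moving average of `signal` with a window of `width` samples.

    Each output is derived from the previous one by adding the sample
    that enters the window and subtracting the one that leaves it, so
    the cost per output does not depend on the window width.
    """
    if width < 1 or width > len(signal):
        raise ValueError("width must be between 1 and len(signal)")

    window_sum = sum(signal[:width])      # only the first window is summed in full
    out = [window_sum / width]
    for k in range(width, len(signal)):
        window_sum += signal[k] - signal[k - width]   # incremental update
        out.append(window_sum / width)
    return out


smoothed = running_box_filter([1, 2, 3, 4, 5], 3)
```

The same add-one, drop-one idea carries over to separable 2-D box filters, which is one reason incremental forms are attractive for resampling and pyramid generation.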
A well-known example is that of on-line triangulation using
sequential estimation techniques in point positioning
with aerial photographs. Here the computational procedure
of on-line bundle triangulation is closely tied to the
image coordinate measurement process of a human opera-
tor. The main purpose of this fast sequential estimation is
that of blunder detection at an early stage of the measurement
process with the utilization of quick remeasurement
possibilities and better blunder control capabilities. Im-
portant characteristics of this application are, on the one
hand, the constantly varying size of the state vector (“so-
lution vector” in least squares adjustment terminology) of
bundle adjustment consisting of the exterior orientation
parameters of photographs, the object point parameters
and possibly additional parameters for self-calibration.
On the other hand, the full covariance matrix of all system
parameters is, if at all, only needed at the termination of
the process. A third distinctive characteristic is the high
sparsity, with typical patterns, of the matrices involved in
the estimation procedure (design matrix of observation
equations and normal equation matrices of least squares).
Given these system characteristics a number of sequential
estimation algorithms have been compared to each other
in the past. Firstly, the TFU algorithm (Triangular Factor
Update), which directly updates the upper triangle of the
reduced normal equations, was found to perform much
better than the Kalman form of updating, both in terms of
computing times and storage requirements (Gruen, 1982;
Wyatt, 1982). Later, the Givens transformations were
found to be superior, in general, to the TFU (Runge, 1987;
Holm, 1989) both in computational performance and in the
ease of mechanization and software implementation. In
the meantime the Givens algorithm has been implemented
in a number of systems (Edmundson, 1991; Kersten et al.,
1992). As early as the mid-1980s, Gruen (1985a) envisioned
semi-automatic or fully automatic digital real-time trian-
gulation systems for the future. We argue nowadays that
machine vision and in particular robot vision could draw
substantial advantages from these sequential approaches.
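To make the idea of a Givens-based sequential update concrete, the following is a minimal sketch, not the implementation used in any of the systems cited above. It assumes unit-weight observations, a dense upper-triangular factor `R` with right-hand side `d` (so that the current estimate solves `R x = d`), and hypothetical function names `givens_update` and `back_substitute`. Sparsity exploitation and a varying number of unknowns are deliberately left out.

```python
import math


def givens_update(R, d, a, l):
    """Fold one observation equation a . x = l into R x = d, in place.

    R is an n x n upper-triangular list of lists, d a list of length n.
    Each plane rotation zeroes one element of the new row against the
    corresponding diagonal element of R. Returns the rotated residual term.
    """
    n = len(d)
    a = list(a)                            # work on a copy of the new row
    for i in range(n):
        if a[i] == 0.0:
            continue                       # nothing to eliminate in this column
        r = math.hypot(R[i][i], a[i])
        c, s = R[i][i] / r, a[i] / r       # rotation cosine and sine
        for j in range(i, n):
            rij, aj = R[i][j], a[j]
            R[i][j] = c * rij + s * aj
            a[j] = -s * rij + c * aj
        d[i], l = c * d[i] + s * l, -s * d[i] + c * l
    return l


def back_substitute(R, d):
    """Solve the upper-triangular system R x = d (non-zero diagonal assumed)."""
    n = len(d)
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = d[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / R[i][i]
    return x


# Illustrative data: observations of the line y = 1 + 2 t, added one at a time.
R = [[0.0, 0.0], [0.0, 0.0]]
d = [0.0, 0.0]
for t in (0.0, 1.0, 2.0):
    givens_update(R, d, [1.0, t], 1.0 + 2.0 * t)
x = back_substitute(R, d)                  # estimates of intercept and slope
```

Because every new observation only rotates into the existing triangular factor, the normal equations are never formed or re-solved from scratch. A solution vector can be obtained at any stage by back substitution alone.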
The value of sequential estimation has evidently been acknowledged by the computer
vision community, where, among others, two recent
developments are of particular interest. In Matthies et al.
(1989) the Kalman filter is used to estimate a depth map
from image sequences. Typical for this approach is that