(window) data; no prior signal conditioning or smoothing
is necessary.
A severe drawback of commonly used edge finding
methods (e.g., all ’classical’ operators) is that they are
purely signal driven and lack scene-descriptive criteria;
they treat ’right’ and ’wrong’ edges, e.g., due to shadows,
equally. Poor performance will usually also result under
the influence of noise or texture, both inevitable in natural
scenes. But even optimized algorithms cannot resolve
ambiguities on the low level, and even less so if they work
on local support only (as within a window). This shows the need
to include more a priori knowledge or to establish some
control mechanisms. In our case the guiding mechanism
for real-time road boundary and object tracking is based
on spatio-temporal scene interpretation utilizing generic
3D geometrical models for the environment and objects,
a known ego-motion model and the laws of central (per-
spective) projection.
Even when considering the relatively simple shape of
two converging road boundaries in the image, there are
many sources of ambiguity and uncertainty under real
world conditions: e.g., there may be dominant edges
across the road due to shadows, multiple nearby parallel
edges, or intermittent stretches without well-defined
boundaries, all additionally blurred by vehicle
motion (fig. 3).
Accepting ambiguity on the low level allows the use of
simple and fast algorithms there (even more so, if only a
fraction of the whole image is processed). Resolving
ambiguity or uncertainty on a higher level, however,
requires that no essential information is withheld or lost
by the low-level operations. Such loss typically occurs
when only a single result, optimal by local criteria, is
extracted. A well-balanced approach is therefore necessary
to fine-tune the distribution of competence between the
signal-driven and the model-driven processing levels.
Fig. 3: Campus road under difficult conditions
As the proper appearance of the road boundaries in the
image can be easily predicted given the observer's relative
position and the motion state, in the approach used here
local edge extraction is tightly guided and controlled by
the interpretation level; i.e. the interpretation level com-
mands the expected edge direction and location plus some
optional parameters for adapting the algorithm according
to its predictions. In return, it receives a description set of
several edge candidates in the area with the orientation
sought (fig. 3), plus additional ones from potential edges
with similar orientation in a limited sector around the
commanded direction. These are checked against the
expected edge locations; the best candidates satisfying
the model criteria are then selected for updating the state
estimates, or rejected altogether if they fall outside
an allowed threshold distance from the reference position.
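
The checking of edge candidates against the predicted location can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_candidate`, the representation of a candidate as a `(position, strength)` pair, the scalar `gate` threshold, and the choice of "nearest to the prediction" as the selection criterion are all assumptions made for illustration; the text does not specify how the best candidate is ranked.

```python
def select_candidate(candidates, predicted_pos, gate):
    """Pick one edge candidate for the state update, or None if all are rejected.

    candidates    -- list of (position, strength) pairs from the low level
                     (hypothetical representation)
    predicted_pos -- edge position expected by the interpretation level
    gate          -- allowed distance from the prediction (assumed scalar)
    """
    # Reject every candidate that falls outside the allowed threshold.
    inside = [c for c in candidates if abs(c[0] - predicted_pos) <= gate]
    if not inside:
        return None  # no measurement is used for this window in this cycle
    # Assumed criterion: the candidate closest to the predicted position wins.
    return min(inside, key=lambda c: abs(c[0] - predicted_pos))


if __name__ == "__main__":
    edgels = [(12.0, 80.0), (30.0, 200.0), (17.5, 40.0)]
    print(select_candidate(edgels, predicted_pos=14.0, gate=5.0))
    print(select_candidate(edgels, predicted_pos=14.0, gate=1.0))
```

Note that the strong edge at position 30.0 is ignored despite its high strength, because it lies outside the gate; this is the sense in which a 'wrong' edge, such as a shadow boundary, can be suppressed by model knowledge rather than by signal properties.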
The core algorithm correlates an image area along a
search path within the window with an ideal step edge as
reference pattern. A very efficient implementation of this
technique on a conventional microprocessor was
originally given by [Kuhnert 85]. Very similar directional
step edge operators are described in [Canny 86], derived,
however, under optimality aspects with respect to shape
and operator width; computational simplicity and
efficiency have been less emphasized in the latter case.
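
The principle of correlating gray values along a search path with an ideal step edge can be sketched in one dimension as follows. This is a simplified illustration only, not the algorithm of [Kuhnert 85]: it assumes the gray values along the search path have already been sampled into a 1D profile, and the names `step_edge_response` and `find_edge_candidates` as well as the parameters `half_width` and `min_response` are hypothetical.

```python
import numpy as np


def step_edge_response(profile, half_width=4):
    """Correlate a 1D gray-value profile with an ideal step edge template."""
    profile = np.asarray(profile, dtype=float)
    # Ideal step: -1 over the left half of the mask, +1 over the right half.
    template = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    # 'valid' mode slides the (unflipped) template over the profile.
    return np.correlate(profile, template, mode="valid")


def find_edge_candidates(profile, half_width=4, max_candidates=4,
                         min_response=1.0):
    """Return up to max_candidates (position, response) pairs, strongest first."""
    response = step_edge_response(profile, half_width)
    magnitude = np.abs(response)
    candidates = []
    for i in range(len(magnitude)):
        left = magnitude[i - 1] if i > 0 else -np.inf
        right = magnitude[i + 1] if i + 1 < len(magnitude) else -np.inf
        # Keep local maxima of the correlation magnitude above a noise floor.
        if magnitude[i] >= min_response and magnitude[i] > left \
                and magnitude[i] >= right:
            # The step sits at the centre of the mask.
            candidates.append((i + half_width, float(response[i])))
    candidates.sort(key=lambda c: abs(c[1]), reverse=True)
    return candidates[:max_candidates]


if __name__ == "__main__":
    path = [10] * 10 + [50] * 10   # dark-to-bright transition at index 10
    print(find_edge_candidates(path))
```

Returning several local maxima instead of only the strongest one keeps ambiguous evidence available to the interpretation level, which is the design choice argued for in the text. The sign of the response distinguishes dark-to-bright from bright-to-dark transitions.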
A version of Kuhnert’s algorithm with a significantly
improved interface to the interpretation level is being used
here. It is better adapted to noisy real-world scenes and
applies ’bar masks’ with up to 32 discrete orientations,
yielding a directional resolution of down to 6 degrees. Up
to four different edge element (edgel) candidates are ex-
tracted per window, so that for the road boundaries a set
of up to 32 edgels per camera may be passed to the
interpretation level for selection and further analysis.
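
The quoted figures can be checked with a little arithmetic. The sketch below assumes that the 32 mask orientations are spread evenly over 180 degrees, which gives a spacing of 5.625 degrees, in line with the stated resolution of about 6 degrees; the even spacing, the function name `nearest_orientation`, and the reading of 32 edgels as 8 windows with 4 candidates each are assumptions, not statements from the text.

```python
N_ORIENTATIONS = 32                    # number of discrete mask orientations
STEP_DEG = 180.0 / N_ORIENTATIONS      # assumed even spacing: 5.625 degrees


def nearest_orientation(angle_deg):
    """Map a commanded edge direction to the nearest discrete mask orientation.

    Returns (mask index, mask angle in degrees). Edge directions are treated
    as equivalent modulo 180 degrees.
    """
    index = int(round((angle_deg % 180.0) / STEP_DEG)) % N_ORIENTATIONS
    return index, index * STEP_DEG


if __name__ == "__main__":
    print(STEP_DEG)                    # spacing between neighbouring masks
    print(nearest_orientation(47.0))   # commanded direction of 47 degrees
    # Assumed breakdown of the edgel budget per camera:
    print(8 * 4)                       # 8 windows x 4 candidates per window
```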
On an Intel 80286 microprocessor (8 MHz, no wait
states) it takes less than two video cycles (40 ms) to
analyse two windows (48x48 pixels each) in succession
at different locations for three different edge orientations
and to extract a set of edge candidates for each window.
In the transputer system this step is performed on one T222
processor within 8 windows.
4. Intelligent navigation using landmarks
Since the intelligent behavior of an autonomous system
is defined as making decisions in response to
environmental events, it follows that at least a
crude understanding of the task domain is a basic
requirement. The following section presents the evolution
from dead reckoning to path following and finally to
landmark navigation.
The main sensors for the navigation task performed with
'ATHENE' have been precision shaft encoders on both
rear wheels and the steering, one rate gyroscope for measuring
the turn rate of the robot, and one black and white TV
camera with an image sequence processing system.
Each of the different sensor types has its specific merits,
depending on the robot's state. The signals of the shaft
encoders are useful as long as the robot operates on
smooth and well defined surfaces with moderate move-