A dynamic workspace monitoring system must provide the possibility to arbitrarily and dynamically define a 3-dimensional
separation surface, which is supervised in real time to detect intruding objects. With such a system, a given workspace
can be used almost simultaneously by humans and robots, with the robot inspecting its workspace for intruding objects.
As a long-term objective we envision that robots and humans will cooperate almost as easily and securely as people
working together.
Various image processing methods for the supervision of a workspace have been proposed by researchers:
• (Skifstad & Jain, 1989; Baerveldt, 1992) used change-detection methods to detect intrusions into a given workspace.
Due to the lack of 3-D information, the possible separation surfaces are restricted to conical surfaces along the
viewing direction of the camera.
• Three-dimensional separation surfaces can be defined based on a complete 3-D image of the scene produced by
a passive texture-based sensor. However, this method is very computationally intensive, since corresponding points
must be searched. Stereo matching in real time is at the very limit of today's computing power.
• (Gouvianakis et al., 1991) presented a method in which stereo analysis is performed only in regions where a change
occurred. This reduces the number of correspondence searches. However, changes far away from the zone of interest,
as well as shadows, are still detected and trigger the 3-D analysis. In industrial environments, where neither the
background nor the lighting can be controlled, this is a considerable drawback.
A flexible, dynamic workspace monitoring system which does not have the above-mentioned drawbacks must satisfy the
following requirements:
• The possibility to arbitrarily and dynamically define a 3-dimensional separation surface according to the needs of
humans or robots. Dynamic definition makes space not needed by the robot available for human work.
• No hindering of humans or robots at work. The system should provide a maximum of security, and humans
should be able to work as if the system did not exist.
• Detection of intruders in real time to allow stopping the robot or changing its movement to avoid collisions.
• The ability to work in an industrial environment. This implies that the system must be robust against external
influences such as varying lighting conditions and shadows cast by other machines or workers in the same environment.
2. THE PRINCIPLE
Our method allows separation surfaces between humans and robots to be defined arbitrarily (figure 1) and, in addition,
to be adapted dynamically to the robot's movements. The method relies on texture-based stereo. However, it differs
from the well-known texture-based stereo algorithms in that no time-consuming correspondence search is needed.
With a conventional stereo algorithm, the corresponding points must be found in both images of the stereo pair. This
is usually done by correlating the neighbourhoods of two points along the epipolar line (the constraint on the
correspondence search imposed by the geometric setup of the cameras). Given the geometry of the cameras and the image
coordinates of two corresponding points, the 3-D coordinates can be calculated. The most computationally intensive
part of this method is the correspondence search.
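For a rectified stereo pair, the triangulation step just described can be sketched as follows. The function name and all numeric values (focal length f in pixels, baseline b in metres, pixel coordinates) are illustrative assumptions, not taken from the paper:

```python
def triangulate(x_left, x_right, y, f, b):
    """Recover 3-D coordinates from a matched point pair.

    Assumes rectified cameras: epipolar lines are horizontal, so
    corresponding points share the same row y and differ only in x.
    """
    d = x_left - x_right          # disparity in pixels
    if d <= 0:
        raise ValueError("point must lie in front of the cameras")
    Z = f * b / d                 # depth via similar triangles
    X = x_left * Z / f            # back-project the left-image coordinates
    Y = y * Z / f
    return X, Y, Z

# Example: f = 500 px, baseline b = 0.1 m, disparity 10 px -> depth 5 m
X, Y, Z = triangulate(250.0, 240.0, 100.0, f=500.0, b=0.1)
```

The expensive part of a conventional algorithm is not this arithmetic but finding x_right for each x_left, i.e. the correspondence search itself.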
For a monitoring system it is sufficient to detect objects at the moment they enter the robot's workspace. Consequently,
it is sufficient to know whether an object is in the separation surface; there is no need to compute a complete 3-D
description of the scene. In contrast to stereo algorithms, where corresponding points must be searched for, we
calculate where the corresponding point would be under the assumption that the object lies in the separation surface.
The time-consuming correspondence search is thus replaced by the test of a hypothesis, which is much more efficient.
The 3-D coordinates of a point can be calculated if its disparity and the geometry of the cameras are known.
Conversely, the disparity of two corresponding points can be calculated if the 3-D location of the point is known.
Based on this, the left image is transformed into the right image* under the assumption that all points are located
on the separation surface. The image of an object situated in the separation surface then appears at identical
positions in the hypothetical and the real camera image. The disparity between the two regions after the
transformation is called the 'differential disparity'. Objects on the separation surface can be found by comparing
the two images; they appear as areas that are identical in both images.
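As a minimal illustration of this inverse computation, consider the simplest possible separation surface: a fronto-parallel plane at constant depth Z0. Every point on it has the same expected disparity f·b/Z0, so the hypothetical right image is just a horizontal shift of the left image. The sketch below assumes rectified cameras and nearest-pixel shifting; function names and parameters are illustrative, not from the paper:

```python
import numpy as np

def expected_disparity(f, b, Z0):
    """Disparity every point would have if it lay on the plane Z = Z0."""
    return f * b / Z0

def transform_left_to_right(left, f, b, Z0):
    """Hypothetical right image under the surface hypothesis.

    With d = x_left - x_right, the right-image pixel at column x
    corresponds to the left-image pixel at column x + d, so the
    transform shifts the left image by d pixels.
    """
    d = int(round(expected_disparity(f, b, Z0)))
    hypo = np.zeros_like(left)
    hypo[:, :left.shape[1] - d] = left[:, d:]
    return hypo
```

For a general (curved or tilted) separation surface the shift simply varies per pixel, which is why the transformation coordinates can still be tabulated once in advance.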
The method consists of the following steps:
0. Initialization. All parameters of the cameras and of the image transformation are calculated in advance.
1. Transformation. A hypothetical image, assumed to be seen by the other camera, is produced, given the separation
surface as a parameter. Because this transformation depends only on the surface, the coordinates for the
transformation can be calculated in advance.
2. Filtering. Both images are filtered in order to suppress noise and to preprocess the image data for the subsequent
correlation. Finite Impulse Response (FIR) filters are used.
3. Comparison. The two images are compared by correlating each pixel, including its neighbourhood, with the pixel
at the same location in the other image. The overall computing cost and the performance of the system depend
strongly on the chosen correlation method.
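The three on-line steps might be sketched as follows, assuming the transformation coordinates (map_y, map_x) were precomputed at initialization for the chosen separation surface. The 3x3 box filter and the SSD-based comparison are illustrative stand-ins for whatever FIR filter and correlation measure an implementation would actually use:

```python
import numpy as np

def warp(left, map_x, map_y):
    """Step 1: hypothetical right image via nearest-neighbour lookup
    of the precomputed transformation coordinates."""
    return left[map_y, map_x]

def fir_smooth(img):
    """Step 2: 3x3 box filter, a simple FIR low-pass, to suppress noise."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
    return out / 9.0

def on_surface_mask(hypo, right, thresh=10.0):
    """Step 3: per-pixel neighbourhood SSD; a low value marks areas that
    are identical in both images, i.e. objects on the separation surface."""
    a, b = fir_smooth(hypo), fir_smooth(right)
    ssd = (a - b) ** 2
    acc = fir_smooth(ssd) * 9.0   # sum of squared differences over 3x3
    return acc < thresh
```

Because the warp is a pure table lookup and the remaining steps are fixed-size filters, the per-frame cost is constant and independent of scene content, in contrast to a correspondence search.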
* For simplicity we consistently transform from the left to the right image; of course, the transformation can also be done in the opposite direction.
IAPRS, Vol. 30, Part 5W1, ISPRS Intercommission Workshop "From Pixels to Sequences", Zurich, March 22-24 1995