Another application is exception handling. There can be mixed object streams (e.g., on conveyor belts). The robot vision system knows object models for most of the objects that have to be grasped, but every once in a while an unknown object arrives. Whenever the vision system realizes that it cannot recognize the object, it invokes an auxiliary system (not relying on object models) which then analyzes the scene and derives possible gripping points. Alternatively, the model-based system and the auxiliary system could run continuously in parallel, with the results of the auxiliary system used only when the model-based component fails.
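As a minimal sketch of this fallback scheme (all names are purely illustrative and assume the recognizer signals failure by returning nothing; this is not a description of any of the systems discussed):

    def plan_grasp(scene, recognize, derive_gripping_points):
        """Return gripping points for the object in view.

        recognize:
            model-based recognizer; returns gripping points for a known
            object, or None when the object cannot be recognized.
        derive_gripping_points:
            auxiliary, model-free analysis that derives gripping points
            directly from the scene data.
        """
        points = recognize(scene)
        if points is not None:
            return points                     # known object: model-based grasp
        return derive_gripping_points(scene)  # unknown object: fall back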
From the given examples it is evident that techniques for object manipulation without the use of object models can be useful in many application areas. Up to now, however, little work has been done by the robot vision community on scenes consisting of heaps of unmodeled objects.
The work reported in this paper aims at developing methods to infer, from sensor data alone, sufficient information for a robot to grasp an unmodeled object and take it away. Obviously, when we say that an object is unknown, we still make some minimal assumptions, e.g., that the objects are not too soft or elastic, that their size and weight are compatible with the properties of the gripper, and that their surfaces are piecewise smooth so that they can be modeled at least locally. The robot's vision system must be provided with the capability to extract and represent surface patches in its workspace. This representation of the scene must be rich enough to support segmentation, i.e., to find out which patches are likely to belong to one object; to decide which hypothesized object has the best “grippability” at a given moment; and finally to take that object away without collision with other objects. All knowledge needed for this is extracted from the range data. In accordance with this outline we are building a complete system, from data acquisition to action, adhering to the paradigm of “purposive vision”.
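To make the selection step of such a chain concrete, the following Python sketch uses illustrative data structures of our own choosing; it is not the implementation developed in this paper:

    from dataclasses import dataclass, field

    @dataclass
    class ObjectHypothesis:
        """A group of surface patches believed to belong to one object."""
        patches: list
        grippability: float = 0.0   # quality of the best grasp found so far
        occluders: list = field(default_factory=list)  # hypotheses blocking removal

    def select_next_object(hypotheses):
        """Choose the hypothesis to take away next: prefer unobstructed
        objects, and among those the one with the best grippability."""
        free = [h for h in hypotheses if not h.occluders]
        return max(free or hypotheses, key=lambda h: h.grippability)

    heap = [ObjectHypothesis(patches=["A"], grippability=0.6),
            ObjectHypothesis(patches=["B", "C"], grippability=0.9,
                             occluders=["A"])]
    select_next_object(heap)   # -> the unobstructed hypothesis with patches ["A"]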
2 RELATED WORK 
Work on deriving grasps for single unmodeled objects includes [Boissonnat 1982] and [Stansfield 1991]. Boissonnat describes a method to find stable grasps for a robot gripper without the use of object models, merely from an analysis of the object silhouette, which is approximated by a polygonal sequence. The grasps are ranked by quality, determined by a criterion with four components. An extension to three-dimensional silhouettes is proposed. Stansfield proposes to grasp single unmodeled objects with a knowledge-based approach that draws on theories of human grasping behavior. From range data, a representation of the sensed object in the form of a set of up to five aspects is generated. This symbolic representation is used by a rule-based system to derive a set of possible grasps for the object. The gripper is a three-fingered Salisbury hand.
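The four components of Boissonnat's criterion are not reproduced here; the following Python sketch merely illustrates the general idea of ranking candidate grasps by a weighted composite quality score, with placeholder components:

    def rank_grasps(candidates, components, weights):
        """Rank candidate grasps best-first by a weighted sum of
        component quality criteria (each a function of one grasp)."""
        def quality(grasp):
            return sum(w * c(grasp) for w, c in zip(weights, components))
        return sorted(candidates, key=quality, reverse=True)

    # components could, for instance, score edge parallelism, contact
    # length, distance to the centroid, and clearance -- placeholders only.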
Several authors have employed generic object models, where the type of the admitted objects is known but the dimensions of the instances occurring in the scene have to be determined. [Ikeuchi and Hebert 1990] describe a vision system for a planetary explorer intended to collect rock samples automatically. Pebbles that are not touching each other lie partially buried in sand, and range images of them are taken. The visible surface parts serve to estimate shape and pose parameters of superquadrics. The pebbles are then picked up by a kind of shovel-excavator robot. [Tsikos and Bajcsy 1991] describe a system which is able to remove objects from a heap, one by one. The heap lies on a base plane, and single range views and/or intensity images are taken from an essentially vertical direction, so vertical or overhanging surfaces cannot be seen. However, they assume that only convex objects are admitted, more specifically objects from the postal domain, i.e., flats, parcels, and tubes. This generic model knowledge helps them to interpret the views and to identify grasps. [Mulgaonkar et al. 1992] have also worked on a project for the US Postal Service. They tried to physically understand object configurations using range images. Generic object models, i.e., boxes and cylinders with circular cross sections, were used.
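For reference, superquadrics are commonly described by Barr's inside-outside function F, which equals 1 for points on the surface, is less than 1 inside, and greater than 1 outside. The Python sketch below evaluates F for a superquadric in canonical pose with size parameters a1, a2, a3 and shape exponents e1, e2; pose estimation and the actual fitting procedure of [Ikeuchi and Hebert 1990] are omitted:

    def inside_outside(x, y, z, a1, a2, a3, e1, e2):
        """Inside-outside function of a superquadric in canonical pose:
        F = 1 on the surface, F < 1 inside, F > 1 outside."""
        f_xy = (abs(x / a1) ** (2.0 / e2)
                + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
        f_z = abs(z / a3) ** (2.0 / e1)
        return f_xy + f_z

    # With e1 = e2 = 1 the superquadric is an ellipsoid, e.g.:
    # inside_outside(1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0) == 1.0
    # Fitting then searches for the parameters (and a pose) minimizing a
    # residual such as the sum of (F - 1)**2 over the measured range points.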
Our work differs from the above in that it deals with heaps of unknown objects which need not be convex or conform to a generic model. We use two opposite oblique range sensors, which makes it possible to also sense vertical and even overhanging surfaces, an advantage in terms of descriptive power.
Our emphasis is on identifying grasping opportunities in the heap rather than objects. This could be termed action-based recognition, and it is interesting to compare it to function-based object recognition as proposed by [Stark et al. 1993]. The latter is (generic) object recognition based on the detection of features in the object instance which, after evidence accumulation, make it possible to identify the function for which people use the object and therefore the object class. In our approach we recognize not object classes but action classes that a robot might perform on a certain part of a heap. The means to arrive at the recognition of grasping opportunities is accumulation of evidence, as in [Stark et al. 1993].
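A minimal Python sketch of such evidence accumulation follows; feature names, weights, and the threshold are illustrative assumptions, not taken from [Stark et al. 1993] or from our system:

    def accumulate_evidence(observed_features, weights, threshold=1.0):
        """Sum the weighted evidence contributed by the observed features
        and accept the grasping opportunity if it exceeds the threshold."""
        score = sum(weights.get(f, 0.0) for f in observed_features)
        return score, score >= threshold

    score, accepted = accumulate_evidence(
        ["parallel_faces", "finger_clearance", "stable_support"],
        weights={"parallel_faces": 0.5,
                 "finger_clearance": 0.4,
                 "stable_support": 0.3})
    # score == 1.2, accepted == True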