Title
Close-range imaging, long-range vision






AUTOMATIC HIERARCHICAL OBJECT DECOMPOSITION
FOR OBJECT RECOGNITION
M. Ulrich a, b, A. Baumgartner a, C. Steger b
a Chair for Photogrammetry and Remote Sensing, Technische Universität München, Arcisstr. 21,
80290 München, +49 89 289-22671, www.remotesensing.de,
(markus.ulrich, albert.baumgartner)@bv.tum.de
b MVTec Software GmbH, Neherstr. 1, 81675 München, +49 89 457695-0, www.mvtec.com,
(ulrich,steger)@mvtec.com
KEYWORDS: Object Recognition, Real-time, Hierarchical Model, Computer Vision, Industrial Application
ABSTRACT:
Industrial applications of 2D object recognition such as quality control often demand robustness, highest accuracy, and real-time
computation from the object recognition approach. Simultaneously fulfilling all of these demands is a hard problem and has recently
drawn considerable attention within the research community of close-range photogrammetry and computer vision. The problem is
complicated when dealing with objects or models consisting of several rigid parts that are allowed to move with respect to each
other. In this situation, approaches searching for rigid objects fail since the appearance of the model may substantially change under
the variations caused by the movements. In this paper, an approach is proposed that not only facilitates the recognition of such parts-
based models but also fulfills the above demands. The object is automatically decomposed into single rigid parts based on several
example images that express the movements of the object parts. The mutual movements between the parts are analyzed and
represented in a graph structure. Based on the graph, a hierarchical model is derived that minimizes the search effort during a
subsequent recognition of the object in an arbitrary image.
1. INTRODUCTION
Object recognition is part of many computer vision applications.
It is particularly useful for industrial inspection tasks, where
often an image of an object must be aligned with a model of the
object. The transformation (pose) obtained by the object
recognition process can be used for various tasks, e.g., pick and
place operations, quality control, or inspection tasks. In most
cases, the model of the object is generated from an image of the
object. Such pure 2D approaches are frequently used, because it
usually is too costly or time consuming to create a more
complicated model, e.g., a 3D CAD model. Therefore, in
industrial inspection tasks one is typically interested in
matching a 2D model of an object to the image. A survey of
matching approaches is given in (Brown, 1992). The simplest
class of object recognition methods is based on the gray values
of the model and the image (Brown, 1992; Lai and Fang, 1999).
A more complex class of object recognition uses the object’s
edges for matching, e.g., the mean edge distance (Borgefors,
1988), the Hausdorff distance (Rucklidge, 1997), or the
generalized Hough transform (GHT) (Ballard, 1981).
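To make the edge-based class more concrete, the following is a minimal sketch of the Hausdorff distance between two sets of edge points. It is a brute-force illustration only, not the implementation used in (Rucklidge, 1997) or in this paper; the function names and the sample point sets are assumptions chosen for the example.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical edge points (x, y): a model edge and an image edge that is
# shifted by one pixel and contains one additional clutter point.
model_edges = [(0, 0), (1, 0), (2, 0)]
image_edges = [(0, 1), (1, 1), (2, 1), (2, 4)]

model_to_image = directed_hausdorff(model_edges, image_edges)
symmetric = hausdorff(model_edges, image_edges)
```

In this example the directed distance from model to image stays small, while the symmetric distance is dominated by the single clutter point. This illustrates why the plain measure is sensitive to clutter and why robust variants are typically needed in practice.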
None of the above approaches simultaneously meets the
high industrial demands: robustness to occlusions, clutter,
arbitrary illumination changes, and sensor noise as well as high
recognition accuracy and real-time computation. Therefore, we
developed two approaches, a new similarity measure (Steger,
2001), which uses the edge direction as feature, and a
modification of the GHT (Ulrich et al., 2001), which eliminates
the disadvantages of slow computation, large memory requirements,
and the limited accuracy of the GHT. Extensive performance
evaluations (Ulrich and Steger, 2001), which also include a
comparison to standard recognition methods, showed that our
two novel approaches have considerable advantages.
All of the above-mentioned recognition methods have in
common that they require some form of a rigid model
representing the object to be found. However, in several
applications the assumption of a rigid model is not fulfilled.
Elastic or flexible matching approaches (Bajcsy and Kovacic,
1989; Jain et al., 1996) are able to match deformable objects,
which appear in medicine when dealing with magnetic
resonance imaging or computer tomography, for example.
Approaches for recognizing articulated objects are also
available, especially in the field of robotics (Hauck et al., 1997).
In industrial applications like quality control or inspection
tasks, however, it is less important to find elastic or articulated
objects than to find objects that consist of several rigid parts that
show arbitrary mutual movement, i.e., variations in distance and
orientation. These variations potentially occur whenever a
process is split into several single procedures that are,
intentionally or not, insufficiently "aligned" to each other, e.g.,
when applying a tampon (pad) print using several stamps or when
equipping a circuit board with transistors or soldering points.
An example is given in Figure 1, which shows several prints on
the clip of a pen. The four images illustrate the mutual
movements (variations) of the object parts: the position of the
print on the clip varies and the dark gray part of the print moves
relative to the light gray part. Clearly, if the object is treated
as rigid, it may not be found by the recognition approach.
However, when trying to find the individual parts separately the
search becomes computationally expensive since each part must
be searched for in the entire image and the relations between the
parts are not taken into account. This problem can hardly be
solved by treating the object as articulated, since there is no
true justification for hinges and the mutual variations can be
more general. Because the object consists of several rigid parts,
models for elastic objects obviously cannot represent these
movements either.
One possible solution is to generate several models each
representing one configuration of the model parts and to match
all of these models to the image. However, for large variations,
this is very inefficient and not practical considering real-time
computation.
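The inefficiency of this multi-model solution can be illustrated with a rough count. The sketch below assumes that every moving part may shift and rotate independently within a symmetric range, sampled at a fixed step; all ranges, step sizes, and function names are hypothetical and are not taken from the paper.

```python
def configurations_per_part(max_shift_px, max_rot_deg, shift_step=1, rot_step=1):
    """Number of sampled poses of one part relative to a reference part.

    Assumes shifts in x and y within +/- max_shift_px and rotations within
    +/- max_rot_deg, each sampled at the given step (hypothetical values).
    """
    shifts = 2 * (max_shift_px // shift_step) + 1
    rotations = 2 * (max_rot_deg // rot_step) + 1
    return shifts * shifts * rotations


def rigid_models_needed(num_moving_parts, per_part):
    """Rigid models required if every combination of part poses gets its own model."""
    return per_part ** num_moving_parts


# Example: 3 parts move relative to a reference part, each by up to
# 5 pixels and 2 degrees, sampled at 1 pixel and 1 degree.
per_part = configurations_per_part(max_shift_px=5, max_rot_deg=2)
total_models = rigid_models_needed(num_moving_parts=3, per_part=per_part)
```

Under these assumptions each part alone has 605 sampled poses, and three independently moving parts already require more than 200 million rigid models. The count grows exponentially with the number of parts, which is why matching one model per configuration is not compatible with real-time computation.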