ISPRS Commission III, Vol. 34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
PERFORMANCE COMPARISON OF
2D OBJECT RECOGNITION TECHNIQUES
Markus Ulrich a,b and Carsten Steger b
a Chair for Photogrammetry and Remote Sensing, Technische Universität München,
Arcisstraße 21, 80290 München, Germany - markus.ulrich@bv.tu-muenchen.de
b MVTec Software GmbH, Neherstraße 1, 81675 München, Germany - {ulrich,steger}@mvtec.com
Commission III, Working Group III/5
KEY WORDS: Computer Vision, Object Recognition, Performance Comparison, Industrial Application
ABSTRACT
We propose an empirical performance evaluation of five different 2D object recognition techniques. For this purpose, two novel
recognition methods that we have developed with the aim of fulfilling increasing industrial demands are compared to the normalized
cross correlation and the Hausdorff distance as two standard similarity measures in industrial applications, as well as to PatMax® —
an object recognition tool developed by Cognex. Additionally, a new method for refining the object's pose based on a least-squares
adjustment is included in our analysis. After a description of the respective methods, several criteria that allow an objective evaluation
of object recognition approaches are introduced. Experiments on real images are used to apply the proposed criteria. The experimental
set-up used for the evaluation measurements is explained in detail. The results are illustrated and analyzed extensively.
1 INTRODUCTION
Object recognition is used in many computer vision applications.
It is particularly useful for industrial inspection tasks, where often
an image of an object must be aligned with a model of the object.
The transformation (pose) obtained by the object recognition pro-
cess can be used for various tasks, e.g., pick and place operations
or quality control. In most cases, the model of the object is gener-
ated from an image of the object. This 2D approach is taken be-
cause it usually is too costly or time consuming to create a more
complicated model, e.g., a 3D CAD model. Therefore, in indus-
trial inspection tasks one is usually interested in matching a 2D
model of an object to the image. The object may be transformed
by a certain class of transformations, e.g., rigid transformations,
similarity transformations, or general 2D affine transformations.
The latter are usually taken as an approximation to the true per-
spective transformations an object may undergo.
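As an illustration only, the three transformation classes named above can be written as 2x3 matrices acting on image points. The following minimal Python sketch is not taken from any of the evaluated methods; the function names (rigid, similarity, affine, apply_transform) and the parametrization by angle, scale, and translation are assumptions chosen for this example.

```python
import math

def rigid(angle, tx, ty):
    """Rotation by `angle` (radians) plus translation: 3 parameters."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, tx],
            [s,  c, ty]]

def similarity(scale, angle, tx, ty):
    """Rigid transformation plus uniform scaling: 4 parameters."""
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    return [[c, -s, tx],
            [s,  c, ty]]

def affine(a11, a12, a21, a22, tx, ty):
    """General 2D affine transformation: 6 parameters."""
    return [[a11, a12, tx],
            [a21, a22, ty]]

def apply_transform(t, point):
    """Map a 2D point (x, y) through the 2x3 matrix `t`."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])

# A rigid transformation is a similarity transformation with scale 1,
# and a similarity transformation is a special affine transformation.
angle = math.pi / 2
rigid_pose = rigid(angle, 10.0, 0.0)
same_pose = similarity(1.0, angle, 10.0, 0.0)

# Scale by 2, rotate by 90 degrees, then shift by 10 along x.
p = apply_transform(similarity(2.0, angle, 10.0, 0.0), (1.0, 0.0))
```

The nesting of the classes is what makes the choice a trade-off: a larger class covers more of the distortions an object may undergo, but the pose has more parameters to search over (3, 4, and 6, respectively).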
A large number of object recognition strategies exist. All approaches
to object recognition examined in this paper (possibly with the
exception of PatMax®) use pixels as their geometric features, i.e.,
not higher-level features like lines or elliptic arcs. Since PatMax®
is a commercial software tool, a detailed technical description is
not available, and therefore no statements about the features it uses
can be made within the scope of this paper. Nevertheless, we included
PatMax® in our evaluation because it is one of the most powerful
commercial object recognition tools. Thus, we are able to rate the
performance of our two novel approaches by comparing them not only to
standard recognition techniques but also to a high-end software product.
Several methods have been proposed to recognize objects in im-
ages by matching 2D models to images. A survey of matching
approaches is given in (Brown, 1992). In most implementations of 2D
object recognition by matching, the search is performed in a
coarse-to-fine manner, e.g., by using image pyramids (Tanimoto, 1981).
The simplest class of object recognition methods is
based on the gray values of the model and the image themselves and uses
normalized cross correlation or the sum of squared or absolute
differences as a similarity measure (see (Brown, 1992) or (Lai
and Fang, 1999), for example). A more complex class of object
recognition methods does not use the gray values of the model
or the object itself, but uses the object's edges for matching.
Two representatives of this class are hierarchical chamfer
matching (Borgefors, 1988) and the Hausdorff distance (see
(Rucklidge, 1997) or (Olson and Huttenlocher, 1997)). Finally,
another class of edge based object recognition algorithms is based
on the generalized Hough transform (GHT) (Ballard, 1981). Ap-
proaches of this kind have the advantage that they are robust to
occlusion as well as clutter. Unfortunately, the GHT in its
conventional form requires large amounts of memory and long
computation times to recognize the object.
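To make the two standard similarity measures mentioned above more concrete, the following minimal Python sketch computes the normalized cross correlation and the sum of squared differences on flattened gray value patches, and the Hausdorff distance on edge point sets. It is an illustration under simplifying assumptions, not the implementation evaluated in this paper: the function names and the toy data are hypothetical, the Hausdorff distance is shown in its plain textbook form (the cited works may use modified variants), and no search over poses or image pyramid levels is included.

```python
import math

def ncc(template, window):
    """Normalized cross correlation of two equally sized gray value lists.

    The result lies in [-1, 1]; it is 1 if the window equals the template
    up to a linear brightness change with positive gain.
    """
    n = len(template)
    mean_t = sum(template) / n
    mean_w = sum(window) / n
    dt = [t - mean_t for t in template]
    dw = [w - mean_w for w in window]
    norm = math.sqrt(sum(d * d for d in dt) * sum(d * d for d in dw))
    if norm == 0.0:
        return 0.0  # a constant patch has no structure to correlate
    return sum(a * b for a, b in zip(dt, dw)) / norm

def ssd(template, window):
    """Sum of squared gray value differences (0 means identical patches)."""
    return sum((t - w) ** 2 for t, w in zip(template, window))

def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(
        min(math.hypot(ax - bx, ay - by) for bx, by in points_b)
        for ax, ay in points_a
    )

def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))

# Gray values: the same pattern under a brightness change (gain 2, offset 5).
template = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in template]
ncc_score = ncc(template, brighter)
ssd_score = ssd(template, brighter)

# Edge points: the image contains the model plus one clutter point.
model_edges = [(0, 0), (1, 0), (2, 0)]
image_edges = [(0, 0), (1, 0), (2, 0), (2, 5)]
model_to_image = directed_hausdorff(model_edges, image_edges)
full_distance = hausdorff(model_edges, image_edges)
```

In this toy example the normalized cross correlation is unaffected by the brightness change while the sum of squared differences is not, and the directed distance from model to image ignores the clutter point while the symmetric distance is dominated by it.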
In this paper our two new approaches are analyzed and their performance
is compared to that of PatMax® and two of the above
mentioned approaches. Additionally, our new method for refining
the object’s pose, i.e., improving the accuracy of the transforma-
tion parameters, based on a least-squares adjustment is included
in our evaluation. The analysis of the performance characteristics
of object recognition methods is an important issue. First, it helps
to identify breakdown points of the algorithm, i.e., areas where
the algorithm cannot be used because some of the assumptions it
makes are violated. Second, it makes an algorithm comparable to
other algorithms, thus helping users in selecting the appropriate
method for the task they have to solve. Therefore, in this paper an
attempt is made to characterize the performance of five different
object recognition approaches, which are briefly introduced in the
following section. A more detailed description of the approaches
can be found in the corresponding references or in (Ulrich and
Steger, 2001) and (Ulrich and Steger, 2002), where the evaluation
is also described more extensively.
2 EVALUATED OBJECT RECOGNITION METHODS
First of all, we introduce some definitions that facilitate the com-
parison between the seven techniques. All recognition methods
have in common that they require some form of representation of
the object to be found, which will be called model below. The