[Figure: overview of the approach. The model image with its ROI and the example images are used to create initial models of the components; their pose parameters (position and orientation) are determined in each example image, and from these the rigid components, the final models, and the hierarchical search tree are derived.]
The relations between the rigid object parts are stored in a fully connected directed graph, where the vertices
represent the object parts and the edge between vertices i and j
describes the overall movement of part j relative to part i. By
computing the shortest arborescence of the graph we are able to
ascertain a hierarchical search tree that incorporates an optimum
search strategy in the sense that the search effort is minimized.
Finally, the hierarchical model consists of the final models of
the rigid object parts, the relations between the parts, and the
optimum search strategy. Figure 4 shows the search tree and the
corresponding search ranges for each part, which are described
in the relations.
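To make this concrete, the following sketch (not the implementation used here; the part names, the cost function, and the use of the networkx library are merely assumptions for illustration) builds the fully connected relation graph and extracts the search tree as its minimum spanning arborescence:

```python
# Minimal sketch: derive the hierarchical search tree from the pairwise
# relations by computing the minimum spanning arborescence of the fully
# connected directed relation graph.  Edge weights approximate the
# search effort; all names and numbers are illustrative.
import networkx as nx

def build_search_tree(parts, relations):
    """parts: list of part ids.
    relations[(i, j)]: relative movement of part j w.r.t. part i over all
    example images (assumed keys 'dx_range', 'dy_range', 'angle_range')."""
    g = nx.DiGraph()
    g.add_nodes_from(parts)
    for (i, j), rel in relations.items():
        # Search effort of finding j relative to i: translation area
        # times orientation range (a simple stand-in for the real cost).
        effort = rel["dx_range"] * rel["dy_range"] * max(rel["angle_range"], 1e-6)
        g.add_edge(i, j, weight=effort)
    # Shortest (minimum) arborescence = hierarchical search strategy
    # with minimal accumulated search effort.
    return nx.minimum_spanning_arborescence(g, attr="weight")

# Hypothetical example with three parts:
relations = {
    ("upper_body", "left_arm"):  {"dx_range": 40, "dy_range": 30, "angle_range": 0.5},
    ("upper_body", "left_hand"): {"dx_range": 90, "dy_range": 80, "angle_range": 1.2},
    ("left_arm", "left_hand"):   {"dx_range": 25, "dy_range": 20, "angle_range": 0.9},
    ("left_arm", "upper_body"):  {"dx_range": 40, "dy_range": 30, "angle_range": 0.5},
    ("left_hand", "upper_body"): {"dx_range": 90, "dy_range": 80, "angle_range": 1.2},
    ("left_hand", "left_arm"):   {"dx_range": 25, "dy_range": 20, "angle_range": 0.9},
}
tree = build_search_tree(["upper_body", "left_arm", "left_hand"], relations)
print(sorted(tree.edges()))  # parent -> child pairs of the search hierarchy
```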
Figure 4. Result of the automatic hierarchical model
generation. The vertices in the search tree correspond to the
reference points (center of gravity) of each final model part.
The search tree represents the optimum search strategy, e.g.,
the left hand is searched relative to the left arm and not
relative to the upper body since the search range is smaller.
The relative search ranges for the reference point of each part
are visualized by white rectangles. The orientation search
range is visualized by white circle sectors.
The hierarchical model can then be used to search the entire
object containing the movable parts in an arbitrary search
image. This is performed by searching the model parts in the
image using the chosen similarity measure. Note that only one
part must be searched within the entire search range, whereas
the remaining parts can be searched in a very limited search
space, which is defined by the relations in combination with the
search tree.
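The following sketch outlines one possible way to drive this search with the search tree; find_part, the hierarchy interface, and the relation attributes are hypothetical placeholders for the actual recognition method and data structures:

```python
# Minimal sketch of applying the hierarchical model to a search image.
# `find_part` stands in for the underlying rigid object recognition
# (e.g. the similarity measure of Steger, 2001).
def search_object(search_image, hierarchy, find_part):
    """hierarchy.root: id of the root part.
    hierarchy.children(p): parts searched relative to p.
    hierarchy.relation(p, c): relative search range of c w.r.t. p."""
    poses = {}
    # Only the root part is searched within the entire image ...
    poses[hierarchy.root] = find_part(search_image, hierarchy.root,
                                      search_range=None)
    stack = [hierarchy.root]
    while stack:
        parent = stack.pop()
        for child in hierarchy.children(parent):
            rel = hierarchy.relation(parent, child)
            # ... all other parts only within the small search space
            # predicted by the parent pose and the stored relation.
            restricted = rel.predict_range(poses[parent])
            poses[child] = find_part(search_image, child,
                                     search_range=restricted)
            stack.append(child)
    return poses
```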
3. DETAILED DESCRIPTION
In this section, the individual steps of the algorithm introduced
in section 2 are explained in detail.
3.1 Initial Decomposition
In the first step, the object, which is defined by the ROI in the
model image, is initially broken up into small components. This
can be done either automatically or interactively by the user.
The condition the initial decomposition must fulfill is that each
rigid object part must be represented by at least one component;
otherwise the algorithm is not able to split such a component later
on and find the rigid object parts automatically. Therefore, an
over-segmentation should be preferred. However, very small
components lack the property of being unique; this can be
compensated for by our approach (see section 3.2). In our current
implementation, edges are extracted in the model image by ap-
plying a threshold on the Sobel filter amplitude. The connected
components of the edges are treated as individual initial com-
ponents. Small components are either eliminated or affiliated to
neighboring components. In Figure 5 the components are
visualized by different colors.
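For illustration, a minimal sketch of such an edge-based decomposition using NumPy/SciPy is given below; the gradient threshold and the minimum component size are assumed values, and small components are simply eliminated rather than merged:

```python
# Minimal sketch of the initial decomposition: threshold the Sobel
# amplitude inside the ROI and treat connected edge regions as the
# initial components.  Parameter values are hypothetical.
import numpy as np
from scipy import ndimage

def initial_components(model_image, roi_mask, grad_thresh=30.0, min_size=20):
    # Sobel filter amplitude inside the ROI.
    gx = ndimage.sobel(model_image.astype(float), axis=1)
    gy = ndimage.sobel(model_image.astype(float), axis=0)
    amplitude = np.hypot(gx, gy)
    edges = (amplitude > grad_thresh) & roi_mask
    # Connected components of the edge pixels form the initial components.
    labels, n = ndimage.label(edges, structure=np.ones((3, 3)))
    # Eliminate very small components (alternatively, they could be
    # merged with a neighboring component).
    sizes = ndimage.sum(edges, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_size]
    labels[~np.isin(labels, keep)] = 0
    return labels
```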
Other grouping methods or combinations of them could also be
included in our approach: Gestalt psychology has uncovered a
set of principles guiding the grouping process in the visual
domain (Wertheimer, 1923; Koffka, 1935; Rock and Palmer, 1990).
Computer vision has taken advantage of these principles, e.g., in
the field of perceptual organization and grouping (Ullman, 1979;
Marr, 1982; Witkin and Tenenbaum, 1983; Lowe, 1985).
Figure 5. The initial decomposition based on image edges
results in 18 components.
3.2 Initial Model Generation and Search
We use an implementation of the similarity measure presented
in (Steger, 2001) as recognition approach to search the initial
components in the example images. This approach, like most
implementations of conventional object recognition methods,
uses image pyramids to speed up the recognition.
However, one has to take care of unfavorable scale-space
effects. In scale-space the edges of the model are influenced by
neighboring edges. This is uncritical in most cases when dealing
with large objects since there are still enough edges in the
model left that are not influenced by neighboring edges and
therefore still enable a good match. However, some problems
occur when dealing with small objects, like the initial
components in our example, since the ratio between the number
of model edges and the number of neighboring edges becomes
small, i.e., the influence of the neighboring edges increases.
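The effect can be reproduced with a small numerical experiment in the spirit of Figure 6 (the profile and the values below are illustrative only):

```python
# Illustrative 1D experiment: a profile with a model edge and a nearby
# neighboring edge; with increasing smoothing (sigma), the gradient
# response at the model edge is increasingly disturbed by the neighbor.
import numpy as np
from scipy.ndimage import gaussian_filter1d

profile = np.zeros(100)
profile[30:] += 100.0     # model edge at x = 30
profile[40:] -= 60.0      # neighboring edge at x = 40 (not part of the model)

for sigma in (1.0, 2.0, 4.0, 8.0):
    smoothed = gaussian_filter1d(profile, sigma)
    gradient = np.gradient(smoothed)
    # Position and strength of the strongest gradient near the model edge:
    peak = np.argmax(np.abs(gradient[20:40])) + 20
    print(f"sigma={sigma}: model edge found at x={peak}, "
          f"amplitude={abs(gradient[peak]):.1f}")
```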
In Figure 6, the principle is shown using a 1D gray value
profile, which includes two edges. Only the left edge belongs to
the model whereas the right is a neighboring edge. In scale-
space the disturbance of the model edge caused by the
neighboring edge increases with the degree of smoothing
(sigma). This problem could be avoided by not using image
pyramids within the recognition method. However, this would
lead to unacceptably high computation times.
Therefore, our solution is to extrapolate the gray values at the
model border into the surrounding area to eliminate the disturbing
neighboring edge (cf. Figure 7). Other, more sophisticated
approaches explicitly model the edges and subsequently
reconstruct the gray values in the surroundings of the edges
(Elder, 1999). These could be incorporated easily.
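One simple way to realize such an extrapolation is to replace every gray value outside the ROI by that of the nearest pixel inside the ROI before the image pyramid is built; the following sketch (the nearest-neighbor scheme and the extrapolation width are assumptions, not necessarily the exact method used here) illustrates this:

```python
# Minimal sketch of the border extrapolation: gray values outside the
# ROI are replaced by the value of the nearest ROI pixel, so that
# neighboring edges no longer disturb the model edges in scale-space.
import numpy as np
from scipy import ndimage

def extrapolate_border(model_image, roi_mask, width=16):
    # Only extrapolate within `width` pixels of the ROI (enough to cover
    # the smoothing support of the pyramid levels used).
    near_roi = ndimage.binary_dilation(roi_mask, iterations=width)
    # For every pixel outside the ROI, find the nearest ROI pixel.
    _, (iy, ix) = ndimage.distance_transform_edt(~roi_mask, return_indices=True)
    result = model_image.copy()
    fill = near_roi & ~roi_mask
    result[fill] = model_image[iy[fill], ix[fill]]
    return result
```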