nd AyV=y"2- y",
E and Ay" in the
tion in the model
e relative position
entation:
2)
' returns accuracy
acy of the relative
error propagation.
pirically. Then, the
3)
this hypothesis
nponents belong to
lculated using the
n in (Koch, 1987).
ll example images
which at row i and
nts i and j belong
x correspond to the
xample images. To
ean or other statisti-
um value. In Figure
gure 3 is displayed.
2x
© OQ
— ©
o
ght Foot
fo
f
} 0 mm]
1
R
e similarity matrix
robabilities that two
elong to the same
art. The higher the
brighter the entry.
the newly clustered
n all example images
ry if we want to avoid
average of the single
1e cluster as pose for
, we can exploit this
yace by calculating
nt and the orientation
ple images. After this
available and the pose
age are computed.
3.5 Analysis of Relations
In this section the pose parameters of the clustered components,
i.e., rigid object parts, are analyzed and the pairwise relations
between part i and j are derived (where i = 1, ..., n, and j =
LA and n, is the number of object parts). For this purpose,
in each image, the pose of part i defines a local coordinate
system in which the pose of part j is calculated. The angle range
that encloses all orientations of part j in the local coordinate
systems of all images describes the angle variation of part j with
respect to part i. The corresponding position variation is
described by the smallest enclosing rectangle of arbitrary
orientation of the reference points of part j in the local
coordinate systems of all images. The principle is exemplified
in Figure 9.
pw : Angle Variation:
Figure 9. The relation between an object pair (rectangle and
ellipse) is computed from the relative poses in the model image
(bold border) and in three example images (upper pictures). In
this example the rectangle is taken as reference and the relative
movement of the ellipse is computed by transforming the ellipse
into the reference system defined by the rectangle (lower
pictures). The overall relative orientation describes the angle
variation (dark circle sector in the right picture) and the smallest
enclosing rectangle of arbitrary orientation of all ellipse
reference points is taken as position variation (dark rectangle in
the right picture). Note that the ambiguities because of
symmetries of both objects are solved according to (1).
Apart from the angle variation and the position variation the
relation information additionally includes statistical values like
the mean and the standard deviation of the relative angle and of
the relative position. This information is calculated for each
ordered object pair. In order to find an optimum search strategy
that minimizes the entire search effort (cf. section 3.6) we must
define a variation measure that quantifies the search effort Q;;
that must be expended to search part j if the pose of part i is
known. We define the search effort as
Q, zl: hA: (4)
where /; and A; is the length and the height of the smallest
enclosing rectangle, respectively, describing the position
variation of part j relative to part i, and Ag; specifies the
corresponding angle variation. Please note that ©) is not
symmetric, i.e., 2; is not necessarily equal to CX; Since we
cannot expect the example images to cover the variations
completely but only qualitatively, the values for l5, hi; and Agi;
can be adapted by applying a tolerance.
Our strategy is to search a selected root part within the entire
search range and then successively search the remaining parts
only relatively to the parts already found. To do so, the search
region of the part's reference point is described by the enclosing
rectangle transformed to the pose from which the part is
searched. Since the computation time of most recognition
methods increases linearly with O we have to minimize the sum
of the Os that are accumulated during the search to find an
optimum search strategy.
3.6 Selection of the Optimum Search Strategy
Based on the search effort Q for all object parts we are able to
compute the optimum search strategy that minimizes the overall
recognition time by applying graph theory to our problem. We
can interpret the object parts as vertices in a graph where the
directed arc between the vertices i and j is weighted with the
corresponding search effort Q;. Thus, we get a fully connected
directed graph D=(V,A), where V denotes the set of vertices of
size |V|=n, and À the set of arcs of size |A|= n,(n, -1). With each
arc a € À the weight Q; is associated. An arborescence of D is
a subtree of D such that there is a particular vertex called the
root, which is not the terminal vertex of any arc, and that for
any other vertex v; , there is exactly one arc, whose terminal
vertex is v. A spanning arborescence of D is an arborescence
that contains all vertices of D. Thus, the problem of finding an
optimum search strategy is equivalent to finding a spanning
arborescence H=(V,B) of D, such that
M — min (5)
vv; eV
An algorithm for finding the spanning arborescence of
minimum weight in a graph is defined in (Chu and Liu, 1965).
The root vertex can be chosen using different criteria. Since the
root vertex corresponds to the only object part that is searched
for not relatively to another object part, the recognition time of
the online phase strongly depends on the recognition time of the
root part. Therefore, when using the recognition method pre-
sented in (Steger, 2001) large object parts should be preferred to
be the root part since more pyramid levels can be used to speed
up the search. Furthermore, the root part should not be self-
symmetric to avoid ambiguities during the online phase, which
complicate the search. The root part plays another decisive role:
it should be ensured that the root part is always found during the
search in the online phase, since the whole object cannot be
found if the root part is missing or occluded to a high degree. In
practice, these criteria must be balanced.
Figure 4 illustrates the result of the optimum search strategy.
Here, the upper part of the body was selected to be the root part.
Thus, the upper body is searched for in the entire image, the left
arm is searched relatively to the upper body taking the relations
into account (cf. section 3.5), the left hand is searched relatively
to the left arm, etc.
Finally, the hierarchical model consists of the final models of
the rigid object parts (cf. section 3.4), the relations between the
parts (cf. section 3.5), and the optimum search strategy.
4. EXAMPLES
Two short examples are presented to show the high potential of
our novel approach. In the first example we applied it to the
tampon print discussed in section 1. Beside the model image
shown in Figure 1 and the ROI enclosing the complete print on
the clip, we took 20 example images of different pen clips to
decompose the object into its parts, calculate the relations and
derive the search strategy. As initial components the light gray
letter and the four dark gray letters were found. After the clus-
tering step the four dark gray letters were merged to one rigid
part. Therefore, our final hierarchical model combined two rigid
object parts, where the merged part was selected as root part.
The variation of the second part relative to the root part was
—103—