Figure 6. The model edge is disturbed by a neighboring edge in scale-space.

Figure 7. By extrapolating the gray values of the model the disturbing edge is eliminated.
After the disturbing edges have been eliminated, a model is
built for each component and used to search for the component
in each example image using the recognition approach. Thus, we
obtain all poses $P_i$ (comprising the parameters position and
orientation) of each component $i$ in each example image.
Another problem arises when searching for small components:
The result of the search may not be unique because of self-
symmetries of the components or mutual similarities between
the components. In our example (Figure 5) the left leg, for
instance, is found four times in the first example image (Figure
3): at the true position of the left leg and at the position of the
right leg, each at orientations 0° and 180°. Consequently, it is
indispensable to resolve these ambiguities to get the most likely
pose for each component. Let $n$ be the number of components
and $M_i$ the pose of component $i$ in the model image ($i = 1, \dots, n$).
The pose represented by match $k$ of component $i$ in an example
image is described by $E_i^k$, where $k = 1, \dots, n_i$ and $n_i$ is the
number of matches (found instances) of component $i$ in the
example image. We resolve the ambiguities by minimizing the
following equation:
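\[
\sum_{i=1}^{n} \operatorname*{arg\,min}_{k=1,\dots,n_i} \sum_{j=1}^{n} \operatorname*{arg\,min}_{l=1,\dots,n_j} \Psi\left(E_i^k, E_j^l, M_i, M_j\right) \rightarrow \min \qquad (1)
\]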
Here, $\Psi$ is a cost function that rates the relative pose of match $l$
of component $j$ to match $k$ of component $i$ in the example image
by comparing it to the relative pose of the two components in
the model image. The more the current relative pose in the
example image differs from the relative pose in the model
image, the higher the cost value. In our current implementation
$\Psi$ takes the difference in position and orientation into account.
This follows the principle of human perception, where the
correspondence problem of apparent motion is solved by
minimizing the overall variation (Ullman, 1979).
The consequence of this step is that each component is assigned
at most one pose in each example image.
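The selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the concrete form of the cost function (position difference plus a weighted orientation difference), the weight `w_angle`, the pose layout `(x, y, phi)` and all function names are assumptions made for illustration. For every component the sketch keeps the match whose summed cost against the best-fitting matches of all other components is lowest, which is one way to read Equation (1).

```python
import math


def angle_diff(a, b):
    """Difference a - b wrapped to the interval [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def relative_pose(p, q):
    """Pose q expressed in the frame of pose p; poses are (x, y, phi)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    c, s = math.cos(p[2]), math.sin(p[2])
    return (c * dx + s * dy, -s * dx + c * dy, angle_diff(q[2], p[2]))


def psi(e_i, e_j, m_i, m_j, w_angle=10.0):
    """Hypothetical cost: deviation of the relative pose in the example
    image from the relative pose in the model image."""
    re = relative_pose(e_i, e_j)
    rm = relative_pose(m_i, m_j)
    d_pos = math.hypot(re[0] - rm[0], re[1] - rm[1])
    d_phi = abs(angle_diff(re[2], rm[2]))
    return d_pos + w_angle * d_phi


def resolve_ambiguities(model_poses, matches):
    """Return, per component, the index of the selected match (or None).

    matches[i] is the list of candidate poses found for component i.
    """
    selected = []
    for i, cand_i in enumerate(matches):
        best_k, best_cost = None, math.inf
        for k, e_ik in enumerate(cand_i):
            cost = 0.0
            for j, cand_j in enumerate(matches):
                if j == i or not cand_j:
                    continue
                # Best-fitting match l of component j for this match k.
                cost += min(psi(e_ik, e_jl, model_poses[i], model_poses[j])
                            for e_jl in cand_j)
            if cost < best_cost:
                best_k, best_cost = k, cost
        selected.append(best_k)
    return selected
```

With a body and two legs that are each found at both leg positions and at both orientations, the sketch picks the true leg positions because only those reproduce the relative poses of the model image. A component without any match keeps the value `None`, so each component receives at most one pose.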
3.3 Clustering of Components
Since the initial decomposition led to an over-segmentation, we
now have to merge the components belonging to the same rigid
object part to larger clusters by analyzing the pose parameters.
Components that show similar apparent movement over all
example images are clustered together.
We first calculate the pairwise probability of two components
belonging to the same rigid object part. Let
$M_i = (x_i^M, y_i^M, \varphi_i^M)$, $M_j = (x_j^M, y_j^M, \varphi_j^M)$,
$E_i = (x_i^E, y_i^E, \varphi_i^E)$, and $E_j = (x_j^E, y_j^E, \varphi_j^E)$ be
the poses of two components in the model image and in an
example image. Without loss of generality $\varphi_i^M$ and $\varphi_j^M$ are set
to 0, since the orientations in the model image are taken as
reference. The relative position of the two components in the
model image is expressed by $\Delta x^M = x_j^M - x_i^M$ and $\Delta y^M = y_j^M - y_i^M$.
The same holds for the relative position $\Delta x^E$ and $\Delta y^E$ in the
example image. To compare the relative position in the model
and in the example image, we have to rotate the relative position
in the example image back to the reference orientation:
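\[
\begin{pmatrix} \Delta\tilde{x}^E \\ \Delta\tilde{y}^E \end{pmatrix}
=
\begin{pmatrix} \cos\varphi_i^E & \sin\varphi_i^E \\ -\sin\varphi_i^E & \cos\varphi_i^E \end{pmatrix}
\begin{pmatrix} \Delta x^E \\ \Delta y^E \end{pmatrix}
\qquad (2)
\]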
If the used recognition method additionally returns accuracy
information of the pose parameters, the accuracy of the relative
position is calculated with the law of error propagation.
Otherwise the accuracy must be specified empirically. Then, the
following hypothesis can be stated:
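\[
\begin{aligned}
\Delta\tilde{x}^E &= \Delta x^M \\
\Delta\tilde{y}^E &= \Delta y^M \\
\varphi_i^E &= \varphi_j^E
\end{aligned}
\qquad (3)
\]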
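As an illustration of Equations (2) and (3), the following minimal sketch rotates the relative position measured in an example image back to the reference orientation and returns the three residuals that the hypothesis expects to be zero. The pose layout `(x, y, phi)`, the sign convention of the relative position (component j minus component i) and the function names are assumptions made for this sketch.

```python
import math


def rotated_relative_position(e_i, e_j):
    """Relative position of component j with respect to component i in the
    example image, rotated back by the orientation of component i."""
    dx, dy = e_j[0] - e_i[0], e_j[1] - e_i[1]
    c, s = math.cos(e_i[2]), math.sin(e_i[2])
    return (c * dx + s * dy, -s * dx + c * dy)


def hypothesis_residuals(m_i, m_j, e_i, e_j):
    """Residuals of the rigidity hypothesis; all three are close to zero if
    both components moved as one rigid part (model orientations are 0)."""
    dx_m, dy_m = m_j[0] - m_i[0], m_j[1] - m_i[1]
    dx_e, dy_e = rotated_relative_position(e_i, e_j)
    # Orientation difference wrapped to [-pi, pi).
    d_phi = (e_i[2] - e_j[2] + math.pi) % (2.0 * math.pi) - math.pi
    return (dx_e - dx_m, dy_e - dy_m, d_phi)
```

For two components that are rotated and translated together, the residuals vanish up to rounding; a component that moves on its own produces nonzero residuals, which the subsequent hypothesis test turns into a low probability.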
The probability of the correctness of this hypothesis
corresponds to the probability that both components belong to
the same rigid object part. It can be calculated using the
equations for hypothesis tests as, e.g., given in (Koch, 1987).
This is done for all component pairs and for all example images,
yielding a symmetric similarity matrix in which the entry at row
$i$ and column $j$ stores the probability that the components $i$
and $j$ belong together. The entries in the matrix correspond to the
minimum value of the probabilities in all example images. To
obtain higher robustness to mismatches, the mean or other
statistical values can be used instead of the minimum value. In
Figure 8 the similarity matrix for the example of Figure 3 is displayed.
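The aggregation over the example images can be sketched as follows. The sketch assumes that the pairwise probabilities have already been computed per image and are given as one square matrix per example image; the function name, the `reduce` parameter and the diagonal value of 1.0 are assumptions made for illustration.

```python
from statistics import mean


def similarity_matrix(per_image_probs, reduce=min):
    """Combine per-image pairwise probabilities into one symmetric matrix.

    per_image_probs[t][i][j] is the probability, computed from example
    image t, that components i and j belong to the same rigid part.
    reduce=min follows the text; reduce=mean is the more robust variant.
    """
    n = len(per_image_probs[0])
    sim = [[1.0] * n for _ in range(n)]  # diagonal of 1.0 is an assumption
    for i in range(n):
        for j in range(i + 1, n):
            values = [probs[i][j] for probs in per_image_probs]
            sim[i][j] = sim[j][i] = reduce(values)
    return sim
```

Using the minimum means that a single image in which two components clearly move against each other is enough to separate them, whereas `reduce=mean` tolerates an occasional mismatch.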
One can see the high probability that hat and face belong
together and that the components [...] part.

Figure 8. The similarity matrix contains the probabilities that two components belong to the same rigid object part. The higher the probability the brighter the entry.

Based on this similarity matrix the initial components are
clustered using a pairwise clustering strategy that successively
merges the two entities with the highest similarity until the
maximum of the remaining similarities is smaller than a
predefined threshold.
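A minimal sketch of such a pairwise clustering is given below. The text does not state how the similarity between already merged clusters is obtained; the sketch assumes the minimum over all member pairs, and the function name and signature are likewise illustrative.

```python
def cluster_components(sim, threshold):
    """Merge the two most similar entities until the highest remaining
    similarity falls below the threshold.

    sim is a symmetric matrix of pairwise probabilities. The similarity of
    two clusters is taken as the minimum over all member pairs, which is an
    assumption of this sketch.
    """
    clusters = [[i] for i in range(len(sim))]

    def linkage(a, b):
        return min(sim[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                value = linkage(clusters[a], clusters[b])
                if value > best:
                    best, pair = value, (a, b)
        if best < threshold:
            break  # no remaining pair is similar enough
        a, b = pair
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters
```

For example, with four components where only the pairs (0, 1) and (2, 3) have a high probability, a threshold of 0.5 yields the two clusters `[0, 1]` and `[2, 3]`.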
3.4 Final Model Generation and Search
Models for the recognition approach for the newly clustered
components are created and searched for in all example images
as described in section 3.2. This is necessary if we want to avoid
errors that are introduced when taking the average of the single
initial poses of each component within the cluster as pose for
the newly clustered component. However, we can exploit this
information to reduce the search space by calculating
approximate values for the reference point and the orientation
angle of the new component in the example images. After this
step for each rigid object part a model is available and the pose
parameters for each object part in each image are computed.
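One conceivable way to derive such approximate values is sketched below; the text does not specify the computation, so the whole procedure is an assumption. Each initial component of a cluster predicts where the reference point of the new, merged model would lie in the example image, and the predictions are averaged; the orientation is taken as the circular mean of the component orientations, since the model orientations are 0. The result would only restrict the search space and would not replace the new search.

```python
import math


def predict_cluster_pose(ref_model, model_poses, example_poses):
    """Approximate pose (x, y, phi) of a clustered component in one image.

    ref_model     -- (x, y) reference point of the new model (hypothetical)
    model_poses   -- poses (x, y, 0) of the cluster members in the model image
    example_poses -- poses (x, y, phi) of the same members in the example image
    """
    xs, ys = [], []
    sin_sum = cos_sum = 0.0
    for (xm, ym, _), (xe, ye, phi) in zip(model_poses, example_poses):
        # Offset from the member to the new reference point in the model.
        dx, dy = ref_model[0] - xm, ref_model[1] - ym
        c, s = math.cos(phi), math.sin(phi)
        # Rotate the offset by the member's orientation (inverse of Eq. 2).
        xs.append(xe + c * dx - s * dy)
        ys.append(ye + s * dx + c * dy)
        sin_sum += s
        cos_sum += c
    n = len(xs)
    return (sum(xs) / n, sum(ys) / n, math.atan2(sin_sum, cos_sum))
```

The circular mean avoids the wrap-around problem of averaging angles directly, for instance for orientations near 180° and -180°.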