Photogrammetric computer vision: Papers accepted on the basis of peer-review full manuscripts

Kalliany, R.; Leberl, Franz W.

ISPRS Commission III, Vol.34, Part 3A „Photogrammetric Computer Vision“, Graz, 2002 
  
mine its pose. This problem is especially grave for large models. 
The required accuracy is usually not obtainable, even in low noise 
images, because the discretization of the image leads to edge di- 
rection errors that already are too large for the GHT. 
In all approaches above, the edge image is binarized. This makes 
the object recognition algorithm invariant only against a narrow 
range of illumination changes. If the image contrast is lowered, 
progressively fewer edge points will be segmented, which has 
the same effects as progressively larger occlusion. The similarity 
measures proposed in this paper overcome all of the above prob- 
lems and result in an object recognition strategy robust against 
occlusion, clutter, and nonlinear illumination changes. They can 
be extended to be robust to global as well as local contrast rever- 
sals. 
2 SIMILARITY MEASURES 
The model of an object consists of a set of points $p_i = (x_i, y_i)^T$
and associated direction vectors $d_i = (t_i, u_i)^T$, $i = 1, \ldots, n$.
The direction vectors can be generated by a number of different 
image processing operations, e.g., edge, line, or corner extraction, 
as discussed in Section 3. Typically, the model is generated from 
an image of the object, where an arbitrary region of interest (ROI) 
specifies that part of the image in which the object is located. It is 
advantageous to specify the coordinates $p_i$ relative to the center
of gravity of the ROI of the model or to the center of gravity of 
the points of the model. 
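
As a minimal sketch of the second option, the following Python snippet expresses model points relative to the center of gravity of the points themselves. The function name `center_model_points` and the sample coordinates are illustrative assumptions, not part of the original method description.

```python
def center_model_points(points):
    """Shift model points so that their center of gravity becomes the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]


# Hypothetical model with three points; their center of gravity is (12, 22).
centered = center_model_points([(10.0, 20.0), (14.0, 20.0), (12.0, 26.0)])
print(centered)
```

After centering, the translation part of a pose refers to the center of gravity of the model rather than to an arbitrary image corner.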
The image in which the model should be found can be transformed
into a representation in which a direction vector
$e_{x,y} = (v_{x,y}, w_{x,y})^T$ is obtained for each image point $(x, y)$. In the
matching process, a transformed model must be compared to the 
image at a particular location. In the most general case considered 
here, the transformation is an arbitrary affine transformation. It is 
useful to separate the translation part of the affine transformation 
from the linear part. Therefore, a linearly transformed model is 
given by the points $p'_i = A p_i$ and the accordingly transformed
direction vectors $d'_i = A d_i$, where $A$ is the $2 \times 2$ matrix of the
linear part of the affine transformation.
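
The linear part of the transformation can be illustrated with a short Python sketch that applies one 2x2 matrix to the model points and to the direction vectors alike, as stated in the text. The function name `apply_linear` and the 90 degree rotation used as an example are assumptions made for illustration only.

```python
def apply_linear(A, vectors):
    """Apply the 2x2 matrix A (given as two rows) to each 2D vector."""
    (a11, a12), (a21, a22) = A
    return [(a11 * x + a12 * y, a21 * x + a22 * y) for x, y in vectors]


# Hypothetical example: rotation by 90 degrees counterclockwise.
A = ((0, -1), (1, 0))
model_points = [(1, 0), (0, 2)]
model_dirs = [(1, 0), (0, 1)]

transformed_points = apply_linear(A, model_points)  # p'_i = A p_i
transformed_dirs = apply_linear(A, model_dirs)      # d'_i = A d_i
print(transformed_points, transformed_dirs)
```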
As discussed above, the similarity measure by which the trans- 
formed model is compared to the image must be robust to occlu- 
sions, clutter, and illumination changes. One such measure is to 
sum the (unnormalized) dot product of the direction vectors of 
the transformed model and the image over all points of the model 
to compute a matching score at a particular point $q = (x, y)^T$ of
the image, i.e., the similarity measure of the transformed model 
at the point q, which corresponds to the translation part of the 
affine transformation, is computed as follows: 
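$$s = \frac{1}{n} \sum_{i=1}^{n} \langle d'_i, e_{q+p'_i} \rangle = \frac{1}{n} \sum_{i=1}^{n} \left( t'_i v_{x+x'_i,\,y+y'_i} + u'_i w_{x+x'_i,\,y+y'_i} \right) \qquad (1)$$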
If the model is generated by edge or line filtering, and the im- 
age is preprocessed in the same manner, this similarity measure 
fulfills the requirements of robustness to occlusion and clutter. If 
parts of the object are missing in the image, there are no lines 
or edges at the corresponding positions of the model in the im- 
age, i.e., the direction vectors will have a small length and hence 
contribute little to the sum. Likewise, if there are clutter lines or 
edges in the image, there will either be no point in the model at 
the clutter position or it will have a small length, which means it 
will contribute little to the sum. 
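
The behavior of this unnormalized measure under occlusion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the image direction vectors are stored in a dictionary keyed by integer pixel coordinates, positions without an entry are treated as zero vectors, and the transformed model points are taken to be integer offsets. The function name `score_unnormalized` and all sample values are hypothetical.

```python
def score_unnormalized(model_points, model_dirs, image_dirs, q):
    """Mean dot product of model and image direction vectors at position q."""
    qx, qy = q
    total = 0.0
    for (px, py), (t, u) in zip(model_points, model_dirs):
        # Assumption: a missing entry means no edge response (zero vector).
        v, w = image_dirs.get((qx + px, qy + py), (0.0, 0.0))
        total += t * v + u * w
    return total / len(model_points)


model_points = [(0, 0), (1, 0)]
model_dirs = [(1.0, 0.0), (0.0, 2.0)]

# Complete object at q = (5, 5), and the same object with one point occluded.
image_full = {(5, 5): (3.0, 0.0), (6, 5): (0.0, 1.0)}
image_occluded = {(5, 5): (3.0, 0.0)}

score_full = score_unnormalized(model_points, model_dirs, image_full, (5, 5))
score_occluded = score_unnormalized(model_points, model_dirs, image_occluded, (5, 5))
print(score_full, score_occluded)
```

The occluded point simply drops out of the sum, so the score decreases but the remaining points still support the match.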
The similarity measure (1) is not truly invariant against illumi- 
nation changes, however, since usually the length of the direc- 
tion vectors depends on the brightness of the image, e.g., if edge 
detection is used to extract the direction vectors. However, if a 
user specifies a threshold on the similarity measure to determine 
whether the model is present in the image, a similarity measure 
with a well defined range of values is desirable. The following 
similarity measure achieves this goal: 
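$$s = \frac{1}{n} \sum_{i=1}^{n} \frac{\langle d'_i, e_{q+p'_i} \rangle}{\|d'_i\| \cdot \|e_{q+p'_i}\|} = \frac{1}{n} \sum_{i=1}^{n} \frac{t'_i v_{x+x'_i,\,y+y'_i} + u'_i w_{x+x'_i,\,y+y'_i}}{\sqrt{t'^2_i + u'^2_i} \cdot \sqrt{v^2_{x+x'_i,\,y+y'_i} + w^2_{x+x'_i,\,y+y'_i}}} \qquad (2)$$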
Because of the normalization of the direction vectors, this sim- 
ilarity measure is additionally invariant to arbitrary illumination 
changes since all vectors are scaled to a length of 1. What makes 
this measure robust against occlusion and clutter is the fact that 
if a feature is missing, either in the model or in the image, noise 
will lead to random direction vectors, which, on average, will 
contribute nothing to the sum. 
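
The invariance to contrast scaling can be checked with a small Python sketch of the normalized measure. As assumptions for illustration, the image direction vectors are stored in a dictionary keyed by integer pixel coordinates, and a point whose model or image vector has zero length contributes nothing to the sum. The function name `score_normalized` and the sample values are hypothetical.

```python
import math


def score_normalized(model_points, model_dirs, image_dirs, q):
    """Mean normalized dot product of model and image direction vectors."""
    qx, qy = q
    total = 0.0
    for (px, py), (t, u) in zip(model_points, model_dirs):
        v, w = image_dirs.get((qx + px, qy + py), (0.0, 0.0))
        norm = math.hypot(t, u) * math.hypot(v, w)
        # Assumption: zero-length vectors are skipped to avoid dividing by zero.
        if norm > 0.0:
            total += (t * v + u * w) / norm
    return total / len(model_points)


model_points = [(0, 0), (1, 0)]
model_dirs = [(1.0, 0.0), (0.0, 2.0)]

image = {(5, 5): (3.0, 0.0), (6, 5): (0.0, 1.0)}
# Same scene with the contrast lowered by a factor of ten.
image_dim = {pos: (0.1 * v, 0.1 * w) for pos, (v, w) in image.items()}

score = score_normalized(model_points, model_dirs, image, (5, 5))
score_dim = score_normalized(model_points, model_dirs, image_dim, (5, 5))
print(score, score_dim)
```

Both calls give the same score up to rounding, because only the directions of the vectors enter the sum.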
The similarity measure (2) will return a high score if all the di- 
rection vectors of the model and the image align, i.e., point in the 
same direction. If edges are used to generate the model and im- 
age vectors, this means that the model and image must have the 
same contrast direction for each edge. Sometimes it is desirable 
to be able to detect the object even if its contrast is reversed. This 
is achieved by: 
  
  
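$$s = \left| \frac{1}{n} \sum_{i=1}^{n} \frac{\langle d'_i, e_{q+p'_i} \rangle}{\|d'_i\| \cdot \|e_{q+p'_i}\|} \right| \qquad (3)$$
In rare circumstances, it might be necessary to ignore even local
contrast changes. In this case, the similarity measure can be
modified as follows:
$$s = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| \langle d'_i, e_{q+p'_i} \rangle \right|}{\|d'_i\| \cdot \|e_{q+p'_i}\|} \qquad (4)$$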
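
The difference between ignoring global and local contrast reversals comes down to where the absolute value is taken: around the whole sum, or around each term. The following Python sketch illustrates this under the assumption that the model and image direction vectors have already been paired up at corresponding positions. All function names and sample vectors are hypothetical.

```python
import math


def normalized_terms(model_dirs, image_dirs):
    """Per-point normalized dot products; zero-length vectors contribute 0."""
    terms = []
    for (t, u), (v, w) in zip(model_dirs, image_dirs):
        norm = math.hypot(t, u) * math.hypot(v, w)
        terms.append((t * v + u * w) / norm if norm > 0.0 else 0.0)
    return terms


def score_plain(terms):
    """Mean of the terms: sensitive to the contrast direction."""
    return sum(terms) / len(terms)


def score_global_reversal(terms):
    """Absolute value of the mean: ignores a global contrast reversal."""
    return abs(sum(terms) / len(terms))


def score_local_reversal(terms):
    """Mean of absolute values: ignores contrast reversals at each point."""
    return sum(abs(x) for x in terms) / len(terms)


model_dirs = [(1.0, 0.0), (0.0, 1.0)]
all_reversed = normalized_terms(model_dirs, [(-1.0, 0.0), (0.0, -1.0)])
one_reversed = normalized_terms(model_dirs, [(1.0, 0.0), (0.0, -1.0)])

print(score_plain(all_reversed), score_global_reversal(all_reversed))
print(score_global_reversal(one_reversed), score_local_reversal(one_reversed))
```

With every vector reversed, only the plain mean fails to give a full score. With a single vector reversed, the terms cancel in the global variant and only the per-term absolute value still gives a full score.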
The above three normalized similarity measures are robust to oc- 
clusion in the sense that the object will be found if it is occluded. 
As mentioned above, this results from the fact that the missing 
object points in the instance of the model in the image will on av- 
erage contribute nothing to the sum. For any particular instance 
of the model in the image, this may not be true, e.g., because the 
noise in the image is not uncorrelated. This leads to the unde- 
sired fact that the instance of the model will be found in different 
poses in different images, even if the model does not move in 
the images, because in a particular image of the model the ran- 
dom direction vectors will contribute slightly different amounts to 
the sum, and hence the maximum of the similarity measure will 
change randomly. To make the localization of the model more 
precise, it is useful to set the contribution of direction vectors 
caused by noise in the image to zero. The easiest way to do this 
is to set all inverse lengths $1/\|e_{q+p'_i}\|$ of the direction vectors in
the image to 0 if their length $\|e_{q+p'_i}\|$ is smaller than a threshold
that depends on the noise level in the image and the preprocess- 
ing operation that is used to extract the direction vectors in the 
image. This threshold can be specified easily by the user. By this 
modification of the similarity measure, it can be ensured that an 
occluded instance of the model will always be found in the same 
pose if it does not move in the images. 
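
The effect of the threshold can be sketched in Python as follows. This is a minimal illustration, assuming that the model and image direction vectors are already paired up at corresponding positions. The function name `score_with_noise_threshold`, the parameter `min_length`, and the sample values are hypothetical.

```python
import math


def score_with_noise_threshold(model_dirs, image_dirs, min_length):
    """Normalized score in which short image vectors contribute nothing."""
    total = 0.0
    for (t, u), (v, w) in zip(model_dirs, image_dirs):
        d_len = math.hypot(t, u)
        e_len = math.hypot(v, w)
        # Image vectors shorter than min_length are treated as noise, which
        # is equivalent to setting their inverse length to 0.
        if d_len > 0.0 and e_len > 0.0 and e_len >= min_length:
            total += (t * v + u * w) / (d_len * e_len)
    return total / len(model_dirs)


model_dirs = [(1.0, 0.0), (0.0, 1.0)]
# The first point is visible; the second is occluded and holds only weak noise.
image_dirs = [(5.0, 0.0), (0.01, -0.02)]

score_thresholded = score_with_noise_threshold(model_dirs, image_dirs, 1.0)
score_unthresholded = score_with_noise_threshold(model_dirs, image_dirs, 0.0)
print(score_thresholded, score_unthresholded)
```

With the threshold, the occluded point contributes exactly zero, so the score no longer depends on the random direction of the noise vector.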
The normalized similarity measures (2)-(4) have the property
that they return a number no greater than 1 as the score of a
potential match. In all cases, a score of 1 indicates a perfect match
between the model and the image. Furthermore, the score roughly