higher accuracy is desirable. This can be achieved through a
least-squares adjustment of the pose parameters. To obtain a
better accuracy than the extrapolation provides, it is necessary to extract
the model points as well as the feature points in the image with
subpixel accuracy. If this were not done, the image and
model points would be separated radially by about 0.25 pixels
on average if each model point is matched to its closest image
point. However, even if the points are extracted with subpixel
accuracy, an algorithm that performs a least-squares adjustment
based on closest point distances would not improve the accuracy
much, since the points would still have an average distance significantly
larger than 0 tangentially because the model and image
points are not necessarily sampled at the same points and dis-
tances. Because of this, the proposed algorithm finds the closest
image point for each model point and then minimizes the sum of
the squared distances of the image points to a line defined by their
corresponding model point and the corresponding tangent to the
model point, i.e., the directions of the model points are taken to be
correct and are assumed to describe the direction of the object's
border. If, for example, an edge detector is used, the direction
vectors of the model are perpendicular to the object boundary,
and hence the equation of a line through a model point tangent to
the object boundary is given by $t_i(x - x_i) + u_i(y - y_i) = 0$. Let
$q_i = (v_i, w_i)^T$ denote the matched image points corresponding
to the model points $p_i$. Then, the following function is minimized
to refine the pose $a$:

$$ d(a) = \sum_{i=1}^{n} \bigl[\, t_i\bigl(v_i(a) - x_i\bigr) + u_i\bigl(w_i(a) - y_i\bigr) \,\bigr]^2 \;\rightarrow\; \min_a \qquad (6) $$
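As noted below, when similarity transformations are considered, the residuals in (6) are linear in the pose parameters, so the minimum can be found with a single linear least-squares solve. The following sketch illustrates this step; the parameterization (a, b, tx, ty) with a = s cos θ and b = s sin θ applied to the image points, as well as all function and variable names, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def refine_similarity_pose(model_pts, model_dirs, image_pts):
    """Single linear least-squares step for the tangent-distance function (6).

    model_pts  : (n, 2) model points (x_i, y_i)
    model_dirs : (n, 2) unit direction vectors (t_i, u_i), perpendicular
                 to the object boundary
    image_pts  : (n, 2) matched image points (v_i, w_i)

    Returns (a, b, tx, ty) of the similarity transform applied to the image
    points, v_i(a) = a*v_i - b*w_i + tx and w_i(a) = b*v_i + a*w_i + ty,
    with a = s*cos(theta) and b = s*sin(theta).  This parameterization is
    only one possible choice, assumed for this sketch.
    """
    x, y = model_pts[:, 0], model_pts[:, 1]
    t, u = model_dirs[:, 0], model_dirs[:, 1]
    v, w = image_pts[:, 0], image_pts[:, 1]

    # The residual t_i*(v_i(a) - x_i) + u_i*(w_i(a) - y_i) is linear in
    # (a, b, tx, ty); stack one row per correspondence and solve.
    A = np.column_stack([t * v + u * w,   # coefficient of a
                         u * v - t * w,   # coefficient of b
                         t,               # coefficient of tx
                         u])              # coefficient of ty
    rhs = t * x + u * y
    params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return params
```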
The potential corresponding image points in the search image are
obtained by a non-maximum suppression only and are extrapo-
lated to subpixel accuracy (Steger, 2000). In this way, a segmentation
of the search image is avoided, which is important to preserve
the invariance against arbitrary illumination changes. For each
model point, the corresponding image point in the search image
is chosen as the potential image point with the smallest Euclidean
distance using the pose obtained by the extrapolation to transform
the model to the search image. Because the points in the search
image are not segmented, spurious image points may be brought
into correspondence with model points. Therefore, to make the
adjustment robust, only correspondences with a distance smaller
than a robustly computed standard deviation of the distances are
used for the adjustment. Since (6) results in a linear equation sys-
tem when similarity transformations are considered, one iteration
suffices to find the minimum distance. However, since the point
correspondences may change due to the refined pose, an even higher
accuracy can be gained by iterating the correspondence search
and pose refinement. Typically, after three iterations the accuracy
of the pose no longer improves.
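A compact sketch of this iteration is given below, assuming the linear solve refine_similarity_pose() from the sketch after (6) is available. The use of a k-d tree for the closest-point search, the MAD-based estimate of the robust standard deviation, and the representation of the pose as a 2x2 matrix S and translation T are assumptions made for illustration; the text above only specifies the closest-point correspondences, the robust distance threshold, and the roughly three iterations.

```python
import numpy as np
from scipy.spatial import cKDTree  # nearest-neighbor search; any method would do

def refine_pose(model_pts, model_dirs, candidate_pts, S0, T0, n_iter=3):
    """Iterate correspondence search and pose refinement (illustrative sketch).

    S0, T0        : initial similarity transform (2x2 matrix, translation)
                    mapping the model into the search image (from the
                    extrapolation)
    candidate_pts : (m, 2) subpixel-accurate candidate edge points of the
                    search image
    """
    S, T = S0, T0
    tree = cKDTree(candidate_pts)
    for _ in range(n_iter):
        # Transform the model into the search image with the current pose and
        # find the closest candidate image point for every model point.
        proj = model_pts @ S.T + T
        dist, idx = tree.query(proj)
        # Robust rejection: keep only correspondences whose distance is
        # smaller than a robustly computed standard deviation of the
        # distances (estimated here via 1.4826 * MAD, one possible choice).
        sigma = 1.4826 * np.median(np.abs(dist - np.median(dist)))
        keep = dist < max(sigma, 1e-9)
        # Linear solve for the transform that maps the kept image points
        # into the model frame, then invert it to update the forward pose.
        a, b, tx, ty = refine_similarity_pose(model_pts[keep], model_dirs[keep],
                                              candidate_pts[idx[keep]])
        S_inv = np.array([[a, -b], [b, a]])
        S = np.linalg.inv(S_inv)
        T = -S @ np.array([tx, ty])
    return S, T
```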
5 EXAMPLE
Figure 1 displays an example of recognizing multiple objects at
different scales and rotations. The model image is shown in Fig-
ure 1(a), while Figure 1(b) shows that all three instances of the
model have been recognized correctly despite the fact that two of
them are occluded, that one of them is printed with the contrast
reversed, and that two of the models were printed with slightly
different shapes. The time to recognize the models was 103 ms
on an 800 MHz Pentium III running under Linux.
6 PERFORMANCE EVALUATION
To assess the performance of the proposed object recognition sys-
tem, two different criteria were used: the recognition rate and the
subpixel accuracy of the results.
Figure 1: Example of recognizing multiple objects ((a) model image, (b) found objects). Note that the model is found despite global contrast reversals and despite the fact that two of the models were printed with slightly different shapes.
To test the recognition rate, 500 images of an IC were taken. The
IC was occluded to various degrees with various objects, so that
in addition to occlusion, clutter of various degrees was created in
the image. Figure 2 shows six of the 500 images that were used
to test the recognition rate. The model was generated from the
print on the IC in the top left image of Figure 2. On the lowest
pyramid level it contained 2127 edge points.
An effort was made to keep the IC in exactly the same position
in the image in order to be able to measure the degree of occlu-
sion. Unfortunately, the IC moved very slightly (by less than one
pixel) during the acquisition of the images. The true amount of
occlusion was determined by extracting edges from the images
and intersecting the edge regions with the edges that constitute
the model. Since the objects that occlude the IC generate clutter
edges, this actually underestimates the occlusion.
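A minimal sketch of such a visibility estimate is given below; the edge extraction itself is not shown, and the pixel tolerance obtained by dilating the edge region, like all names in the sketch, is an assumption, since the text only states that the edge regions are intersected with the model edges.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def estimate_visibility(image_edges, model_pts, tol=1):
    """Fraction of model edge points covered by edges extracted from the image.

    image_edges : boolean edge map extracted from the test image
    model_pts   : (n, 2) model edge points (col, row), already placed at the
                  known position of the IC in the image
    tol         : tolerance in pixels for small localization differences
                  (an assumed value)
    """
    # Tolerate small localization differences by dilating the edge region.
    region = binary_dilation(image_edges, iterations=tol)
    rows = np.round(model_pts[:, 1]).astype(int)
    cols = np.round(model_pts[:, 0]).astype(int)
    visibility = region[rows, cols].mean()
    return visibility  # estimated occlusion is 1 - visibility
```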
The model was searched for in the 500 images with smin = 0.3,
i.e., the method should find the object despite 70% occlusion.
Only the translation parameters were determined. The average
recognition time was 22 ms. The model was recognized in 478
images, i.e., the recognition rate was 95.6%. By visual inspec-
tion, it was determined that in 15 of the 22 misdetection cases the
IC was occluded by more than 70%. If these cases are removed,
the recognition rate rises to 98.6% (478 of 485 images). In the remaining seven cases,
the occlusion was close to 70%. Figure 3(a) displays a plot of
the extracted scores against the estimated visibility of the object.
The instances in which the model was not found are denoted by
a score of 0, i.e., they lie on the x axis of the plot. Figure 3(b)
shows the errors of the extracted positions when extrapolating the
pose as described in Section 3. It can be seen that the IC was acci-
dentally shifted twice. The position errors are all very close to the
three cluster centers. Some of the larger errors in the y coordinate
result from refraction effects caused by the transparent ruler that
was used in some images to occlude the IC (see the top right im-
age of Figure 2). Figures 3(c) and (d) display the position errors