Photogrammetric computer vision: Papers accepted on the basis of peer-review full manuscripts

Kalliany, R.; Leberl, Franz W.
ISPRS Commission III, Vol.34, Part 3A ,,Photogrammetric Computer Vision“, Graz, 2002 
  
  
Figure 2: ‘Invariant neighbourhoods’ that were extracted for the images
in fig. 1. Only regions for which a corresponding partner has been found
in the other image are shown, but the regions in the two images have
been extracted without knowledge about the other image.

irrespective of these changes and that are extracted independently,
every step in their construction ought to be invariant under both the
geometric and photometric transformations just described. A detailed
description of these construction methods is beyond the scope of this
paper, and the interested reader is referred to papers dedicated to the
subject (Tuytelaars 1999, Tuytelaars 2000). As mentioned before, these
constructions allow the computer to extract the regions in the two views
completely independently. After they have been constructed, they can be
matched efficiently on the basis of features that are extracted from the
colour patterns that they enclose. These features are again invariant
under both the geometric and the photometric transformations considered.
More precisely, a feature vector of moment invariants is used. Fig. 2
shows some of the regions that have been extracted for fig. 1. We refer
to the regions as ‘invariant neighbourhoods’. Recently, several
additional construction methods have been proposed by other researchers
(Baumberg 2000, Matas 2001).
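
The matching of independently extracted regions by their invariant
feature vectors can be illustrated with a minimal sketch. It assumes
that each region is already described by a fixed-length vector of
invariants, and it uses a mutual nearest-neighbour test with a Euclidean
distance and a threshold. The distance measure, the threshold max_dist,
the function name match_regions and the toy vectors are assumptions made
for illustration; the source does not specify them.

```python
import math


def match_regions(feats_a, feats_b, max_dist=0.5):
    """Match two lists of feature vectors by mutual nearest neighbour.

    feats_a, feats_b: lists of equal-length sequences of floats, one
    vector per region (hypothetical stand-ins for moment invariants).
    max_dist: assumed rejection threshold on the Euclidean distance.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    def nearest(query, candidates):
        # Linear scan for the closest candidate vector.
        best_idx, best_d = None, math.inf
        for idx, cand in enumerate(candidates):
            d = math.dist(query, cand)
            if d < best_d:
                best_idx, best_d = idx, d
        return best_idx, best_d

    matches = []
    for i, fa in enumerate(feats_a):
        j, d = nearest(fa, feats_b)
        if j is None or d > max_dist:
            continue  # no candidate, or the best candidate is too far away
        back, _ = nearest(feats_b[j], feats_a)
        if back == i:  # keep the pair only if the choice is mutual
            matches.append((i, j))
    return matches


# Toy data: the third region of each view has no counterpart in the other.
view1 = [(0.10, 0.90), (0.50, 0.50), (0.95, 0.05)]
view2 = [(0.52, 0.48), (0.12, 0.88), (3.00, 3.00)]
print(match_regions(view1, view2))
```

Because each view is processed on its own and only the vectors are
compared, the sketch mirrors the independence of the extraction step: no
information from the other image is needed until the comparison itself.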

Figure 3: Top row: views 1 and 2 of a bookshelf scene, with the 47
matched invariant neighbourhoods indicated. Bottom row: the 41 matched
invariant neighbourhoods for views 1 and 3 of the same scene.

Also under the wide-baseline version of shape-from-video, perhaps better
referred to as ‘shape-from-stills’, one is interested in finding
correspondences between more than two images. The previously described
wide-baseline stereo matching approach is well suited for producing many
feature matches between pairs of views that may be quite different. In
practice, however, it is far from certain that the corresponding feature
in another view will also be constructed by the system. Hence, the
probability of extracting all correspondences for a feature in all views
of an image set quickly decreases with the number of views. Moreover,
there is a chance of matching wrong features. For instance, let us
suppose we are given 3 views v1, v2 and v3. Although the method may find
matches between the view pair (1, 2) and also between the view pair
(1, 3), these two sets of matches will often differ substantially, and
only a small number of features common to all three views may result.
Figure 3 shows 3 views and the matches found between the pairs (1, 2)
and (1, 3). Fig. 4 shows the matches that these pairs have in common.
Whereas more than 40 matches were found for each of the pairs of fig. 3,
the number of matches between all three views has dropped sharply, to
only 16. When we consider 4 or 5 views, the situation can deteriorate
further, and only a few features, if any, may be put in correspondence
among all the views (even though there may be sufficient overlap between
all the views).
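
The drop in the number of matches shared by all views can be made
concrete with a small sketch. It assumes that each pairwise match set is
stored as a dictionary from a feature index in view 1 to the matched
feature index in the other view, and it keeps only the features of view
1 that appear in every set. The function name common_tracks, the data
layout and the toy indices are hypothetical; they are not the matches of
fig. 3.

```python
def common_tracks(pairwise_matches):
    """Intersect pairwise match sets that share view 1 as reference.

    pairwise_matches: list of dicts, one per view pair (1, k), each
    mapping a feature index in view 1 to its match in view k.
    Returns a dict mapping each feature of view 1 that is matched in
    every pair to the tuple of its matches, in the order of the pairs.
    """
    if not pairwise_matches:
        return {}
    shared = set(pairwise_matches[0])
    for matches in pairwise_matches[1:]:
        shared &= set(matches)  # keep features matched in this pair too
    return {f: tuple(m[f] for m in pairwise_matches) for f in sorted(shared)}


# Toy data: four matches for pair (1, 2), three for pair (1, 3).
matches_12 = {0: 5, 1: 7, 2: 9, 4: 11}
matches_13 = {1: 3, 2: 8, 3: 6}
print(common_tracks([matches_12, matches_13]))
```

Each additional view adds one more intersection, so the set of features
tracked through all views can only shrink or stay the same, which is the
deterioration described above for 4 or 5 views.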

Our most recent developments are devoted to counteracting this problem.
The approach is founded on two main ideas. Firstly, it is possible to
exploit the information supplied by a correct match in order to generate
many other correct matches. Suppose there is a feature A1 in view v1
which is matched to its corresponding feature A2 in view v2, and a
feature B1 in v1 which could not be matched to its corresponding feature
in v2 (e.g. the corresponding invariant neighbourhood B2 has not been
extracted, or it has been extracted but the matching failed). If B1 and
A1 are spatially close and lie on the same physical surface, then they
will probably be mapped to v2 by similar affine transformations. Hence,
we can project B2 in v2 via the affine