CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation
Figure 13: Evolution of the precision against the number
of images retrieved.
strategy, but it is still under 5% for most of the retrieval.
Overall, all methods failed to find the relevant scene in
the database.
6 CONCLUSION
In this paper, we have reviewed the use of keypoint-based
voting strategies for image matching in the context of the
iTowns project. We have tested different strategies (pair-wise
comparison, k-NN search with brute voting, angle-difference
refinement, and 2D affine transform estimation)
on two subsets of an urban scene database.
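As an illustration only, the k-NN search with brute voting listed above could be sketched as follows. This is not the implementation evaluated in the paper: the function names, the `(image_id, descriptor)` layout of the database and the squared Euclidean distance are assumptions made for the example, and the linear scan stands in for whatever index is actually used.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_vote(query_descriptors, database, k=2):
    """Rank database images by the number of k-NN votes they receive.

    database: list of (image_id, descriptor) pairs (hypothetical layout).
    Every query descriptor votes once for the image owning each of its
    k nearest database descriptors.
    """
    votes = Counter()
    for q in query_descriptors:
        # Linear scan: sort all database descriptors by distance to q.
        nearest = sorted(database, key=lambda e: squared_distance(q, e[1]))[:k]
        for image_id, _ in nearest:
            votes[image_id] += 1
    return votes.most_common()


# Toy usage with 2D "descriptors" (real SIFT descriptors have 128 dimensions).
database = [("a", (0, 0)), ("a", (1, 0)), ("b", (10, 10)), ("b", (11, 10))]
query = [(0.1, 0), (0.9, 0.1), (10.2, 10)]
print(knn_vote(query, database, k=1))  # [('a', 2), ('b', 1)]
```

The linear scan costs one distance computation per database descriptor and per query descriptor, which is the cost an approximate k-NN index is meant to avoid.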
We have first found that there is no penalty in using an
approximate k-NN search, which greatly improves
the retrieval speed. Even for small datasets like the first
one we used, a pair-wise comparison or a linear k-NN search is
not feasible for an interactive application.
The second point we have found is that the post-processing
of the voting strategies is essential to the success of the
retrieval. The RANSAC refinement is the only one able to
retrieve at least one relevant image within the first five images,
which is the main criterion for a user in this kind of
task. A further improvement could be the estimation of
more complex transformations that are more robust to perspective
changes.
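To make the refinement step concrete, a minimal sketch of RANSAC estimation of a 2D affine transform from putative keypoint matches is given below. It is an illustration under stated assumptions, not the code used for the experiments: the function names, the iteration count, the pixel tolerance and the fixed random seed are all hypothetical choices.

```python
import random


def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def solve3(m, v):
    """Solve the 3x3 system m x = v by Cramer's rule; None if singular."""
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(det3(mc) / d)
    return solution


def fit_affine(pairs):
    """Exact affine map u = a x + b y + c, v = d x + e y + f from 3 matches."""
    m = [[x, y, 1.0] for (x, y), _ in pairs]
    row_u = solve3(m, [dst[0] for _, dst in pairs])
    row_v = solve3(m, [dst[1] for _, dst in pairs])
    if row_u is None or row_v is None:
        return None  # the three source points are collinear
    return row_u, row_v


def apply_affine(model, point):
    (a, b, c), (d, e, f) = model
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


def ransac_affine(matches, iterations=200, tolerance=2.0, seed=0):
    """Return (model, inliers) for the affine map with the most inliers.

    matches: list of ((x, y), (u, v)) putative keypoint correspondences.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    if len(matches) < 3:
        return best_model, best_inliers
    for _ in range(iterations):
        model = fit_affine(rng.sample(matches, 3))
        if model is None:
            continue
        inliers = []
        for src, dst in matches:
            u, v = apply_affine(model, src)
            if (u - dst[0]) ** 2 + (v - dst[1]) ** 2 <= tolerance ** 2:
                inliers.append((src, dst))
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

In such a scheme, the number of inliers of the best model can replace the raw vote count when ranking the candidate images, so that matches that are not geometrically consistent no longer contribute to the score.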
However, overall results largely depend on the database
content. In the case of a small database (which can be
obtained through geolocalization) with well-taken pictures,
like the first one we used, the results are good enough to be
used in the intended application. For the second database,
the quality of the results is very low, making them inadequate
for our applications. This lack of quality might be an
intrinsic characteristic of SIFT when confronted with images
like ours, which contain many problematic features (complex
shadows, trees, branches, etc.) that spawn a huge number
of descriptors with low discriminative power. Those points
dramatically increase the number of false matches, inflating
the rank of non-relevant images (as in Fig. 14, where the
non-relevant image has more matches than the relevant images).
As an improvement, we suggest filtering the database in order
to remove points that are not informative.
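The text does not specify a filtering criterion, so the sketch below shows only one possible heuristic, given as an assumption: a database descriptor is discarded when near-identical descriptors occur in more than a given number of other images, on the grounds that such a point cannot discriminate between them. The names `filter_uninformative`, `radius` and `max_other_images` are hypothetical, and the quadratic scan is for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filter_uninformative(database, radius=0.5, max_other_images=1):
    """Keep descriptors that have close neighbours in few other images.

    database: list of (image_id, descriptor) pairs (hypothetical layout).
    A descriptor is dropped when descriptors within `radius` of it are
    found in more than `max_other_images` other images.
    """
    kept = []
    for image_id, desc in database:
        other_images = {
            other_id
            for other_id, other in database
            if other_id != image_id
            and squared_distance(desc, other) <= radius ** 2
        }
        if len(other_images) <= max_other_images:
            kept.append((image_id, desc))
    return kept
```

Such a threshold would need tuning on the target database, since images of the same scene legitimately share descriptors and an overly strict setting would also remove useful points.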
To conclude, we consider the extension of keypoint-based
methods from copy detection to the matching of scenes in
difficult contexts as not successful. We think there is more
work to do both on the descriptors and on the matching
process. We intend to share our databases and ground truth
with the community in order to allow the benchmarking of
those tasks on difficult images.
Figure 14: False matching between two images after geometric
consistency check.
REFERENCES
Fischler, M. A. and Bolles, R. C., 1981. Random sample consensus:
a paradigm for model fitting with applications to image analysis
and automated cartography. Commun. ACM 24(6), pp. 381-395.
Friedman, J., Bentley, J. L. and Finkel, R. A., 1976. An algorithm
for finding best matches in logarithmic expected time. Technical
report, Stanford, CA, USA.
Indyk, P. and Motwani, R., 1998. Approximate nearest neighbors:
towards removing the curse of dimensionality. In: STOC ’98:
Proceedings of the thirtieth annual ACM symposium on Theory
of computing, ACM, New York, NY, USA, pp. 604-613.
Jegou, H., Douze, M. and Schmid, C., 2008. Hamming embedding
and weak geometric consistency for large scale image
search. In: D. Forsyth, P. Torr and A. Zisserman (eds), European
Conference on Computer Vision, LNCS, Vol. I, Springer, pp. 304-317.
Kleinberg, J. M., 1997. Two algorithms for nearest-neighbor
search in high dimensions. In: STOC '97: Proceedings of the
twenty-ninth annual ACM symposium on Theory of computing,
ACM, New York, NY, USA, pp. 599-608.
Lowe, D., 2003. Distinctive image features from scale-invariant
keypoints. In: International Journal of Computer Vision, Vol. 20,
pp. 91-110.
Schmid, C. and Mohr, R., 1997. Local grayvalue invariants for
image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 19(5),
pp. 530-535.
Valle, E., 2008. Local-Descriptor Matching for Image Identification
Systems. PhD thesis, Univ. Cergy-Pontoise, ETIS, UMR
CNRS 8051. Supervisors: S. Philipp-Foliguet, M. Cord.
Valle, E., Cord, M. and Philipp-Foliguet, S., 2008. High-dimensional
descriptor indexing for large multimedia databases.
In: CIKM '08: Proceeding of the 17th ACM conference on Information
and knowledge management, ACM, New York, NY,
USA, pp. 739-748.