In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009 
STUDY OF SIFT DESCRIPTORS FOR IMAGE MATCHING BASED LOCALIZATION IN URBAN 
STREET VIEW CONTEXT 
David Picard 1 , Matthieu Cord 1 and Eduardo Valle 2 
1 LIP6 UPMC 
Paris 6 
104 avenue du Président Kennedy 
75016 Paris FRANCE 
{david.picard, matthieu.cord} @lip6.fr 
2 ETIS, CNRS, ENSEA, Univ Cergy-Pontoise, 
F-95000 Cergy-Pontoise 
mail @eduardovalle.com 
KEY WORDS: Image, Databases, Matching, Retrieval, Urban, High resolution 
ABSTRACT 
In this paper we evaluate the quality of vote-based retrieval using SIFT descriptors in a database of street view photography,
a challenging context where the fraction of mismatched descriptors tends to be very high. This work is part of the
iTowns project, for which high resolution street views of Paris have been taken. The goal is to retrieve the views of an
urban scene given a query picture. We have carried out experiments for several techniques of image matching, including
a post-processing step to check the geometric consistency of the results. We have shown that the efficiency of SIFT based
matching depends largely on the image database content, and that the post-processing step is essential to the retrieval
performance.
1 INTRODUCTION 
In this paper, we evaluate the effectiveness of a voting strategy
using SIFT descriptors for near-duplicate retrieval of
urban scenes. We have observed that, compared to previously
reported applications of SIFT (object recognition,
stereoscopy, etc.) (Lowe, 2003), this context presents the
challenge of a very high rate of descriptor mismatches,
due to the complexity of both the scene and the transformations
it might undergo. We have thus evaluated how different
strategies to filter out the false matches can improve
the effectiveness of retrieval.
This study is part of the iTowns project, which aims to
define a new generation of multimedia web tools that
mix a broadband 3D geographic image-based browser
with an image-based search engine 1 . Fig. 1 shows an example
of pictures taken for the project.
The first goal of this new type of search engine is to retrieve,
in the high-resolution database, the scene corresponding
to a given query image. Let us imagine the following
scenario: a user is looking for information about a restaurant
in front of him (feedback from patrons, for instance).
He takes a picture of the restaurant with his phone and sends
it to the iTowns web server. The image is matched against the
database and the desired information is retrieved and sent
back to the user.
In order to accomplish this goal, there are three
steps to perform:
1. Match the query image with the corresponding scene
in the database.
2. Find information associated with the scene and related
to the query.
3. Retrieve only relevant information regarding the user's
interests.
1 See http://itowns.ign.fr
In this paper, we focus on the first step, and consider the
use of state-of-the-art techniques for near-duplicate image
matching. Recently, techniques have been developed for
the detection of copies where the transformations between images
are well known (rotation, scaling, global illumination
change, etc.). Those techniques involve the extraction of
points of interest in the images, then the matching of the
points in the query with the points in the database, and the
aggregation of the matches for images of the database using
a voting strategy. We try to extend these techniques
to the matching of images with less constrained, and thus
more realistic, transformations (change of viewpoint, local
illumination, etc.).
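The matching and vote aggregation stages described above can be illustrated with a minimal sketch. It assumes the descriptors have already been extracted, uses a brute-force nearest-neighbour search for clarity (not the approximate search of section 3), and works on toy 2-D vectors rather than real SIFT descriptors. The function name `vote_retrieval` and the data layout are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist


def vote_retrieval(query_descriptors, database, k=1):
    """Rank database images by the number of descriptor matches (votes).

    database: list of (image_id, descriptor) pairs, one per keypoint.
    Brute-force k-NN is an assumption of this sketch, used for clarity.
    """
    votes = Counter()
    for q in query_descriptors:
        # k nearest database descriptors by Euclidean distance
        nearest = sorted(database, key=lambda entry: dist(q, entry[1]))[:k]
        for image_id, _ in nearest:
            # each matched descriptor casts one vote for its source image
            votes[image_id] += 1
    # images sorted by decreasing vote count
    return votes.most_common()


# Toy data: two images with two descriptors each (illustrative values only).
database = [
    ("img_a", (0.0, 0.0)),
    ("img_a", (1.0, 0.0)),
    ("img_b", (5.0, 5.0)),
    ("img_b", (6.0, 5.0)),
]
query = [(0.1, 0.0), (0.9, 0.1), (5.2, 5.1)]
ranking = vote_retrieval(query, database)
```

In this toy case two query descriptors fall closest to descriptors of `img_a` and one to `img_b`, so `img_a` is ranked first. Every query descriptor votes even when its nearest neighbour is a poor match, which is why filtering false matches matters when the mismatch rate is high.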
The paper is organized as follows: the next section introduces
keypoint-based image matching. We explain in section 3
the strategy used to perform an efficient approximate
k-NN search in the database in order to associate query
points with points in the database. Then, we detail in section 4
the geometric consistency check used to filter irrelevant
matches. Experiments are done on two representative subsets
of the iTowns collection, and results are shown in section 5,
before we conclude.
2 KEYPOINTS BASED IMAGE MATCHING 
The essential elements of keypoint-based image matching 
appeared in (Schmid and Mohr, 1997): the use of points of 
interest, local descriptors computed around those points, a 
dissimilarity criterion based on a vote-counting algorithm, 
and a step of consistency checking on the matches before 
the final vote count and ranking of the results. We use 
the SIFT points of interest (Lowe, 2003) to describe the
	        