Technical Commission III (B3)

LOCATION DETERMINATION IN URBAN ENVIRONMENT FROM IMAGE SEQUENCES

Qingming Zhan, Yubin Liang, Yinghui Xiao

School of Urban Design, Wuhan University, Wuhan 430072, China
Research Center for Digital City, Wuhan University, Wuhan 430072, China
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
qmzhan@whu.edu.cn; lyb.whu@gmail.com; yhxiaoitc@126.com
KEY WORDS: Location Determination Problem, Bundle Adjustment, Image Matching, Point Transfer, Pose Estimation, RANSAC
ABSTRACT: 
Location Determination Problem (LDP) is a classic and interesting problem for both the photogrammetry and computer vision
communities: given an image depicting a set of landmarks with known locations, determine that point in space from which the image
was obtained. In this paper we use image sequences to automatically solve the LDP in a local Euclidean space, so that no
georeference information is needed. Overlapping image sequences are preferable for matching images obtained in cities. We
implement a method that semi-automatically solves the LDP in urban scenarios using a state-of-the-art 3D reconstruction system.
1. INTRODUCTION
Nowadays Google Maps and other city-scale 3D reconstruction
systems with street view are widely used for visual exploration
of cities. Those systems often rely on structured photos captured
using sensors equipped with GPS and inertial navigation units,
which makes post-processing much easier. However, these
systems only cover large cities and famous avenues attractive to
tourists. Furthermore, many daily vision-related applications
such as augmented reality do not need absolute georeference
information; location information in a local space is enough.
For example, given an image taken from a place (Figure 1a),
one can guess that the photo was taken from a window of a
nearby building (Figure 1b) according to the viewing direction
of the given image, but it is difficult to pinpoint the precise
location. The authors of this paper are interested in such a
problem: given an image I, locate the place in another image J
where image I was taken.
Figure 1: Given image (Figure 1a) and the building where the image was taken (Figure 1b)
The problem is defined as 'space resection' in the
photogrammetry community and 'pose estimation' or 'extrinsic
camera calibration' in the computer vision community. Extrinsic
camera calibration is often carried out in a calibration field
using well-designed targets/rigs. This is not the case in our
problem because there are no pre-installed rigs and image I is
taken arbitrarily. The difference between space resection and
pose estimation is that the given image points in space resection
are georeferenced, whereas pose estimation is usually performed in a local
Euclidean space. Location Determination Problem is a general 
definition both for the photogrammetry and computer vision 
communities: Given a set of m control points, whose 3- 
dimensional coordinates are known in some coordinate frame, 
and given an image in which some subset of the m control 
points is visible, determine the location (relative to the 
coordinate system of the control points) from which the image 
was obtained. (Fischler, Bolles, 1981) presented the well-known
model-fitting paradigm Random Sample Consensus (RANSAC)
and used model inliers to solve the "perspective-n-point" (PnP)
problem. The PnP problem, which is an equivalent but
mathematically more concise statement of the LDP, was originally
defined in (Fischler, Bolles, 1981) as: Given the relative spatial
locations of n control points, and given the angle to every pair 
of control points from an additional point called the Center of 
Perspective (CP), find the lengths of the line segments ("legs") 
joining the CP to each of the control points. The aim of the 
Perspective-n-Point problem (PnP) is to determine the position 
and orientation of a camera given its intrinsic parameters and a 
set of n correspondences between 3D points and their 2D 
projections (Moreno-Noguer, Lepetit, Fua, 2007). Therefore,
the solution to our problem mainly resorts to pose estimation in
a reconstructed local Euclidean space. To automate the process,
a 3D reconstruction of the scene depicted in image I is computed
first. The reconstructed scene is then used to determine the 3D
position of the viewpoint of image I, and the calculated 3D
coordinate of the viewpoint is projected into image J to visually
locate the position.
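The resection step just described can be sketched numerically. The following is a minimal sketch, assuming noise-free, already-matched 3D-2D correspondences and a simple Direct Linear Transform (DLT) estimator; the function name is ours, and a practical pipeline would wrap such an estimator in the RANSAC loop of (Fischler, Bolles, 1981) to reject mismatches:

```python
import numpy as np

def camera_center_dlt(points_3d, points_2d):
    """Estimate the camera centre from n >= 6 matched 3D-2D points
    via the Direct Linear Transform (a simple form of space resection)."""
    n = len(points_3d)
    A = np.zeros((2 * n, 12))
    for i, (X, x) in enumerate(zip(points_3d, points_2d)):
        Xh = np.append(X, 1.0)          # homogeneous 3D point
        u, v = x
        # Each correspondence gives two linear equations in the
        # entries of the 3x4 projection matrix P:
        #   Xh . p1 - u (Xh . p3) = 0  and  Xh . p2 - v (Xh . p3) = 0
        A[2 * i, 0:4] = Xh
        A[2 * i, 8:12] = -u * Xh
        A[2 * i + 1, 4:8] = Xh
        A[2 * i + 1, 8:12] = -v * Xh
    # Solve A p = 0 by SVD; p is the flattened projection matrix.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    # The camera centre is the right null vector of P (P C = 0).
    _, _, Vt_p = np.linalg.svd(P)
    C = Vt_p[-1]
    return C[:3] / C[3]
```

With six or more correspondences in general position, the null vector of the stacked design matrix recovers the projection matrix up to scale, and the camera centre follows as its right null vector.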
Nowadays, given a set of overlapping images of a scene shot
from nearby camera locations, it is easy to create a panorama
that seamlessly combines the original images and to reconstruct
the 3D scene using correspondences extracted among several
images. (Fitzgibbon, Zisserman, 1998) presented a method that
simultaneously localizes the cameras and acquires a sparse 3D
point cloud of the imaged objects from closed or open image
sequences. (Lowe, 1999; Lowe, 2004) presented the Scale
Invariant Feature Transform (SIFT) operator to extract features
that are invariant to image scale and rotation, which can be used
to robustly match images across a substantial range of affine
distortion and change in 3D viewpoint. (Zhang, Kosecka, 2006)
used SIFT to geo-locate images by finding geo-tagged matches
in a pre-built database, but we do not assume geo-location
information such as geo-tags in our research, for generality.
(Snavely et al., 2006; Snavely et al., 2008) presented a
state-of-the-art system (called Bundler) that can automatically
structure large collections of unordered images, and the
Structure From Motion (SFM) vision algorithms have been
scaled up to work on entire cities (Agarwal et al., 2011) using
photographs obtained from image-sharing websites such as Flickr.
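Lowe's matching strategy accepts a putative correspondence only when its nearest neighbour in descriptor space is markedly closer than the second-nearest. A minimal numpy sketch of this ratio test follows; the function name and the toy low-dimensional descriptors in the test are our own illustrative assumptions, while the distance-ratio threshold of 0.8 follows Lowe (2004):

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Match rows of desc1 to rows of desc2 using Lowe's
    nearest-neighbour distance-ratio test."""
    matches = []
    for i, d in enumerate(desc1):
        # Euclidean distance from descriptor d to every candidate in desc2.
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        # Keep only unambiguous matches: the best candidate must be
        # clearly closer than the runner-up.
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```

Ambiguous features, such as those on repetitive building facades, tend to have two nearly equidistant neighbours and are discarded by the test, which is what makes it effective in urban imagery.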