We assume that we have a GPS-enabled camera roaming a 
scene that is partially (or completely) covered in a 3D model 
database. Sensor imagery is tagged by a time stamp, while the 
GPS sensor allows us to tag each frame with approximate 
position information. Our objective is to determine the camera's 
pose and update the sensor's location. 
Our approach can be characterized as a two-step procedure: 
— the first step is the use of an image query-based scheme 
to determine the approximate location and orientation of a few 
select anchor frames, and 
— the second step entails the relative orientation of the 
remaining frames (relative to the anchor frames). 
Thus we first determine directly the precise orientation 
parameters of a few anchor frames, and then determine minor 
corrections to these parameters in order to express the 
orientation of the intermediate frames. This is visualized in 
Fig. 1. Anchor frames may be selected at predetermined temporal 
intervals (e.g. once every couple of minutes), or at 
predetermined spatial intervals (e.g. once every 50 meters). 
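The selection rule above can be illustrated with a minimal sketch. It assumes frames arrive as hypothetical `(timestamp_s, east_m, north_m)` tuples, and it combines the temporal and spatial criteria in one function, which is an assumption made for illustration; the text presents them as alternatives. The function name and the default thresholds (120 s for "a couple of minutes", 50 m) are likewise illustrative.

```python
import math


def select_anchor_frames(frames, max_seconds=120.0, max_meters=50.0):
    """Return the indices of frames chosen as anchors.

    frames: list of (timestamp_s, east_m, north_m) tuples in capture order.
    A new anchor is declared when either the elapsed time or the
    straight-line distance since the last anchor reaches its threshold.
    """
    if not frames:
        return []
    anchors = [0]  # the first frame always starts the sequence
    t0, e0, n0 = frames[0]
    for i, (t, e, n) in enumerate(frames[1:], start=1):
        elapsed = t - t0
        moved = math.hypot(e - e0, n - n0)
        if elapsed >= max_seconds or moved >= max_meters:
            anchors.append(i)
            t0, e0, n0 = t, e, n  # measure the next interval from here
    return anchors


# Example: one frame per second, camera moving east at 1 m/s.
frames = [(float(i), float(i), 0.0) for i in range(121)]
anchor_indices = select_anchor_frames(frames)
```

In this example the 50 m spatial threshold is reached before the 120 s temporal one, so anchors fall on every fiftieth frame.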
[Figure 1: two anchor frames, each with very accurate camera 
orientation information, enclose intermediate frames whose 
orientation is computed from differences from the previous frame.] 
Figure 1 Proposed two-step approach scheme 
The proposed scheme resembles the MPEG compression standards. 
In MPEG compression a few frames in the video sequence are 
chosen to act as anchor frames, and they are compressed 
independently, much like JPEG images. For the rest of the frames 
the MPEG compression scheme saves only the changes between 
consecutive frames. Drawing from this MPEG philosophy, we 
compute accurate sensor position information directly at a few 
select instances (the equivalent of anchor frames). The 
orientation of intermediate frames is recovered by analyzing 
changes in image content (the location and size of object 
facades in them). 
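As a rough illustration of this anchor-plus-changes idea, the sketch below chains per-frame corrections onto an anchor pose. It assumes the corrections are already available as simple additive increments; how they are recovered from image content is not modeled here. The `Pose` class, the `propagate` function, and the additive update are hypothetical simplifications, not the method described in the text.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    x: float            # east coordinate, metres
    y: float            # north coordinate, metres
    z: float            # height, metres
    azimuth_deg: float  # viewing direction, degrees in [0, 360)


def propagate(anchor, deltas):
    """Chain per-frame corrections onto an anchor pose.

    deltas: list of (dx, dy, dz, d_azimuth_deg), one per intermediate
    frame, each expressed relative to the previous frame.
    Returns the anchor pose followed by one pose per intermediate frame.
    """
    poses = [anchor]
    for dx, dy, dz, d_az in deltas:
        prev = poses[-1]
        poses.append(Pose(prev.x + dx,
                          prev.y + dy,
                          prev.z + dz,
                          (prev.azimuth_deg + d_az) % 360.0))
    return poses


# Example with made-up values: an anchor pose and two intermediate frames.
anchor = Pose(100.0, 200.0, 1.5, 350.0)
poses = propagate(anchor, [(1.0, 0.0, 0.0, 5.0), (1.0, 0.5, 0.0, 10.0)])
```

Because every intermediate pose depends on the one before it, errors accumulate until the next anchor frame resets the chain, which is the same role anchor frames play in MPEG.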
Fig. 2 shows the main algorithmic steps of our approach. There 
are two clusters of processes, corresponding to anchor frame 
processing (left) and intermediate frame pose estimation 
(right). Our work on anchor frame orientation estimation through 
image queries has been presented to some extent in 
[Georgiadis C. et al., 2002]. Briefly, our innovative approach 
integrates image queries with image registration and sensor 
orientation. Classic image queries aim to retrieve images from a 
database based on certain image characteristics. In our approach 
we use image queries to recover sensor orientation information 
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B5. Istanbul 2004 
by comparing abstract metrics of a scene configuration in an 
image to the corresponding configuration in a geospatial 
database. This is complemented by an adjustment of the 
collinearity equations to determine sensor position. Thus we 
integrate image retrieval and orientation estimation in a single process. 
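For reference, the collinearity equations mentioned above are commonly written in the following textbook form. The notation is an assumption and is not taken from this paper: (x, y) are image coordinates, (x_0, y_0) is the principal point, f is the focal length, (X, Y, Z) is the object point, (X_C, Y_C, Z_C) is the camera projection center, and r_ij are the elements of the rotation matrix from the world frame to the camera frame.

```latex
\begin{aligned}
x &= x_0 - f\,\frac{r_{11}(X - X_C) + r_{12}(Y - Y_C) + r_{13}(Z - Z_C)}
                   {r_{31}(X - X_C) + r_{32}(Y - Y_C) + r_{33}(Z - Z_C)} \\
y &= y_0 - f\,\frac{r_{21}(X - X_C) + r_{22}(Y - Y_C) + r_{23}(Z - Z_C)}
                   {r_{31}(X - X_C) + r_{32}(Y - Y_C) + r_{33}(Z - Z_C)}
\end{aligned}
```

In an adjustment, the camera position and the rotation angles that define r_ij are the unknowns, and each matched image/object point pair contributes two such equations.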
The advantage of this orientation-by-queries approach to anchor 
frame orientation is that it produces very accurate results, 
while its drawback is that it requires good approximate values 
for initialization. However, this is consistent with our overall 
assumed mode of operation. As we assume the use of a GPS-enabled 
camera in an urban environment, it is realistic to consider that 
the accuracy of the initial approximations of sensor locations 
is on the order of 3-10 meters. This is visualized in Fig. 3, 
with the big red sphere representing the uncertainty of the 
approximation (the actual location can be anywhere within this 
sphere). 
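One simple use of this uncertainty sphere is as a plausibility check on an adjusted camera position. The sketch below is only an illustration: the paper does not describe such a check, the function name is hypothetical, and the 10 m default radius is simply the upper end of the 3-10 meter range quoted above.

```python
import math


def within_gps_uncertainty(gps_fix, adjusted, radius_m=10.0):
    """Return True if the adjusted position lies inside the sphere.

    gps_fix, adjusted: (x, y, z) in metres, in the same world frame.
    radius_m: assumed radius of the GPS uncertainty sphere.
    """
    return math.dist(gps_fix, adjusted) <= radius_m


# Example with made-up coordinates: the adjusted position is 5 m from
# the GPS fix, so it is consistent with the assumed uncertainty.
gps_fix = (500.0, 1200.0, 30.0)
adjusted = (503.0, 1204.0, 30.0)
plausible = within_gps_uncertainty(gps_fix, adjusted)
```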
[Figure 2 flowchart. Shared steps: enhanced video capture with 
position and time information; object delineation creating rough 
building outlines (blobs). Anchor frame processing (left): access 
to VR model; single- and multi-object queries; image registration 
and exterior orientation. Intermediate frame pose estimation 
(right): interest point extraction for each building facade; 
feature matching among consecutive frames; computation of the 
transformation parameters for each building facade between two 
consecutive frames; determination of the new camera position.] 
Figure 2 Approach outline 
We already have approximate values for the position of the 
camera from the GPS sensor, but we do not have any 
information about the rotation angles. The nature of the 
problem (close range applications) makes the whole system 
sensitive to the rotation angles and to noise. We assume that 
the remaining camera rotations are close to zero, so the problem 
reduces to finding an approximate value for a single rotation 
angle, specifically the rotation around the Z axis of the world 
reference system. This is essentially the azimuth: the angle 
between true north and the direction in which the camera is 
looking.
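To make the azimuth convention concrete, the sketch below converts between an azimuth and a horizontal viewing direction. It assumes a world frame with X pointing east and Y pointing north, and an azimuth measured clockwise from north; both are assumptions, since the paper does not state its axis convention. The function names are hypothetical.

```python
import math


def heading_vector(azimuth_deg):
    """Unit (east, north) vector for an azimuth measured clockwise from north."""
    a = math.radians(azimuth_deg)
    return (math.sin(a), math.cos(a))


def azimuth_between(camera_en, target_en):
    """Azimuth in degrees, in [0, 360), from the camera to a target point.

    camera_en, target_en: (east, north) coordinates in metres.
    """
    d_east = target_en[0] - camera_en[0]
    d_north = target_en[1] - camera_en[1]
    # atan2(east, north) gives the angle from north, positive toward east.
    return math.degrees(math.atan2(d_east, d_north)) % 360.0


# Example with made-up coordinates: a target due east of the camera.
az_east = azimuth_between((0.0, 0.0), (10.0, 0.0))
east_dir = heading_vector(az_east)
```

With only this one angle unknown, an approximate azimuth can be refined later by the adjustment, while the other two rotations start from zero.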

