2. MOBILE IMAGE ORIENTATION: GENERAL
OVERVIEW
We assume that we have a GPS-enabled camera roaming a
scene that is partially (or completely) covered by a 3D model
database. Sensor imagery is tagged with a time stamp, while the
GPS sensor allows us to tag each frame with approximate
position information. Our objective is to determine the camera's
pose and update the sensor's location.
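As a minimal sketch (the names and fields below are our own illustration, not from the paper), each incoming frame can be bundled with its time stamp and approximate GPS position:

from dataclasses import dataclass
from typing import Tuple

@dataclass
class TaggedFrame:
    # A video frame tagged with capture time and an approximate GPS position.
    frame_id: int
    timestamp: float                       # capture time in seconds
    gps_xyz: Tuple[float, float, float]    # approximate sensor position (X, Y, Z)
    is_anchor: bool = False                # set later by the anchor selection step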
Our approach can be characterized as a two-step procedure:
— the first step is the use of an image query-based scheme
to determine the precise location and orientation of a few
select anchor frames, and
— the second step entails the relative orientation of the
remaining frames (relative to the anchor frames).
Thus we proceed by directly determining the precise orientation
parameters of a few anchor frames, and then determine minor
corrections to these parameters in order to express the
orientation of the intermediate frames. This is visualized in Fig.
1. Anchor frames may be selected at pre-determined temporal
intervals (e.g. once every couple of minutes) or at pre-
determined spatial intervals (e.g. once every 50 meters).
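For illustration only (the thresholds and function below are assumptions, reusing the TaggedFrame sketch above), anchors can be flagged whenever the elapsed time or the distance traveled since the last anchor exceeds a threshold:

import math

def select_anchors(frames, max_dt=120.0, max_dist=50.0):
    # Flag a frame as an anchor when the time since the last anchor
    # exceeds max_dt seconds or the distance exceeds max_dist meters.
    anchors = []
    last = None
    for f in frames:
        if last is None or f.timestamp - last.timestamp >= max_dt \
                or math.dist(f.gps_xyz, last.gps_xyz) >= max_dist:
            f.is_anchor = True
            anchors.append(f)
            last = f
    return anchors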
Figure 1 Proposed two-step approach scheme: anchor frames carry very accurate camera orientation information, while the orientation of each intermediate frame is computed from its differences from the previous frame.
As can be seen, the proposed scheme bears a similarity to the
MPEG compression standards. In MPEG compression a few
frames in the video sequence are chosen to act as anchor
frames, and they are compressed as JPEG files. For the rest of
the frames the MPEG compression scheme saves only the changes
between consecutive frames. Drawing from this MPEG
philosophy, we proceed by directly computing accurate sensor
position information at a few select instances (the equivalent of
anchor frames). The orientation of the intermediate frames is
recovered by analyzing changes in image content (the location
and size of object façades within them).
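A minimal sketch of this propagation idea (the correction structure below is our assumption for illustration; in the actual approach the corrections come from the observed changes in façade location and size):

def propagate_poses(anchor_pose, deltas):
    # Chain small frame-to-frame corrections onto an accurately oriented
    # anchor pose, in analogy to MPEG: anchors are solved directly,
    # intermediate frames carry only incremental changes.
    poses = [anchor_pose]
    for d in deltas:                       # one correction per intermediate frame
        x, y, z = poses[-1]["position"]
        dx, dy, dz = d["d_position"]
        poses.append({
            "position": (x + dx, y + dy, z + dz),
            "azimuth": poses[-1]["azimuth"] + d["d_azimuth"],
        })
    return poses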
In Fig. 2 we can identify the main algorithmic steps of our
approach, grouped into two clusters of processes
corresponding to anchor frame processing (left) and
intermediate frame pose estimation (right). Our work on anchor
frame orientation estimation through image queries has been
presented to some extent in [Georgiadis et al., 2002]. Briefly,
we should mention here that our innovative approach integrates
image queries with image registration and sensor orientation.
Classic image queries aim to retrieve images from a
database based on certain image characteristics. In our approach
we use image queries to recover sensor orientation information
by comparing abstract metrics of a scene configuration in an
image to the corresponding configuration in a geospatial
database. This is complemented by an adjustment of the
collinearity equations to determine sensor position. Thus we
integrate image retrieval and orientation estimation in a single
step.
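For reference, and in standard photogrammetric notation rather than the paper's own, the collinearity equations express an image point $(x, y)$ in terms of its object point $(X, Y, Z)$, the sensor position $(X_L, Y_L, Z_L)$, the focal length $f$, the principal point $(x_0, y_0)$, and the elements $m_{ij}$ of the rotation matrix $M$:

$$x = x_0 - f\,\frac{m_{11}(X - X_L) + m_{12}(Y - Y_L) + m_{13}(Z - Z_L)}{m_{31}(X - X_L) + m_{32}(Y - Y_L) + m_{33}(Z - Z_L)}, \qquad
y = y_0 - f\,\frac{m_{21}(X - X_L) + m_{22}(Y - Y_L) + m_{23}(Z - Z_L)}{m_{31}(X - X_L) + m_{32}(Y - Y_L) + m_{33}(Z - Z_L)}$$

Given good initial approximations, these equations are linearized and adjusted iteratively to solve for the unknown sensor position and rotation.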
The advantage of this orientation-by-queries approach to anchor
frame orientation is that it produces very accurate results, while
its drawback is that it requires good approximate values in order
to initialize it. However, this is in accordance with our overall
assumed modus operandi. As we assume the use of a GPS-
enabled camera in an urban environment, it is realistic to
consider that the accuracy of the initial approximations of
sensor locations is on the order of 3-10 meters. This is
visualized in Fig. 3, with the large red sphere representing
uncertainty of the approximation (the actual location can be
anywhere within this sphere).
Figure 2 Approach outline. Anchor frame processing (left): enhanced video capture with position and time information; object delineation creating rough building outlines (blobs); access to the VR model; single and multi-object queries; image registration and exterior orientation. Intermediate frame pose estimation (right): interest point extraction for each building façade; feature matching among consecutive frames; computation of the transformation parameters for each building façade between two consecutive frames; determination of the new camera position.
We already have approximate values for the position of the
camera from the GPS sensor, but we do not have any
information about the rotation angles. The nature of the
problem (close-range applications) makes the whole system
sensitive to the rotation angles and to noise. We assume that the
rotation of the camera axis will be near zero, so our problem
is to find an approximate value for just one rotation angle:
the rotation angle around the Z axis in a world
reference system, which is basically the azimuth, i.e., the angle
between true north and the direction in which the camera is looking.
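Under this assumption (our notation, not the paper's), setting the two tilt angles $\omega = \varphi = 0$ reduces the rotation matrix to a single rotation by the azimuth $\kappa$ about the Z axis:

$$M = R_Z(\kappa) = \begin{pmatrix} \cos\kappa & \sin\kappa & 0 \\ -\sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

so only $\kappa$ requires an initial approximation.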