DENSE IMAGE MATCHING IN AIRBORNE VIDEO SEQUENCES
M. Gerke
International Institute for Geo-Information Science and Earth Observation - ITC, Department of Earth
Observation Science, Hengelosestraat 99, P.O. Box 6, 7500AA Enschede, The Netherlands, gerke@itc.nl
ICWG III/V
KEY WORDS: Video, Surface, Matching, Resolution
ABSTRACT:
The use of airborne video data is gaining increasing attention in the photogrammetric community. This interest is driven by the
availability of low-cost sensor platforms like UAVs and low-cost sensors such as digital (video) consumer cameras. Moreover, a wide
range of applications are related to this kind of sensor data, e.g. fast mapping in case of disasters, where geometric and semantic
information on a particular scene has to be captured within a small timeframe.
The advantage of video data over wide-baseline images is that tracking algorithms can be used to derive highly redundant tie point
information in a fully automatic manner. One drawback is that due to the reduced resolution and only short exposure time, the image
quality is worse compared to the quality provided by mapping cameras. However, the many-fold overlapping enables the use of
multiframe super resolution techniques to obtain higher quality textures.
In this paper the focus lies on the dense surface reconstruction using airborne video sequences. The first step in the approach consists
of retrieving the structure and motion of the cameras, also incorporating geometric knowledge on the scene. In the subsequent step a
dense surface reconstruction is applied. First, appropriate image combinations for the stereo matching are selected. After rectification,
the Semi-Global Matching technique is applied, using the Mutual Information approach for retrieving local energy costs. After the
matches are linked, super resolution images are computed and 3D point clouds are derived by forward intersection.
The results for two datasets show that the super resolution images have a higher nominal resolution than the original ones. As the
accuracy of the forward intersection depends on the actual image acquisition parameters, the unfiltered 3D point cloud could be noisy.
Therefore, some further improvements for the 3D point coordinates are identified.
1 INTRODUCTION
For many applications dense surface reconstruction from images
is becoming an interesting alternative to laser scanning. In the
context of airborne remote sensing, metric digital cameras are
available which are able to acquire high resolution images with a
high overlap ratio. This availability stimulates the development of
sophisticated approaches to dense matching and surface
reconstruction (Hirschmüller et al., 2005, Zebedin et al., 2006). The
advantage over LIDAR in those cases is that besides the derivation
of a DSM, further products like (true) orthoimages of high
resolution are computable right away.
Dense surface reconstruction is also interesting in other fields;
in close range applications the focus is on the reconstruction of
single (man-made) objects or even whole cities. In those cases the
high overlap is often achieved by using video data, see e.g.
(Pollefeys et al., 2004). The advantage of video over single
wide-baseline shots is the high redundancy of observations through
the high overlap, which can be exploited to retrieve
correspondences and thus camera pose and calibration information
through tracking algorithms (Shape from Motion).
In between those two domains - airborne remote sensing being
primarily used for mapping purposes and video based reconstruction
of man-made objects - one can find the field of airborne remote
sensing from low altitude platforms, like helicopters or Unmanned
Airborne Vehicles (UAVs) (Eisenbeiss and Zhang, 2006,
Förstner and Steffen, 2007). Due to their flexibility and low
operating costs, UAVs are interesting for many applications. A
UAV equipped with a video camera makes it possible to combine an
overview of a certain area of interest with the advantages
of using dense image sequences to retrieve geometric and semantic
information. The challenges one faces when working with
this kind of data are manifold, e.g. the motion of the vehicle may
not be smooth, and the image scale might be smaller than in the
aforementioned cases, influencing the achievable accuracy and
reliability.
The focus of this paper is on the implementation of a strategy
for dense image matching in airborne video sequences. The goal
is to derive two datasets: the first consists of so-called super
resolution images, where the multiple observations of the scene of
interest are exploited to derive noise reduced images with a higher
nominal resolution than the original ones. The second dataset is a
dense 3D point cloud derived by forward intersection of the matched
points. The paper is meant as a case study in which known
approaches and algorithms are used to set up a practical workflow
for the processing of airborne video data. The results will show
the potential of the applied techniques, but also reveal some open
issues.
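
As a rough illustration of what forward intersection of matched points involves, the following sketch triangulates one point from its image coordinates in several frames using the standard linear (DLT) formulation. It is not the implementation used in this paper. The function names, the calibration matrix K, the camera positions and the test point are all hypothetical values chosen for the example, and the observations are noise-free.

```python
import numpy as np


def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


def forward_intersect(cameras, observations):
    """Linear (DLT) forward intersection of one point seen in several frames.

    cameras:      list of 3x4 projection matrices
    observations: list of (x, y) pixel coordinates, one per camera
    """
    rows = []
    for P, (x, y) in zip(cameras, observations):
        # Each observation contributes two linear equations in the
        # homogeneous 3D point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.asarray(rows)
    # The solution is the right singular vector belonging to the
    # smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]


# Hypothetical calibration and camera positions (assumptions for this sketch).
K = np.array([[1000.0, 0.0, 360.0],
              [0.0, 1000.0, 288.0],
              [0.0, 0.0, 1.0]])


def camera_at(cx):
    """Camera looking along +Z with projection centre (cx, 0, 0)."""
    Rt = np.hstack([np.eye(3), np.array([[-cx], [0.0], [0.0]])])
    return K @ Rt


X_true = np.array([2.0, 1.0, 50.0])
cams = [camera_at(cx) for cx in (0.0, 1.0, 2.0)]
obs = [project(P, X_true) for P in cams]
X_est = forward_intersect(cams, obs)
```

With real matches the observations are noisy, and a short baseline relative to the object distance makes the estimated depth sensitive to that noise.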
The remainder of this paper is organised as follows: The next
section describes the established workflow to process the data,
including some links to the applied literature. In section 3 some
experiments are described: after outlining two different
datasets, the obtained results are shown and evaluated. Some
conclusions from those case studies and an outlook on further work
are given in the last section.
2 WORKFLOW AND METHODS
The workflow as currently realized consists of the following steps
(cf. Figure 1):
1. Structure and motion recovery: After feature tracking across
the sequence the camera matrices are computed through
bundle adjustment.