International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia
MULTI-TEMPORAL AND MULTI-SENSOR IMAGE MATCHING BASED ON LOCAL
FREQUENCY INFORMATION
Xiaochun Liu a, *, Qifeng Yu a, Xiaohu Zhang a, Yang Shang a, Xianwei Zhu a, Zhihui Lei a
a Aeronautical and Astronautical Science and Technology, National University of Defense Technology, Changsha,
Hunan, China
liuxiaochun6799231@gmail.com; yuqifeng@vip.sina.com; zhangxiaohu@vip.163.com; jmgc108@vip.163.com;
jmgc108@vip.163.com; jmgc108@vip.163.com
Commission III
KEY WORDS: Image Matching, Local Average Phase, Local Weighted Amplitude, Local Best-Matching Point, Similarity
Measurement, Local Frequency Information
ABSTRACT:
Image matching is often one of the first tasks in many photogrammetry and remote sensing applications. This paper presents an
efficient approach to automated multi-temporal and multi-sensor image matching based on local frequency information. Two new
independent image representations, Local Average Phase (LAP) and Local Weighted Amplitude (LWA), are presented to emphasize
the common scene information while suppressing the non-common illumination and sensor-dependent information. To obtain
the two representations, local frequency information is first extracted with a Log-Gabor wavelet transform, whose response is
similar to that of the human visual system; the outputs of the odd and even symmetric filters are then used to construct the LAP
and LWA. The LAP emphasizes the phase information and the LWA the amplitude information. As these two representations are both
derivative-free and threshold-free, they are robust to noise and can keep as much of the image detail as possible. A new
Compositional Similarity Measure (CSM) is also presented, which combines the LAP and LWA with equal weight to measure the
similarity of multi-temporal and multi-sensor images. The CSM lets the LAP and LWA compensate for each other and makes full use
of both the amplitude and the phase of the local frequency information. In many image matching applications, the template is
selected without consideration of its matching robustness and accuracy. To overcome this problem, a local best-matching point
detection method is presented to find the best matching template. The detection method employs self-similarity analysis to
identify the template with the highest matching robustness and accuracy. Experimental results on real and simulated images
demonstrate that the presented approach is effective for matching image pairs with significant scene and illumination changes
and that it has advantages over other state-of-the-art approaches, including Local Frequency Response Vectors (LFRV), Phase
Congruence (PC), and the Four Directional-Derivative-Energy Image (FDDEI), especially when the signal-to-noise ratio (SNR) is
low. As few assumptions are made, the proposed method can foreseeably be used in a wide variety of image-matching
applications.
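The building blocks named in the abstract (even and odd symmetric Log-Gabor filter outputs, and the local amplitude and phase derived from them) can be illustrated with a minimal sketch. It is not the authors' LAP, LWA or CSM, whose exact definitions are not given in this part of the paper. The function names, the single scale and orientation, and the parameter defaults (`wavelength`, `sigma_on_f`, `sigma_theta`) are illustrative assumptions based on common Log-Gabor practice.

```python
import numpy as np


def log_gabor_response(image, wavelength=8.0, sigma_on_f=0.55,
                       orientation=0.0, sigma_theta=np.pi / 6):
    """Return (even, odd) outputs of one 2-D log-Gabor filter.

    All parameter names and defaults are assumptions for illustration.
    """
    rows, cols = image.shape
    # Normalised frequency coordinates in the unshifted FFT layout.
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0  # avoid log(0); the DC term is zeroed below
    theta = np.arctan2(fy, fx)

    # Radial part: a Gaussian on a logarithmic frequency axis.
    f0 = 1.0 / wavelength
    radial = np.exp(-np.log(radius / f0) ** 2
                    / (2 * np.log(sigma_on_f) ** 2))
    radial[0, 0] = 0.0  # a log-Gabor filter has no DC component

    # Angular part: a Gaussian around one orientation only, so the
    # spatial filter is complex (real part even, imaginary part odd).
    dtheta = np.abs(np.arctan2(np.sin(theta - orientation),
                               np.cos(theta - orientation)))
    angular = np.exp(-dtheta ** 2 / (2 * sigma_theta ** 2))

    response = np.fft.ifft2(np.fft.fft2(image) * radial * angular)
    return response.real, response.imag


def local_amplitude_phase(even, odd):
    """Local amplitude and phase from a quadrature pair of outputs."""
    amplitude = np.hypot(even, odd)
    phase = np.arctan2(odd, even)
    return amplitude, phase


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    # Simulate a gain and offset change between two sensors.
    changed = 3.0 * img + 20.0
    amp_a, ph_a = local_amplitude_phase(*log_gabor_response(img))
    amp_b, ph_b = local_amplitude_phase(*log_gabor_response(changed))
    print("max phase difference:",
          np.max(np.abs(np.exp(1j * ph_a) - np.exp(1j * ph_b))))
    print("median amplitude ratio:", np.median(amp_b / amp_a))
```

In this sketch the local phase is unchanged by a positive gain and a constant offset, while the local amplitude scales with the gain. That is one reason phase-based representations are attractive when image contrast differs between sensors.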
1. INTRODUCTION
Multi-temporal and multi-sensor image matching is an
inevitable problem arising in a variety of applications, such as
multisource data fusion, change analysis, image mosaic, vision
navigation, and object recognition. Because the reference image
and the searching image differ in relation to time or the type of
sensor, the relationship between the intensity values of the
corresponding pixels is usually complex and unknown. For
instance, the contrasts of the images may differ, or the scenes
may change dramatically over time. In other words, the two
images are not globally correlated. Therefore, multi-temporal
and multi-sensor image matching presents a challenging
problem. Note that we assume that the matching image pairs
have already been registered, hence geometric distortion is not
discussed in this paper.
The current automatic matching techniques generally fall into
two categories: feature-based methods and area-based methods.
Feature-based methods, which are by far the most popular,
utilize extracted features, with the most widely used features
including regions, lines or curves, and points [1-3]. If features
can be extracted robustly and the feature correspondences are
reliably established, then the feature-based methods can be
successfully applied [4, 5]. However, for multi-temporal and
multi-sensor images, it is very difficult to extract common
features that exist in both images because of severe contrast
changes, different sensors and scene changes. In addition,
because the templates surrounding each feature point are not big
enough, the rate of correct feature correspondences is quite low.
In Figure 1, for example, the reference image is captured by an
infrared camera, whereas the searching image is captured by a
visible light camera. We use the most commonly used feature-
based method, SIFT, to detect and then match the feature points.
Figure 1 shows that few common features are
detected and only four pairs of points are correctly matched,
which is far from meeting the requirements of the application.
In contrast with the feature-based methods, area-based methods
usually take advantage of much larger templates, which means
they are able to tolerate more noise and scene changes. The
area-based methods commonly involve image representation
and similarity measurement [6, 7]. Some common similarity
measurements used in the existing matching algorithms are: (i)
* Corresponding author.