In: Paparoditis N., Pierrot-Deseilligny M., Mallet C., Tournaire O. (Eds), IAPRS, Vol. XXXVIII, Part 3A, Saint-Mandé, France, September 1-3, 2010
At the highest level of the pyramid the two images are small
enough to be compared over their whole area. If this step is
successful, we only need to refine the matching results on
the lower pyramid levels. Thus, we concentrate on
high-level image matching in our work.
Image matching methods can be classified into two separate
groups: area-based and feature-based. The area-based methods
(Kaneko et al., 2003; Zheng et al., 1993; Grim, 1985) normally
apply a floating window to compare source and target images
using the correlation technique. The most popular similarity
measures are the sum of squared differences and the normalized
cross-correlation. According to (Zitova et al., 2003), the
straightforward correlation-based approaches are practically
applicable only in the case of a shift and a small rotation between
images, because determining an arbitrary rotation angle is
computationally very expensive. A more
advanced group of methods employing the Fourier transform,
following the idea of (Reddy et al., 1996), can handle arbitrary
rotation, but these methods deal mainly with the case of
different image scales, where one of the images may be
considered a template, so that the overlap area is quite large.
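For concreteness, the two similarity measures mentioned above can be sketched for a pair of equally sized patches as follows (a minimal illustration; the function names are ours, not taken from the cited works):

```python
import numpy as np

def normalized_cross_correlation(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equally sized patches; in [-1, 1]."""
    a = patch_a.astype(float) - patch_a.mean()
    b = patch_b.astype(float) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:
        return 0.0  # a flat patch carries no correlation signal
    return float((a * b).sum() / denom)

def sum_of_squared_differences(patch_a, patch_b):
    """SSD dissimilarity; 0 for identical patches, grows with the mismatch."""
    diff = patch_a.astype(float) - patch_b.astype(float)
    return float((diff * diff).sum())
```

In an area-based matcher one of these scores is evaluated for every candidate placement of the floating window, and the placement with the best score is taken as the match.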
Feature-based methods (Lowe, 2004; Matas et al., 2002; Flusser
et al., 1994) exploit different kinds of features present in the
image, such as segments, lines, curves and interest points.
Specially constructed descriptors are computed for each feature
so that features can be compared, producing a set of feature pairs
(putative matches). We refer to (Tuytelaars et al., 2008; Li et al.,
2008; Remondino et al., 2006) as exhaustive surveys on feature
detectors and descriptors. Finally, an affine or an epipolar
model is fitted using robust model estimators. Feature-based
methods are insensitive to the rotation between the images, but
their crucial weak point is robust model fitting. One of the most
popular robust estimators is the RANSAC algorithm (Fischler et
al., 1981). It can efficiently handle a significant percentage of false
putative matches (outliers), but the probability of finding the
correct model decreases rapidly when the outlier level exceeds
50%. This basic algorithm is constantly being improved, however,
with a new modification of RANSAC published practically
every year. In our work we refer to PROSAC (Chum et al., 2005)
as one of the best modifications proposed to date for matching
purposes (Choi, 2009). The PROSAC
algorithm can exploit information about descriptor similarity,
which increases the probability of finding a correct model even
if the share of outliers is much higher than 50%. However, as
we show in Section 5, even PROSAC fails to provide enough
robustness in case of very low image overlap.
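To make the estimation step concrete, a plain-RANSAC estimator for the 2-D shift-rotation model could look like the sketch below. This is the vanilla algorithm, without PROSAC's ordering by descriptor similarity; the function names, parameter defaults and the minimal-sample construction are our own assumptions for illustration.

```python
import numpy as np

def fit_shift_rotation(p1, p2, q1, q2):
    """Minimal-sample fit of q = R p + t from two correspondences (p1, q1), (p2, q2)."""
    ang = lambda v: np.arctan2(v[1], v[0])
    theta = ang(q2 - q1) - ang(p2 - p1)  # rotation from the inter-point directions
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R, q1 - R @ p1  # translation from either correspondence

def ransac_shift_rotation(src, dst, iters=500, thresh=3.0, seed=0):
    """Vanilla RANSAC over putative matches src[i] <-> dst[i] (N x 2 arrays)."""
    rng = np.random.default_rng(seed)
    best_R, best_t, best_in = None, None, np.zeros(len(src), bool)
    for _ in range(iters):
        i, j = rng.choice(len(src), size=2, replace=False)
        if np.allclose(src[i], src[j]):
            continue  # degenerate sample, no direction defined
        R, t = fit_shift_rotation(src[i], src[j], dst[i], dst[j])
        inliers = np.linalg.norm(dst - (src @ R.T + t), axis=1) < thresh
        if inliers.sum() > best_in.sum():
            best_R, best_t, best_in = R, t, inliers
    return best_R, best_t, best_in
```

The weakness discussed above is visible in the sampling loop: the chance that a random pair of matches is outlier-free shrinks quadratically as the outlier share grows, which is why PROSAC's similarity-guided sampling pays off at low overlap.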
Meanwhile, some special methods for matching slightly
overlapping images have already been introduced. In
(Pires et al., 2004) the authors use a special adaptive sliding
window, but this method was tested only on a few samples,
with no example involving a considerable rotation. The paper
(Begg et al., 2004) introduces a brute-force method: in order to
estimate the image overlap, the authors compute image similarity for all
possible image positions on the highest pyramid level. This
method works only with a shift transformation. In our opinion,
the advanced feature-based matching scheme (see Section 5 for
details) outperforms all other existing methods, thus we
compare our algorithm only with this scheme.
In the context of this paper it is important to refer to the method
described in (Xiong et al., 2006) which is based on a voting
procedure. Voting schemes are widely used in computer vision
algorithms. One of the most popular is the Hough
transform (Hough, 1962), which is very robust to noise and can
be used in very challenging parameter estimation settings.
Unlike SAC-based methods, the applicability of the Hough transform
depends on the model complexity. But when the voting procedure
can be carried out efficiently, this approach turns out to be
quite powerful: it is fully deterministic, its runtime does not
depend on the inlier percentage, and it localizes the peak in
parameter space more precisely because all points are used, not
only a subsample. The method of (Xiong et al., 2006) does not use
descriptors at all. Image matching is divided into two stages:
rotation and scale estimation. In the first step many corners
are detected in both images and small image patches
surrounding each corner are extracted. For each patch the main
direction is computed using principal component analysis
of the pixel intensities. Then a voting procedure is used to obtain the
rotation angle: the direction differences between each
pair of corners are accumulated into angular bins. The maximum
of the histogram defines the angle between the images. This method
works only with largely overlapping images, but the proposed
voting idea is very promising. We also point to the papers (Seo
et al., 2003; Seedahmed et al., 2003), in which very similar
matching approaches are proposed.
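The angle-voting idea can be sketched as a simple circular histogram. The snippet below is our own minimal reading of the procedure, not the exact implementation of (Xiong et al., 2006); the bin count and the peak-picking rule are assumptions.

```python
import numpy as np

def rotation_by_voting(angles_a, angles_b, n_bins=360):
    """Histogram-voting estimate of the rotation between two images.

    angles_a[i] and angles_b[i] are the dominant orientations (radians)
    of the two features in putative match i. Each match votes its
    orientation difference into an angular bin; the histogram peak
    gives the rotation angle (returned as the peak bin's centre).
    """
    diffs = (np.asarray(angles_b) - np.asarray(angles_a)) % (2 * np.pi)
    bins = (diffs / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, minlength=n_bins)
    peak = hist.argmax()
    return (peak + 0.5) * 2 * np.pi / n_bins
```

The robustness claimed for voting schemes shows up directly here: wrong matches scatter their votes over all bins, while correct matches pile up in one bin, so the peak survives even a large outlier fraction.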
3. IMAGE MATCHING OUTLINE
In this paper we consider the feature-based image matching
scheme that consists of the following consecutive steps:
1) Putative matching
2) Shift-rotation model estimation
3) Overlap detection
4) Rematching in overlap area
5) Final model estimation
Although this scheme seems to be quite obvious, we have not
seen it mentioned in the literature. In this paper we focus on the
second step, which is discussed in detail in the next section.
Putative matching and model estimation are the two basic steps
of the feature-based image matching process. But in the case of low
overlap, when the percentage of true point matches is low, very
few points are usually left after model estimation. Even if those
points are inliers, the estimated model may be too rough to use
on the lower levels of the image pyramid, especially if the desired
model is a complex one (e.g. a fundamental matrix). In this case
one can first estimate the overlapping region by computing a
simple shift-rotation model and then repeat the matching
procedure only within that region. This produces more
matching points, uniformly distributed across the overlap area,
and, as a result, a complex model can be estimated robustly and
precisely. Thus, in our work we concentrate on creating a
robust shift-rotation model estimation algorithm.
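As an illustration of the overlap detection step, once a shift-rotation model (R, t) is available, the features of the first image can be kept for rematching only if they project inside the second image. The helper below is our own sketch, assuming an (x, y) point convention and an image size given as (width, height); it is not the paper's implementation.

```python
import numpy as np

def overlap_mask(points_a, R, t, size_b):
    """Boolean mask over features of image A whose shift-rotation
    projection R p + t lands inside image B.

    points_a : N x 2 array of (x, y) feature positions in image A
    size_b   : (width, height) of image B
    """
    proj = points_a @ R.T + t          # map A's features into B's frame
    w, h = size_b
    inside_x = (proj[:, 0] >= 0) & (proj[:, 0] < w)
    inside_y = (proj[:, 1] >= 0) & (proj[:, 1] < h)
    return inside_x & inside_y
```

Rematching is then restricted to the surviving features (and symmetrically to those of image B that map into image A), which is what yields the denser, uniformly distributed matches for the final model.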
4. A VOTING SCHEME FOR SHIFT-ROTATION
MODEL ESTIMATION
The main contribution of this paper is a new method for shift-
rotation model estimation based on a voting scheme. Unlike the
algorithm in (Xiong et al., 2006), the proposed method takes
putative matches as an input. As we deal with the highest level
of the image pyramid, it is possible to carry out an efficient
putative matching procedure before running model estimation
(see Section 5 for the way we obtain putative matches).
The workflow of our algorithm is divided into two stages. At
the first stage the rotation angle between the images is