3b. Beijing 2008
457
A VIEW-GEOMETRIC APPROACH TO VIEW AND OCCLUSION INVARIANT SHAPE
RECOGNITION AND RETRIEVAL
Alper Yilmaz 3 and Gabor Barsai b *
a The Ohio State University, Dept. ofGeod. Science,Photogrammetric Computer Vision Laboratory
The Ohio State University -yilmaz. 15@osu.edu
b Gotmaps? Inc., 2981 Wicklow Rd., Columbus, OH - gabor@gotmaps.biz
KEYWORDS: image interpretation, 2-D feature extraction, computer vision, feature recognition, machine vision
ABSTRACT:
In this paper, we propose a recognition technique for geographic features which are represented as closed contours. Our algorithm
relies on the planar projective geometry between the contours and exploits the properties of the Fourier Transform. One of the
contributions of the proposed method is that it can recognize features acquired from any viewing direction and even partially
occluded. Another contribution is that this method is independent of the starting point of the contour and the digitizing direction. In
addition, this method does not require conjugate, or matching points that are traditionally required in projective geometry. The
experimental results on an in-house developed geographic database and the Brown University shape database show robust
recognition performance.
1 INTRODUCTION
The measurement of shape similarity between two objects is an
essential task in many areas, including object recognition,
classification, mobile mapping, surveillance, trajectory
calculation, event analysis and retrieval (Zhang and Lu, 2001)
(Schenk, 2001) (Noor et al., 2006); and it has received a lot of
attention since the earliest pictures were taken. For example, by
World War I, the opposing sides routinely took aerial photos of
each others’ positions and identified landmarks for intelligence
gathering. The abundance and easy access today to digital
images makes matching especially relevant among the listed
tasks. Ideally, recognition of objects should be projection, scale,
translation and rotation invariant, just as they are in human
vision. This, however, is a very complex problem, since
numerous times an object is occluded and many objects rarely
appear the same twice, due to different camera/observer
positions, variable lighting or object motion. According to
Meyer (Meyer, 1993), the ultimate goal in this regard is to
investigate automatic object recognition in unconstrained
environments by means of outlines of the objects, which we will
refer to as the contours. In this paper, we study the problem of
matching 2-D contours. One of the reasons for the popularity of
contour-based analysis techniques is that edge detection
constitutes an important aspect of shape recognition by the
human visual system (van Otterloo, 1991) (Schenk, 2001). Rui
(Rui et al., 1998), Zahn and Roskies (Zahn and Roskies, 1972),
Zhang and Fiume (Zhang and Fiume, 2002), Wallace and Wintz
(Wallace and Wintz, 1980) use Fourier Descriptors to match
contours. Other methods used to recognize shapes are moment
based and structure based approaches. The advantages of
moments (easy to calculate) are outweighed by their
disadvantages (not intuitive) (Teague, 1980) (Zhang and Lu,
2001) . In particular, it is difficult to correlate high-order
moments with one of these shape features (DeValois and
DeValois, 1980). The representation of curves/contours using
FDs gives a continuous function. Using FDs, a better
reconstruction of the curve/contour can be created than by just
using moments. Using only moments, reconstruction of the
curve is difficult, if not impossible. Belonged (Belongie et al.,
2002) develop a measure called shape context for comparison.
Shapiro (Shapiro, 1979) writes about the structure of shape in
and early work, about how shapes can be defined, and compares
different structure methods. Using structural information is,
however, not efficient when compared to contours, and
structural methods, especially those using graph-like
representations, usually lead to variants of the computation
intensive graph isomorphism algorithm (Shapiro, 1979) (van
Otterloo, 1991) (Zhang and Lu, 2001). For an extended
introduction of these techniques, please refer to any of the
several survey papers by DeValois and DeValois (DeValois and
DeValois, 1980), van Otterloo (van Otterloo, 1991), Loncaric
(Loncaric, 1998) and Veltkamp (Veltkamp, 2001). Many of
these matching methods rely on simple transformations, such as
translation, rotation and scaling. Recognition under more
general transformations, such as affine and projective transform,
however, has not been fully examined, due to the complex
nature of these transforms. A projective transform is also known
as a homography. In a homography, a ratio of ratios or cross
ratio of lengths on a line is the only projective invariant. The
main motivation behind this work is that 2-D homography may
overcome the problem of noise sensitivity and boundary
variations. The Fourier transform, or the Fourier descriptors
(FDs), of the contour are used to represent the curve
parametrically. We propose to use the homography transform
along with the FDs to match contours, applying the digitized
coordinates of the contours. We use FDs, since ideally, shape
representation should be invariant to scale, rotation translation
and starting point, robust to noise, errors, efficient in computing
the representative terms and efficient for use in matching (Rui et
al., 1998), and many of these task are handled effectively by the
Fourier transform. An important contribution of this paper is the
elimination of the requirement of corresponding points. The
paper by Belongie (Belongie et al., 2002) is probably closest in
spirit of this paper, although with a different approach. For this
study, several images of countries, lakes and other features were
digitized from maps, satellite images, silhouettes to obtain the
contours.
The paper is organized as follows: Section 2 describes
projective geometry for a background on homography. Section
3 outlines the methodology used: converting the x and y
coordinates into periodic functions for use by the Fourier
transform