sensor mounted on a mobile mapping system could be treated as
an active observer. Accordingly, we may state that mobile
mapping systems provide an optimal experimental platform for
research on active vision. On the other hand, the newly
developed theories and methodologies of active vision offer us
an invaluable tool to tackle the challenge of automation in
mobile mapping systems.
3.5 Animate Vision
It has been argued that the active observer defined above is not truly
active, but only a moving observer. On this view, vision is not perception
alone but a perception-action cycle. This leads to another vision
framework known as animate vision (Bajcsy, 1988; Ballard,
1991). "We do not see, we look" represents the philosophy of
this school. Under this framework, the process of vision is not
considered alone, but as part of a global mechanism of an
intelligent system, including cognition and motor processing.
Current research aims at developing active vision systems
with advanced visual abilities, such as control of ocular parameters
(e.g., aperture and focus), spatially-varying sensing, and gaze
control (Abbott and Ahuja, 1990; Burt, 1988). The control
of the viewing parameters gives a stable and robust means for
visual perception. The control of ocular parameters allows the
system to maintain a suitable image quality against the
degradations that often occur during the acquisition process.
The control of gaze is commonly used in binocular camera
heads (Ballard, 1991). This mechanism, called vergence,
consists of bringing and maintaining the two camera axes at a
specified spatial target position, the fixation point. This
simplifies the correspondence problem.
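
The vergence geometry can be illustrated with a minimal sketch. It assumes an idealized planar head rather than any particular camera head from the cited work: two cameras sit on the x-axis, separated by a hypothetical `baseline`, both look along +z at rest, and each pans about its vertical axis. The function name `vergence_angles` and the coordinate convention are illustrative assumptions.

```python
import math

def vergence_angles(baseline, fixation_x, fixation_z):
    """Pan angles (radians) that point both optical axes at the fixation point.

    Assumed setup: left camera at (-baseline/2, 0), right camera at
    (+baseline/2, 0), both looking along +z at zero pan. A positive pan
    rotates an optical axis towards +x.
    """
    if fixation_z <= 0:
        raise ValueError("fixation point must lie in front of the cameras")
    left_pan = math.atan2(fixation_x + baseline / 2.0, fixation_z)
    right_pan = math.atan2(fixation_x - baseline / 2.0, fixation_z)
    # The vergence angle is the angle between the two optical axes.
    return left_pan, right_pan, left_pan - right_pan
```

When both axes are held on the fixation point, the fixated target projects near the centre of both images, so the search for corresponding points around it can be restricted to small disparities.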
Animate vision also greatly simplifies the computations involved
in 2-D correspondence and 3-D reconstruction.
We believe that animate vision theory and methodology
will make a profound impact on the design and development of
a new generation of mobile mapping systems, intelligent data
acquisition and processing systems.
4. AUTOMATED PROCESSING OF MOBILE MAPPING
IMAGE SEQUENCES
The combination of computational motion vision and digital
photogrammetry technologies is the principal methodology used
throughout the research. Considerable effort has been devoted to
deriving and exploiting constraints from the mobile
mapping system for the design and implementation of reliable
information extraction methods. The way to
resolve an ill-posed vision problem is to exploit every available
source of constraints. This is key to the successful
automation of image processing. The following constraints are
derived and extensively applied to the methods proposed and
developed:
• Stereoscopic and sequential imaging geometry
constraint: multinocular vision methods and stereo-motion
image analysis techniques can be applied by using this
constraint.
• Image geo-referencing constraint: rigorous epipolar line
information and a direct image-to-scene transformation are
available, since all the images have been geo-referenced in
a global coordinate system.
• Known vehicle ego-motion constraint: the viewer’s
motion trajectory is determined by using GPS/INS
navigation technologies. This information can be used to
develop and optimize a road-network based information
collection approach.
Using the above constraints, methods for information extraction
and image bridging from image sequences are developed.
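
As an illustration of how the geo-referencing constraint yields epipolar information, the sketch below projects two points along the viewing ray of a measured image point into a second image. The segment between the two projections is the part of the epipolar line that needs to be searched. It assumes an ideal pinhole camera with no lens distortion, image coordinates relative to the principal point, and a rotation matrix `R` that maps world to camera coordinates. The function names, the `(center, R, f)` camera tuple, and the `near`/`far` depth bounds are hypothetical and not taken from the system described here.

```python
def mat_vec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(M):
    """Transpose of a 3x3 matrix."""
    return [[M[j][i] for j in range(3)] for i in range(3)]

def project(point, center, R, f):
    """Pinhole projection of a world point into an image (assumed model)."""
    pc = mat_vec(R, [point[i] - center[i] for i in range(3)])
    return (f * pc[0] / pc[2], f * pc[1] / pc[2])

def epipolar_segment(xy, cam1, cam2, near, far):
    """End points, in image 2, of the epipolar segment for point xy in image 1.

    cam1 and cam2 are (center, R, f) tuples; near and far bound the
    object depth along the optical axis of camera 1.
    """
    (c1, R1, f1), (c2, R2, f2) = cam1, cam2
    # Viewing ray of the image point, rotated into world coordinates.
    ray = mat_vec(transpose(R1), [xy[0], xy[1], f1])
    ends = []
    for depth in (near, far):
        world = [c1[i] + depth * ray[i] / f1 for i in range(3)]
        ends.append(project(world, c2, R2, f2))
    return ends
```

Bounding the depth range shortens the segment, which reduces both the search cost and the chance of a false match.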
4.1 Information Extraction
Since a huge volume of image data has been collected by
mobile mapping systems, rapid and accurate extraction of
features of interest from image data is highly desirable during
post-mission processing. The low efficiency of manual feature
extraction is not compatible with the rapid data acquisition of
mobile mapping systems. It is also one of the major
impediments to the development of on-line or real-time
mapping systems. For these reasons, the emphasis of this
research is placed on the development of methods for
automatic and accurate object measurement and feature
extraction using mobile mapping image sequences.
4.1.1 Object Measurement
a. Overview
The main task of mobile mapping systems is to map objects
from images into a spatial coordinate system. The objects could
be footprints of houses, street edges, centerlines, curbs, lane
markers, manholes, culverts, fire hydrants, traffic signs,
telephone booths, electric poles, etc. To calculate the 3-D
coordinates of an object from images, at least two conjugate
points in the images need to be determined. He and Novak
(1993) applied an image matching technique to automate such
an object measurement procedure. Since the orientation
parameters of the cameras are known, the corresponding
epipolar lines in a stereo image pair can be computed. Once a
point in the left image is measured manually, the corresponding
point in the right image can be determined by using the
epipolar-line based image matching method. The area-based
cross-correlation criterion was used in this method.
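
A minimal sketch of epipolar-line based matching with an area-based cross-correlation criterion follows. It is not the implementation of He and Novak (1993), whose details are not given here. For brevity it assumes a rectified stereo pair, so that the epipolar line of a left-image point is the same row in the right image. Images are plain nested lists of grey values, and the function names and the `half` window parameter are illustrative assumptions.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length grey-value lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A textureless window has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0

def window(img, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) window centred at (row, col)."""
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def match_along_epipolar_row(left, right, row, col, half=1):
    """Best NCC match in `right` for the left-image point (row, col).

    Assumes rectified images (search along the same row) and that the
    caller keeps the window inside both images.
    """
    template = window(left, row, col, half)
    width = len(right[0])
    best_col, best_score = None, -2.0
    for c in range(half, width - half):
        score = ncc(template, window(right, row, c, half))
        if score > best_score:
            best_col, best_score = c, score
    return best_col, best_score
```

For an unrectified pair, the same scoring could be applied at sample positions along a general epipolar line computed from the known orientation parameters.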
In order to improve the reliability of image matching, edge
features were used to constrain the matching results (Xin,
1996). If a point measured in the left image is on an image edge,
the corresponding point in the right image should be on an
image edge too. However, this constraint is sensitive to the
results of edge detection. This is a typical “chicken and egg”
problem, one often encountered in computer vision research.
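
The edge constraint can be read as a filter on match candidates, as in the sketch below. The edge detector used by Xin (1996) is not specified in this text, so the sketch substitutes a simple central-difference gradient along one image row with a hypothetical `threshold`. Both function names are illustrative.

```python
def edge_columns(row, threshold):
    """Columns whose central-difference gradient magnitude exceeds threshold."""
    return [c for c in range(1, len(row) - 1)
            if abs(row[c + 1] - row[c - 1]) / 2.0 > threshold]

def constrain_candidates(candidates, row, threshold):
    """Keep only those match candidates that lie on a detected edge."""
    edges = set(edge_columns(row, threshold))
    return [c for c in candidates if c in edges]
```

For the row `[0, 0, 0, 10, 10, 10, 0, 0]`, a threshold of 4 keeps columns 2, 3, 5 and 6 as edge positions, while a threshold of 6 keeps none. This dependence on the threshold is the sensitivity to edge detection noted above.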
Further improvement of the 3-D coordinate accuracy of object
measurement was investigated by Li et al. (1996). Their work
focused on the use of multiple images to perform
photogrammetric triangulation. Their results demonstrated that
the final 3-D coordinate accuracy can be greatly improved if
multiple corresponding points in image sequences can be used.
This requires that multinocular point correspondences be
established first.
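
To indicate how redundant observations enter the computation, the sketch below intersects several viewing rays in a least-squares sense. It is a simple linear ray intersection, not the photogrammetric triangulation of Li et al. (1996). It assumes each corresponding image point has already been converted to a ray in the global coordinate system, given by a camera centre and a direction. The point returned minimizes the sum of squared perpendicular distances to all rays. The function names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, b):
    """Solve the 3x3 system A x = b by Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no unique intersection")
    solution = []
    for k in range(3):
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)]
              for i in range(3)]
        solution.append(det3(Ak) / d)
    return solution

def triangulate_rays(centers, directions):
    """Least-squares intersection point of two or more rays.

    Accumulates the normal equations sum(P_i) X = sum(P_i C_i), where
    P_i = I - u_i u_i^T projects onto the plane normal to ray i.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        norm = math.sqrt(sum(v * v for v in d))
        u = [v / norm for v in d]
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - u[i] * u[j]
                A[i][j] += p
                b[i] += p * c[j]
    return solve3(A, b)
```

With two rays this reduces to the midpoint of their common perpendicular. Each additional ray adds redundancy, so that random measurement errors tend to average out.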
In fact, multiple images covering the same object are available
in the VISAT mobile mapping system; that is, multiple
corresponding points can be identified in the image sequences.
The use of such redundant image information would be valuable
not only for enhancing the reliability of image matching, but
also for increasing the 3-D coordinate accuracy of