' "• : ' '
photogrammetric applications, such as quality control, camera
calibration, sensor navigation and object reconstruction.
Reviews of algorithms for the estimation of motion/structure parameters from image sequences have been provided by Aggarwal and Nandhakumar (1988) and Huang and Netravali (1994).
3.2 Short Range vs. Long Range (Continuous vs. Discrete)
Motion Analysis
Generally, there are two complementary classifications of schemes to compute visual motion. The first classifies methods according to the spatio-temporal range over which they are applicable, analogous to the human visual system: (1) the short range (continuous) motion process and the long range (discrete) motion process. The second classification distinguishes between the fundamentally different processes involved: (2) optical flow and correspondence. In fact, the optical flow scheme, which uses image gradients to derive image motion, is intrinsically restricted to short range, while correspondence or similarity matching schemes can be of either short or long range.
In short range motion analysis, images are taken at video rate. Thus the emphasis is generally placed on the estimation of the optical flow field between two successive frames, or on the direct use of the spatio-temporal derivatives of the image brightness. These observations must then be combined with a measure of the camera velocity (instead of the camera displacement) to determine the 3-D structure of objects.
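As an illustration of the short range (continuous) scheme, the following sketch estimates a dense optical flow field between two successive video frames. It uses the Farneback algorithm from OpenCV purely as one example of a gradient-based method; the frame file names are placeholders, and the sketch is not the specific method discussed here.

    # Minimal sketch: dense optical flow between two successive video frames.
    # The frame file names are placeholders.
    import cv2
    import numpy as np

    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    # flow[y, x] = (dx, dy): the apparent image motion of each pixel.
    flow = cv2.calcOpticalFlowFarneback(
        prev, curr, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # A small mean flow magnitude confirms the short range regime in which
    # differential (gradient-based) methods are valid.
    magnitude = np.linalg.norm(flow, axis=2)
    print("mean flow magnitude (pixels):", magnitude.mean())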
In long range motion analysis, images are acquired at larger time intervals, so that a large camera displacement is observed between frames. Since the image motion of the features is “large” compared to the temporal sampling rate, the visual system has to solve the correspondence problem, i.e., it has to establish which feature at one time instant corresponds to which feature at the next time instant. Therefore, in long range motion analysis, a set of relatively sparse, distinguishable two-dimensional features, such as points, straight lines, curved lines, corners and regions, is first extracted from the successive images. Second, feature correspondences are established between consecutive frames, and finally the 3-D structure of the object and its relative motion with respect to the camera are determined from the motion of these features. It is worth mentioning that most of the research on long range motion analysis has concentrated on motion estimation and feature correspondence over short image sequences (i.e., two to three images).
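The three steps just described (feature extraction, correspondence and motion/structure recovery) can be sketched for a two-frame sequence as follows. The sketch uses ORB features and essential matrix decomposition from OpenCV as one possible realization; the calibration matrix K and the file names are assumed example values, not part of the discussion above.

    # Minimal sketch of long range (discrete) motion analysis over two frames:
    # (1) extract sparse features, (2) establish correspondences,
    # (3) recover the relative camera motion. K and file names are assumed.
    import cv2
    import numpy as np

    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])   # assumed camera calibration matrix

    img1 = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

    # Step 1: sparse, distinguishable features (here ORB corners).
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Step 2: feature correspondences between the consecutive frames.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Step 3: relative motion (rotation R, translation direction t),
    # estimated robustly from the matched features.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    print("R =\n", R, "\nt direction =", t.ravel())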
In general, if the scene has many easily identifiable feature
points or lines, the discrete approach based on feature
correspondence is suitable. If the surfaces in the scene are
smooth and have no texture, then the continuous approach
based on intensity derivatives is better. However, robust and
accurate computation of feature correspondence and optical
flow still remains a difficult problem. The optical flow field is
often corrupted by image noise or occlusion, leading to
generally poor and unstable results in the 3-D reconstruction.
Feature correspondence also fails easily in areas where the distortion is large or occlusion occurs. Hybrid approaches that combine feature correspondence and optical flow are one way to alleviate these problems (Baker et al., 1994; Hanna and Okamoto, 1993; Navab and Zhang, 1994).
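One simple instance of such a hybrid, sketched below under assumed file names, tracks sparse feature points with a pyramidal Lucas-Kanade optical flow: the features contribute the selectivity of correspondence methods, while the flow contributes sub-pixel motion estimates and a status flag that exposes occluded or lost points. It illustrates the general idea only, not the specific schemes of the cited works.

    # Minimal sketch of a hybrid approach: sparse features (correspondence
    # side) tracked by pyramidal Lucas-Kanade optical flow (flow side).
    import cv2
    import numpy as np

    img1 = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

    # Detect well-textured corners, which both schemes handle reliably.
    pts1 = cv2.goodFeaturesToTrack(img1, maxCorners=500,
                                   qualityLevel=0.01, minDistance=10)

    # Track each feature with optical flow; 'status' flags lost or occluded
    # points so that unstable correspondences can simply be discarded.
    pts2, status, err = cv2.calcOpticalFlowPyrLK(img1, img2, pts1, None)
    good1 = pts1[status.ravel() == 1]
    good2 = pts2[status.ravel() == 1]
    print("tracked", len(good2), "of", len(pts1), "features")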
The research showed that an optical flow based approach is not suitable for the VISAT images, since the image capture interval is about 0.4 seconds and the camera movement between imaging intervals is large, of the order of 6-10 meters.
Intuitively, our research falls in the category of long range
motion analysis. However, compared to the processing of
monocular image sequences commonly addressed in the
literature, we are dealing with binocular image sequences. Such
redundant image information allows us to develop more robust
algorithms for the processing of image sequences. In this
research, feature correspondence and image matching
techniques are mainly used in the proposed methods. There are
a number of good references available with reviews of
techniques for feature correspondence and image matching
(Agouris, 1992; Baltsavias, 1991; Barnard and Fischler, 1982; Dhond and Aggarwal, 1989; Forstner, 1993; Gruen, 1994; Jones, 1997; Lemmens, 1988; Mass, 1996).
3.3 Visual Motion Analysis with Known Ego-Motion
Vision analysis with known ego-motion refers to motion
analysis under known dynamics of the camera (observer). In
fact, known ego-motion analysis forms the basis of an active
vision system. Under the condition of known ego-motion, the 3-
D reconstruction problem can be solved more efficiently. This
fact has motivated some investigations (Aloimonos et al., 1988;
Bajcsy, 1988). On the other hand, accurate geometric constraints, such as the epipolar line constraint, also become available, resulting in a more robust determination of feature correspondences.
In the VISAT mobile mapping system, the kinematic trajectory of the vehicle can be determined with a high accuracy of 5-15 cm, and the camera dynamics can be recovered rigorously using the GPS/INS georeferencing technique (Schwarz and El-Sheimy, 1996). As a result, visual analysis can be conducted
under the constraint of known ego-motion. It will be seen that
this constraint is very valuable for automating and optimizing a
reliable procedure for object measurement and feature
extraction.
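To illustrate how known ego-motion constrains correspondence, the sketch below forms the essential and fundamental matrices from a known relative pose (R, t), such as GPS/INS georeferencing would deliver, and derives the epipolar line on which the conjugate point must lie. R, t and K are assumed example values, not VISAT parameters.

    # Minimal sketch: with the ego-motion (R, t) known, the epipolar line
    # constraint restricts the search for a conjugate point to a single line.
    # R, t and K are assumed example values.
    import numpy as np

    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])     # camera calibration matrix (assumed)
    R = np.eye(3)                        # known relative rotation (assumed)
    t = np.array([1.0, 0.0, 0.1])        # known relative translation (assumed)

    def skew(v):
        # Cross-product matrix [v]_x, so that skew(v) @ u == np.cross(v, u).
        return np.array([[0.0, -v[2], v[1]],
                         [v[2], 0.0, -v[0]],
                         [-v[1], v[0], 0.0]])

    E = skew(t) @ R                                 # essential matrix [t]_x R
    F = np.linalg.inv(K).T @ E @ np.linalg.inv(K)   # fundamental matrix

    x1 = np.array([400.0, 250.0, 1.0])   # a feature in image 1 (homogeneous)
    a, b, c = F @ x1                     # epipolar line a*x + b*y + c = 0
    print("conjugate point lies on: %.4f x + %.4f y + %.4f = 0" % (a, b, c))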
3.4 Active Vision
A very important advance in the theoretical framework of
computer vision is the concept of active vision, proposed by
Aloimonos et al. (1988). Active vision represents the behaviorist school, which stands in direct opposition to Marr’s theory of vision, the recovery school (Marr, 1982).
It is an uncontroversial observation that vision is an underconstrained problem. Thus the main goal of vision work is
to find and develop constraints. However, it is argued that, rather than focusing on narrow and often oversimplified sources of constraints, such as the smoothness constraints widely used in the recovery school, one must exploit constraints from all possible sources and incorporate them systematically.
The basic idea of active vision is the introduction of a new source of constraints arising from the internal architecture of the system itself and the interaction of its components, such as observer-based constraints, e.g., the sensor and/or the computer (Jolion, 1994). Under the constraint that the active observer
moves with known motion, a unique solution is available,
resulting in a well-posed formulation of the problem. Moreover,
the knowledge of these viewpoints of the active observer
increases the robustness to noise.
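As a concrete illustration, once the viewpoints are known, recovering a 3-D point from matched image points reduces to a well-posed triangulation. The sketch below uses assumed poses, calibration and image measurements; it illustrates the principle rather than any particular implementation.

    # Minimal sketch: with both viewpoints known, reconstructing an object
    # point reduces to well-posed triangulation. Poses, K and the matched
    # image points are assumed example values.
    import cv2
    import numpy as np

    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])

    # Known camera poses (e.g., from navigation data): P = K [R | t].
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

    # One matched feature in each image (2 x N arrays, here N = 1).
    pts1 = np.array([[400.0], [250.0]])
    pts2 = np.array([[320.0], [250.0]])

    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # homogeneous 4 x N
    X = (X_h[:3] / X_h[3]).ravel()                    # Euclidean 3-D point
    print("reconstructed 3-D point:", X)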
The known motion of the observer can be determined by the use
of advanced navigation technology. In this context, an imaging