rapidly from a small number of images taken by tourists. The
paper concludes with a short discussion in section 7.
2. SYNOPSIS OF 3D RECONSTRUCTION TECHNIQUES
The ultimate goal of all 3D reconstruction methods is to satisfy the eight requirements listed in the previous section. Since this is not an easy task, most methods focus on some of the requirements at the expense of the others. We use this trade-off to distinguish between methods and to compare them. A method may:
1- Focus on accuracy without any automation.
2- Focus on full automation.
3- Try to reach a balance between all requirements.
The most widely used method remains the first, which is the traditional approach. This is a labor-intensive endeavor in which engineering plans or drawings, surveying, and/or standard photogrammetry techniques are employed, and the resulting measurements are then imported into a CAD system to create a 3D model. The results are often unsatisfactory in appearance and look computer-generated. Increasing the level of automation became essential in order to meet the growing demand for 3D models. However, efforts to completely automate the process, from taking images to the output of a 3D model, while promising, have thus far not always been successful. We summarize here the automation of camera pose estimation, self-calibration, and computation of 3D coordinates of pixels. This procedure, which is now widely used in computer vision [e.g. Faugeras et al, 1998; Fitzgibbon et al, 1998; Pollefeys et al, 1999; Liebowitz et al, 1999], starts with a sequence of images taken by an uncalibrated camera. The system automatically extracts interest points, such as corners, sequentially matches them across views, and then computes camera parameters and 3D coordinates of the matched points using robust techniques. The key to the success of this fully automatic procedure is that successive images must not vary significantly, so the images have to be taken at short intervals. The first two images are
usually used to initialize the sequence. It is important that the
points are tracked over a long sequence or in every image where
they appear in order to reduce error propagation. This is all done within a projective geometry framework and is usually followed by a bundle adjustment, also in projective space. Self-calibration to compute the intrinsic camera parameters, usually only the focal length, follows in order to obtain a metric reconstruction, up to scale, from the projective one [Pollefeys et al, 1999]. Again, a bundle adjustment is usually applied to the metric reconstruction to optimize the solution. The next step, the creation of the 3D
model, is more difficult to automate and is usually done interactively to define the topology and to edit or post-process the model. A model based only on the measured points will usually consist of irregular and overlapping surface boundaries that need some assumptions, for example planes and plane intersections, in order to be corrected. For large structures and scenes, since the technique may require a large number of images, creating the model requires significant human interaction, even though the image registration and a large number of 3D points were computed fully automatically.
The degree of modeling automation increases when certain assumptions about the object, for example that it is an architectural structure, can be made. Since automated image-based methods rely on features that can be extracted from the scene, occlusions and untextured surfaces are problematic. We often end up with areas that have too many features, not all of which are needed for modeling, and areas with few or no features, from which a complete model cannot be produced.
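To make the sequential procedure concrete, the sketch below shows one plausible implementation of its core steps for a single image pair: corner-like interest point extraction, matching, robust estimation of the relative camera geometry, and triangulation of the matched points. It uses OpenCV and, for brevity, assumes approximate camera intrinsics K are available, whereas the procedure described above works projectively and recovers the calibration later by self-calibration; all names and parameter values are illustrative only.

import cv2
import numpy as np

def relative_reconstruction(img1, img2, K):
    # 1. Extract interest points (corner-like features) in both images.
    detector = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = detector.detectAndCompute(img1, None)
    kp2, des2 = detector.detectAndCompute(img2, None)

    # 2. Match the features between the two closely spaced views.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # 3. Robustly estimate the epipolar geometry; RANSAC rejects mismatches.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

    # 4. Triangulate the surviving matches to get 3D coordinates (up to scale).
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    good = mask.ravel() > 0
    X_h = cv2.triangulatePoints(P1, P2, pts1[good].T, pts2[good].T)
    return (X_h[:3] / X_h[3]).T, R, t

In a full sequence, each new image would be matched against the previous one, its pose estimated from the already reconstructed points, and the growing solution periodically refined by bundle adjustment.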
The most impressive results remain those achieved with interactive approaches. Rather than aiming for full automation, a hybrid, easy-to-use system named Façade has been developed [Debevec et al, 1996]. The method's main goal is the realistic creation of 3D models of architecture from a small number of photographs. The basic geometric shape of a structure is first recovered using models of polyhedral elements. In this interactive step, the actual sizes of the elements and the camera poses are recovered, assuming that the camera intrinsic parameters are known. The second step is an automated matching procedure, constrained by the now-known basic model, which adds geometric details. The approach proved to be effective in creating geometrically accurate and realistic models. The drawbacks are the high level of interaction and the restriction to certain shapes. Also, since the assumed shapes determine all 3D points and camera poses, the results are only as accurate as the assumption that the structure elements match those shapes. Our method, although similar in philosophy, replaces the basic shapes with a small number of seed points in multiple images to achieve more flexibility and more levels of detail. In addition, the camera poses and 3D coordinates are determined without any assumption about the shapes, but instead by a full bundle adjustment, with or without self-calibration depending on the given configuration. This achieves higher geometric accuracy, independent of the shape of the object.
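To illustrate the bundle adjustment at the core of this step, the sketch below sets up the joint minimization of reprojection error over all camera poses and 3D point coordinates using SciPy. It is a generic sketch rather than our actual implementation; the simple pinhole model with a single known focal length and no distortion, as well as all names, are simplifying assumptions, and self-calibration would add the intrinsic parameters to the unknowns.

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(points3d, rvec, tvec, f, cx, cy):
    # Project 3D points into one image with a simple pinhole model.
    cam = Rotation.from_rotvec(rvec).apply(points3d) + tvec
    return np.column_stack([f * cam[:, 0] / cam[:, 2] + cx,
                            f * cam[:, 1] / cam[:, 2] + cy])

def residuals(params, n_cams, n_pts, cam_idx, pt_idx, observed_xy, f, cx, cy):
    # Stacked reprojection residuals over every image observation.
    poses = params[:n_cams * 6].reshape(n_cams, 6)     # rotation vector | translation
    points3d = params[n_cams * 6:].reshape(n_pts, 3)
    proj = np.empty_like(observed_xy)
    for c in range(n_cams):
        sel = cam_idx == c
        proj[sel] = project(points3d[pt_idx[sel]],
                            poses[c, :3], poses[c, 3:], f, cx, cy)
    return (proj - observed_xy).ravel()

def bundle_adjust(initial_params, n_cams, n_pts, cam_idx, pt_idx,
                  observed_xy, f, cx, cy):
    # A robust loss limits the influence of remaining mismatches.
    result = least_squares(residuals, initial_params, loss='huber', f_scale=1.0,
                           args=(n_cams, n_pts, cam_idx, pt_idx,
                                 observed_xy, f, cx, cy))
    return result.x

No assumption about the underlying surface shape enters this solution; only the measured image coordinates of the seed points do.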
The Façade approach has inspired several research efforts to automate it. Werner and Zisserman, 2002, proposed a fully automated Façade-like approach. Instead of the basic shapes, the principal planes of the scene are extracted automatically to assemble a coarse model. These planes follow three dominant directions that are assumed to be mutually perpendicular. Like Façade, the coarse model guides a more refined polyhedral model of details such as windows, doors, and wedge blocks. Since this is a fully automated approach, it requires feature detection and closely spaced images for the automatic matching and camera pose estimation using projective geometry. Dick et al, 2001,
proposed another automated Façade-like approach. It employs a model-based recognition technique to extract high-level models from a single image and then uses their projections into other images for verification. The method requires parameterized building blocks with an a priori distribution defined by the building style. The scene is modeled as a set of base planes corresponding to walls or roofs, each of which may contain offset 3D shapes that model common architectural elements such as windows and columns. Again, full automation necessitates feature detection and a projective geometry approach; however, the technique used here also employs planar constraints and perpendicularity between planes to improve the matching process. Another approach [Tao et al, 2001] to improve the automatic matching and scene segmentation for modeling, after image registration, applies depth smoothness constraints on surfaces combined with color similarity constraints.
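As an illustration of the plane-based reasoning these automated approaches share, the following generic RANSAC plane fit shows how a dominant scene plane (a wall or roof) can be extracted from a sparse 3D point cloud. It is a stand-in for, not a reproduction of, the plane extraction in the cited systems; the threshold and iteration count are arbitrary.

import numpy as np

def fit_dominant_plane(points, n_iters=1000, threshold=0.02, seed=None):
    # Return (normal, d) of the plane n.x + d = 0 supported by the most points.
    rng = np.random.default_rng(seed)
    best_plane, best_inliers = None, None
    for _ in range(n_iters):
        # Three random points define a candidate plane.
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                     # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ sample[0]
        # Points within `threshold` of the plane count as inliers.
        inliers = np.abs(points @ normal + d) < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_plane, best_inliers = (normal, d), inliers
    return best_plane, best_inliers

Removing the inliers and repeating the search yields further planes, whose normals can then be checked or constrained for perpendicularity, in the spirit of the approaches above.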
The presence of noise, which results from extracting features from images, makes the choice of camera positions, or more precisely the camera motion relative to the object distance, critical for a correct reconstruction. This has been studied widely in photogrammetry [e.g. Fraser, 1994]. It has lately been recognized in computer vision that photogrammetric bundle adjustment provides the optimum solution for image-based modeling [Triggs et al, 2000]. This has resulted in the inclusion of a bundle adjustment following the sequential techniques mentioned above.
Critical analyses of automated techniques that use projective geometry have been undertaken [Oliensis, 2000; Bougnoux, 1998]. Configurations that lead to an ambiguous projective reconstruction have also been identified.
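The effect of camera motion versus object distance noted above can be quantified with the standard normal-case stereo relation, in which depth precision degrades roughly as sigma_Z = Z^2 / (B * f) * sigma_px for object distance Z, baseline B, focal length f in pixels, and image measurement noise sigma_px. The short numeric example below uses purely hypothetical values.

def depth_precision(Z, B, f_px, sigma_px=0.5):
    # Approximate 1-sigma depth error for object distance Z and baseline B.
    return Z**2 / (B * f_px) * sigma_px

f_px = 3000.0                 # focal length in pixels (hypothetical)
Z = 20.0                      # object distance in metres (hypothetical)
for B in (0.5, 2.0, 5.0):     # short to wide baselines, in metres
    print(f"B = {B:4.1f} m  ->  sigma_Z ~ {depth_precision(Z, B, f_px) * 1000:.0f} mm")

A short baseline, i.e. a small motion between closely spaced images, gives markedly noisier depth, which is why well-configured camera stations and a rigorous bundle adjustment matter.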