Photogrammetric computer vision: Papers accepted on the basis of peer-review full manuscripts

Kalliany, R.; Leberl, Franz W.
ISPRS Commission III, Vol. 34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
  
  
Figure 5: Symbolic window extraction. (a) CTF image; (b) results of ORG; 
(c) results of PPF. 
4 DEPTH ESTIMATION 
The goal of depth estimation is to recover the relative depth of mi- 
crostructures on the facade surfaces. Accurate depth recovery from 
images is very difficult in the context of the City Scanning Project, 
mainly due to the small ratio of the depth of microstructures (on the 
order of centimeters) to the camera-to-wall distance (on the order of 
tens of meters). 
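
To give a rough sense of why this ratio is a problem, the sketch below applies the standard rectified two-view relation, disparity = focal length x baseline / depth, to a wall and to a window recessed slightly behind it. The focal length, baseline, wall distance, and recess are illustrative assumptions chosen to match the orders of magnitude mentioned above; they are not values from the City Scanning dataset.

```python
def disparity_px(focal_px, baseline_m, depth_m):
    """Disparity in pixels of a point at depth_m for a rectified stereo pair."""
    return focal_px * baseline_m / depth_m


# Hypothetical values, chosen only for illustration.
FOCAL_PX = 1000.0     # assumed focal length, in pixels
BASELINE_M = 1.0      # assumed distance between the two cameras, in meters
WALL_DEPTH_M = 20.0   # camera-to-wall distance (tens of meters)
RECESS_M = 0.05       # window recess behind the wall (centimeters)

d_wall = disparity_px(FOCAL_PX, BASELINE_M, WALL_DEPTH_M)
d_window = disparity_px(FOCAL_PX, BASELINE_M, WALL_DEPTH_M + RECESS_M)

# The quantity a stereo matcher would have to resolve to see the recess.
disparity_gap = d_wall - d_window
print(f"disparity difference: {disparity_gap:.3f} px")
```

Under these assumed numbers the wall and the window differ by only a fraction of a pixel in disparity, which suggests why depth from two-view matching alone tends to be noisy at this scale.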
Stereo analysis (Faugeras, 1993) is widely used to recover depth
from multiple images. In its pure form, however, it has several
shortcomings when applied to urban site microstructure extraction.
First, the analysis (e.g. epipolar matching) typically takes place on
two images, whereas many more facade images are available in our
application. Second, geometric constraints (for example, that
microstructures are shallow structures on a largely planar surface)
are not easily incorporated into stereo processes. Third, other
knowledge, such as that of occlusions (both modeled and unmodeled),
is not readily applicable in stereo analysis.
Fua and Leclerc developed a method that generates a 3D mesh to rep- 
resent the 3D structures on a surface (Fua and Leclerc, 1994). This 
approach has a number of advantages over the pure form of stereo 
analysis. It is correlation-based and makes use of any number of im- 
ages. It uses a minimization function in which geometric constraints 
can be added. In particular, the minimization process starts from a 
plane, addressing the planar constraint of the facades. 
In this section, surface microstructures are recovered using a hy- 
brid method that combines the 2D information obtained in Section 3 
and the 3D information obtained using a revised version of Fua and 
Leclerc's method. 
  
  
  
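Facade | Actual | Extracted | Correct | False neg. | False pos.
7      | 301    | 303       | 300     | 1          | 3
17     | 18     | 17        | 17      | 1          |
18     | 54     | 54        | 54      |            |
19     | 18     | 11        | 6       | 12         | 5
24     | 192    | 192       | 192     |            |
25     | 82     | 72        | 72      | 10         |
26     | 212    | 211       | 211     | 1          |
27     | 72     | 72        | 72      |            |
44     | 144    | 150       | 144     |            | 6
45     | 53     | 51        | 51      | 2          |
Total  | 1146   | 1133      | 1119    | 27         | 14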
  
  
Table 1: Window extraction results 
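
As a reading aid for Table 1, the short sketch below derives two summary rates from the totals row. The terms recall and precision are not used in the paper; they are the usual names for correct/actual and correct/extracted and are introduced here as an assumption about how one might summarize the table.

```python
# Totals copied from the last row of Table 1.
actual, extracted, correct = 1146, 1133, 1119
false_neg, false_pos = 27, 14

# Consistency checks implied by the column definitions.
assert actual - correct == false_neg
assert extracted - correct == false_pos

recall = correct / actual        # share of actual windows that were found
precision = correct / extracted  # share of extracted windows that are real
print(f"recall = {recall:.3f}, precision = {precision:.3f}")
```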
4.1 3D Mesh Generation 
The facade surface S is represented by a mesh, which is a hexag- 
onally connected set of vertices organized into triangular elements 
called facets. The algorithm starts with a planar surface and deforms 
it by iteratively minimizing an objective function E(S):
$$E(S) = \lambda_P E_P(S) + \lambda_S E_S(S) + \lambda_G E_G(S), \tag{13}$$
$$\lambda_P + \lambda_S + \lambda_G = 1, \tag{14}$$
in which $E_P(S)$ is a planar surface constraint that controls the amount
the surface is allowed to deviate from a plane, $E_S(S)$ is a
correlation-based stereo constraint attempting to minimize the appearance
differences of each facet of the mesh across all the images, and $E_G(S)$
is a geometric constraint. Details for these components and the
minimization scheme can be found elsewhere (Fua and Leclerc, 1994).
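
The weighted combination in Eq. (13), together with the normalization in Eq. (14), can be sketched as follows. This is only a minimal illustration: the function name, the argument names, and the numeric energy values are hypothetical, and the three energy terms themselves are taken as given numbers rather than computed from a mesh.

```python
def total_energy(e_planar, e_stereo, e_geom, lam_p, lam_s, lam_g, tol=1e-9):
    """Weighted sum of the three energy terms, as in Eq. (13).

    The weights must sum to 1, as required by Eq. (14).
    """
    if abs(lam_p + lam_s + lam_g - 1.0) > tol:
        raise ValueError("weights must sum to 1 (Eq. 14)")
    return lam_p * e_planar + lam_s * e_stereo + lam_g * e_geom


# Hypothetical energy values for one candidate surface.
energy = total_energy(e_planar=2.0, e_stereo=1.0, e_geom=5.0,
                      lam_p=0.1, lam_s=0.9, lam_g=0.0)
print(energy)
```

With a geometric weight of zero, the geometric term has no influence on the result, whatever its value.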
In order to take advantage of knowledge obtained in Section 2, we 
modified Fua and Leclerc's algorithm by excluding occlusions from 
stereo computation in $E_S$. We define an occlusion-removed facade
image (or ORF image) by
$$Y_{ORF}[i,j] = Y_{IWF}[i,j] \, M_E[i,j] \, M'_C[i,j],$$
where $M_E$ is the environment mask that represents the modeled
occlusion, and $M'_C$ is a binary version of the correlation mask $M_C$
that represents the unmodeled occlusion. Thus we use $Y_{ORF}$ rather than
$Y_{IWF}$ to calculate $E_S$, using only unoccluded pixels and focusing on
the visible parts of each facade.
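
The masking step that produces the occlusion-removed facade image can be sketched as a pixelwise product. Several details below are assumptions rather than facts from the paper: images and masks are plain nested lists, a mask value of 1 means "visible", and the correlation mask is binarized with a hypothetical threshold of 0.5 (the text does not say how the binary version is obtained).

```python
def occlusion_removed(image, env_mask, corr_mask, threshold=0.5):
    """Multiply each pixel by the environment mask and a binarized correlation mask.

    image     -- 2D list of pixel values
    env_mask  -- 2D list of 0/1 values; 0 marks modeled occlusion (assumed convention)
    corr_mask -- 2D list of correlation scores; low scores mark unmodeled occlusion
    threshold -- hypothetical cut-off used to binarize corr_mask
    """
    result = []
    for img_row, env_row, corr_row in zip(image, env_mask, corr_mask):
        row = []
        for value, m_env, m_corr in zip(img_row, env_row, corr_row):
            m_corr_binary = 1 if m_corr >= threshold else 0
            row.append(value * m_env * m_corr_binary)
        result.append(row)
    return result


image = [[10, 20], [30, 40]]
env_mask = [[1, 1], [0, 1]]
corr_mask = [[0.9, 0.2], [0.8, 0.7]]
orf = occlusion_removed(image, env_mask, corr_mask)
print(orf)
```

A pixel survives only if both masks mark it as visible; a stereo term computed on the result would additionally need to skip the zeroed pixels rather than treat them as dark image content.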
We applied the 3D mesh generation algorithm to all the major facades
in our dataset. In the experiments, we set $\lambda_P = 0.1$,
$\lambda_S = 0.9$, and $\lambda_G = 0$; that is, the geometric constraint
$E_G(S)$ was not implemented. Figure 6 shows the depth estimation on one
of the facades. These results are noisy, but the general pattern of
windows is evident.
4.2 2D/3D Combination
Figure 6 shows that an accurate shape recovery of 3D microstruc- 
tures is nearly impossible from depth estimates alone, because the 
depth results are very noisy. For a higher-quality reconstruction, we 
assume that the facade surface structures can be approximated by 
two depth layers only: the wall layer and the window layer. This 
assumption is reasonable for normal walls. Non-flat portions on a 
wall are normally connections between windows and the wall; these 
detailed structures are beyond the scope of our current discussion. 
With this assumption, we use the 2D rectangles extracted in Section 3 
to represent the shape of the 3D microstructures, and use the average 
depth inside the rectangles to represent the depth of the structures. 
Figure 7 demonstrates an example of 2D/3D combination results. It 
shows that the 3D structures are well represented by combining the 
2D symbolic representation and the 3D depth information. 
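
The two-layer combination described above can be sketched as follows. The sketch rests on assumptions not stated in the paper: the depth map is a plain nested list, rectangles are given as (row0, col0, row1, col1) with exclusive end indices, the wall layer sits at a reference depth of 0, and each rectangle receives its own average depth.

```python
def rectangle_mean_depth(depth, rect):
    """Average of the noisy depth values inside one rectangle."""
    r0, c0, r1, c1 = rect
    values = [depth[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)


def two_layer_model(depth, rects, wall_depth=0.0):
    """Flat wall everywhere, with each rectangle set to its mean depth."""
    height, width = len(depth), len(depth[0])
    model = [[wall_depth] * width for _ in range(height)]
    for rect in rects:
        mean = rectangle_mean_depth(depth, rect)
        r0, c0, r1, c1 = rect
        for r in range(r0, r1):
            for c in range(c0, c1):
                model[r][c] = mean
    return model


# Hypothetical noisy depth map with one window rectangle.
noisy_depth = [
    [0.0, 0.125, 0.0, 0.0],
    [0.0, -0.25, -0.5, 0.125],
    [0.0, -0.5, -0.25, 0.0],
]
model = two_layer_model(noisy_depth, rects=[(1, 1, 3, 3)])
print(model)
```

Averaging inside a rectangle suppresses the per-pixel noise of the mesh estimate, while the rectangle itself supplies the sharp window boundary that the depth map alone does not.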