ISPRS Commission III, Vol. 34, Part 3A, "Photogrammetric Computer Vision", Graz, 2002
Once the three masks are determined for each LNF image of a facade,
the weight $w_k$ at pixel $[i, j]$ of LNF image $k$ is computed by:

$$w_k[i, j] = M_I[i, j]\, M_O[i, j]\, M_C[i, j], \quad (5)$$

and the normalized weight by:

$$W_k[i, j] = \frac{w_k[i, j]}{\sum_{k'} w_{k'}[i, j]}. \quad (6)$$
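As a concrete illustration, the per-pixel blend of Eqs. (5)-(6) can be sketched in NumPy; the argument names and the $(K, H, W)$ stack layout are assumptions for this sketch, not the paper's implementation:

```python
import numpy as np

def consensus_blend(lnf_stack, M_I, M_O, M_C, eps=1e-12):
    """Weighted average of K registered LNF images (Eqs. 5-6).

    lnf_stack, M_I, M_O, M_C : arrays of shape (K, H, W).
    Returns the (H, W) CTF image.
    """
    w = M_I * M_O * M_C                 # Eq. 5: combined per-pixel weight w_k
    W = w / (w.sum(axis=0) + eps)       # Eq. 6: normalize over the K images
    return (W * lnf_stack).sum(axis=0)  # weighted average -> CTF image
```

Pixels masked out in every image receive zero total weight; the small `eps` simply avoids division by zero there.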
2.2 Iterative Deblurring
The CTF image thus obtained may look blurred (Figure 1(e)) because
the LNF images may not be perfectly registered due to errors in camera
parameters. (Note that our algorithm does not require precise input
camera parameters; that is, any two versions of LNF images may not
align accurately with each other.) A deblurring process is used that
rewarps the source LNF images to align with the CTF image, similar
to that of (Szeliski, 1996):
$$[u, v, 1]^T \cong P\,[u', v', 1]^T, \quad (7)$$

which warps pixel $[u', v']$ to $[u, v]$ using $P$. Our goal is to find a warp
$P$ that best registers the two images. We use the following constraint
functions in our method:

$$E_{CTF} = \sum_{u,v} [e(u, v)]^2, \quad (8)$$

$$[e(u, v)]^2 = W_k[u', v']\,(Y_{CTF}[u, v] - Y_{LNF}[u', v'])^2. \quad (9)$$
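The weighted error of Eqs. (7)-(9) can be sketched as follows; nearest-neighbour sampling is an assumption made to keep the sketch short (a real implementation would use a smooth interpolant so that derivatives with respect to $P$ exist):

```python
import numpy as np

def warp_point(P, u_prime, v_prime):
    """Eq. 7: map [u', v'] to [u, v] via the 3x3 homography P."""
    x = P @ np.array([u_prime, v_prime, 1.0])
    return x[0] / x[2], x[1] / x[2]

def registration_error(P, Y_ctf, Y_lnf, W):
    """Eqs. 8-9: weighted SSD between the CTF image and one warped LNF image."""
    E = 0.0
    for v_p in range(Y_lnf.shape[0]):
        for u_p in range(Y_lnf.shape[1]):
            u, v = warp_point(P, u_p, v_p)
            ui, vi = int(round(u)), int(round(v))
            if 0 <= ui < Y_ctf.shape[1] and 0 <= vi < Y_ctf.shape[0]:
                r = Y_ctf[vi, ui] - Y_lnf[v_p, u_p]
                E += W[v_p, u_p] * r * r  # Eq. 9, summed into Eq. 8
    return E
```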
Note that the overall weight mask $W_k$ is used, reflecting the degree
of confidence we have for each pixel of $Y_{LNF}$. The Levenberg-Marquardt
algorithm (Press et al., 1992) is employed to solve the
constrained minimization problem. It is an iterative process (starting
from the identity matrix); in each iteration, P is incremented by
$$\Delta p = -(H + \lambda I)^{-1} g, \quad (10)$$

where

$$g = \sum_{u,v} e(u, v)\,[\partial e(u, v)/\partial p], \quad (11)$$

$$H = \sum_{u,v} [\partial e(u, v)/\partial p]\,[\partial e(u, v)/\partial p]^T, \quad (12)$$
in which $p$ is an $8 \times 1$ vector representation of $P$ (note that only 8 parameters
are needed to describe $P$), and $\lambda$ is a parameter reduced to 0 as
the procedure converges. After a new P is calculated, the LNF im-
ages are rewarped and the weighted-average algorithm (Section 2.1)
is rerun using the rewarped LNF images to compute a new CTF
image. Figure 1(f) shows such a CTF image with deblurring.
The deblurring process is also executed in a recursive manner. Recall
that the correlation mask $M_C$ is dependent on an initial CTF image.
After deblurring, the new CTF image is used to compute a more
accurate $M_C$, which then again updates the CTF image and triggers
another round of deblurring. The convergence of the recursion is
ensured by stopping when the difference between two successive CTF
images is sufficiently small.
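The outer recursion can be sketched as the loop below; the three callables are placeholders standing in for the paper's correlation-mask, deblurring, and weighted-average components, and the mean-absolute-difference stopping test is one simple reading of "sufficiently small":

```python
import numpy as np

def refine_ctf(lnf_images, ctf, compute_Mc, deblur, blend,
               tol=1e-3, max_iters=10):
    """Alternate mask update and deblurring until two successive
    CTF images differ by less than `tol` (sketch, Section 2.2)."""
    for _ in range(max_iters):
        Mc = compute_Mc(lnf_images, ctf)      # correlation mask from current CTF
        lnf_images = deblur(lnf_images, ctf)  # rewarp LNF images toward the CTF
        new_ctf = blend(lnf_images, Mc)       # weighted average (Section 2.1)
        if np.mean(np.abs(new_ctf - ctf)) < tol:  # successive CTFs agree
            return new_ctf
        ctf = new_ctf
    return ctf
```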
2.3 Experiments
Experiments were carried out to test the consensus texture generation
algorithm against an image dataset acquired at Technology Square,
an office park of four buildings located on the MIT campus. About
4,000 images were captured using the movable platform (Section 1)
at 81 nodes in this site. At each node, 47 images were acquired with
distinct rotations. LNF images were extracted for each facade.

Figure 2: CTF textured model.
Figure 1(f) shows the CTF result of the iterative weighted-average
algorithm on a facade, for which 28 LNF images were extracted from
the database and used to generate the CTF image. Most occlusions
caused by modeled/unmodeled objects were satisfactorily removed;
the luminance is also reasonably consistent across the entire CTF
image. Figure 2 shows a perspective view of the resulting textured
model of this site.
Our experiments also show that only about a dozen original facade
images, with the quality shown in Figure 1(d), are needed for texture
recovery with a satisfactory result. In addition, the iterative deblurring
is a very stable process; only a couple of iterations are necessary to
reach the image quality shown in Figure 1(f), even with up to 5-pixel
misalignment of LNF images due to input camera pose error. Therefore,
the halting of the iterations can be simplified to a fixed number of
iterations, instead of complex convergence criteria.
3 MICROSTRUCTURE DETECTION
In the area of urban site reconstruction, a large body of research
has focused on methods of establishing geometric models for
large-scale structures, especially buildings, whose structural features
(corners, edges, etc.) typically possess sufficient image cues to sup-
port direct and reliable 3D reconstruction from the images (Firschein
and Strat, 1996; Mayer, 1999). In this paper, we emphasize the
importance of microstructures because they provide rich information
about the buildings and add realism for visualization.
Two pieces of evidence are used for microstructure extraction: the
relative 3D depth of the structures and their 2D appearances. The
relative depth of a surface microstructure is typically very small (see
Section 4). Thus directly extracting these structures from noisy 3D
depth data may be beyond the state of the art of current computer
vision algorithms without a priori knowledge. In this section, we use
a 2D-based strategy to detect the locations of microstructures in the
CTF images generated in Section 2.
The CTF images provide a good texture representation of the facade,
free from effects of occlusions and local illumination variations if
enough views are provided. However, symbolic extraction of win-
dows is still difficult due to the existence of noise. One type of noise
is the global illumination variation on the facade. In Figure 2, for
example, the lower part of the walls is uniformly darker than the
upper part (sometimes even darker than the upper windows). This is
because lower parts of buildings typically receive less sunlight than
upper parts in a densely urbanized area. The pixel-based weighted-
average algorithm is unable to remove global illumination variations,
because the lower part is darker on the majority of LNF images. A