ISPRS Commission III, Vol.34, Part 3A „Photogrammetric Computer Vision‘, Graz, 2002
Figure: Facade depth estimation.
5 DISCUSSION
We described a suite of algorithms for detailed urban environment
extraction, including texture recovery, microstructure detection, and
depth estimation. An iterative, weighted-average algorithm recovers
a consensus texture map that is nearly free of occlusions (modeled
and unmodeled) and local illumination variations. 2D microstructures
are extracted using an algorithm that combines bottom-up and top-down
strategies. Finally, the relative depth of 3D microstructures is
estimated, which allows 2D and 3D information to be combined into a
complete representation of surface microstructures. Combined with
previous techniques, these algorithms are capable of producing
realistic, detailed, texture-mapped 3D models of urban environments
from large sets of real-world images.
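As a rough illustration of the iterative, weighted-average idea, the sketch below repeatedly averages several views of the same facade pixels and down-weights samples that disagree with the current consensus. It is a generic robust-averaging sketch, not the CTF algorithm itself: the function name, the weighting function, and the `scale` parameter are assumptions made for illustration.

```python
def consensus_texture(views, iterations=10, scale=10.0):
    """Robust per-pixel consensus of several views (illustrative only).

    views: list of equally sized lists of pixel intensities, one per view.
    scale: hypothetical residual scale controlling how fast weights decay.
    """
    n_pixels = len(views[0])
    # Start from the plain per-pixel mean of all views.
    consensus = [sum(v[i] for v in views) / len(views) for i in range(n_pixels)]
    for _ in range(iterations):
        updated = []
        for i in range(n_pixels):
            # Samples far from the current consensus receive small weights.
            weights = [1.0 / (1.0 + ((v[i] - consensus[i]) / scale) ** 2)
                       for v in views]
            total = sum(weights)
            updated.append(sum(w * v[i] for w, v in zip(weights, views)) / total)
        consensus = updated
    return consensus


# Three views agree near 100; one view is corrupted (e.g. by an occluder).
views = [[100.0, 100.0], [102.0, 100.0], [98.0, 100.0], [250.0, 100.0]]
texture = consensus_texture(views)
```

In this toy input the corrupted sample is progressively ignored, so the first pixel settles near the value the other three views agree on.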
The proposed algorithms are effective for a generic set of urban
environment extraction and refinement problems in which the wall
surfaces are largely planar and the microstructures are mainly
rectangular. Many buildings in urban environments satisfy these
constraints. Practicality is also one of the design emphases of the
algorithms. For example, significant effort has been invested in
handling inaccuracy and uncertainty in the input data. The texture
deblurring process allows the algorithm to tolerate the camera pose
error that often arises in real applications. The 2D microstructure
module extracts structures of any size within upper and lower bounds
supplied by the user, and needs no interactive parameter adjustment.
There are several directions in which the algorithms can be extended
to solve more general problems. First, the extracted 2D microstruc-
tures can provide partial geometric constraints in Eg(S) for depth
estimation. How to improve the depth estimation by incorporating
the partial constraints is a topic for future study.
Second, the architecture of iterative texture recovery allows more
information to be used for better results. For example, once the
depth of the 3D microstructures is determined, the occlusions that
these structures cause on the facade can be computed for each LNF
image. The texture recovery algorithm can then be rerun with this
additional information, excluding these occlusions from the CTF
computation.
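The rerun described above amounts to dropping pixels that are known to be occluded before averaging. A minimal sketch, assuming per-view boolean occlusion masks are already available (the function name and data layout are hypothetical, and a plain mean stands in for the weighted average):

```python
def masked_average(views, occluded):
    """Average each pixel over the views in which it is not occluded.

    views:    list of equally sized lists of pixel intensities.
    occluded: list of equally sized lists of booleans; True marks a pixel
              hidden by a 3D microstructure in that view (assumed input).
    Returns None for pixels that are occluded in every view.
    """
    result = []
    for i in range(len(views[0])):
        visible = [v[i] for v, m in zip(views, occluded) if not m[i]]
        result.append(sum(visible) / len(visible) if visible else None)
    return result


views = [[10.0, 20.0], [30.0, 40.0]]
occluded = [[False, True], [False, False]]
texture = masked_average(views, occluded)
```

Here the second pixel is taken from the second view only, because the first view is flagged as occluded at that position.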
Third, the ORG algorithm is designed to extract a generic class of
objects. Although a large variety of surface microstructures fit into
this class, the algorithm has two major limitations: the shape of
each microstructure is approximated by a rectangle, and the luminance
of the microstructure must be relatively uniform. For more
specialized problems, dedicated object detection modules should be
applied after ORG/PPF.
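The two limitations can be made concrete with a simple check of whether a candidate region fits the rectangular, uniform-luminance model. This is only an illustrative test, not part of ORG: the function name and the thresholds `min_fill` and `max_std` are hypothetical.

```python
from statistics import pstdev


def fits_rectangular_uniform_model(mask, luminance, min_fill=0.9, max_std=10.0):
    """Check the two model assumptions for one candidate region.

    mask:      2D list of 0/1 flags marking the region's pixels.
    luminance: 2D list of pixel luminances with the same shape as mask.
    min_fill:  hypothetical minimum fraction of the bounding box covered.
    max_std:   hypothetical maximum luminance standard deviation.
    """
    cells = [(r, c) for r, row in enumerate(mask)
             for c, on in enumerate(row) if on]
    if not cells:
        return False
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    fill = len(cells) / box_area  # 1.0 for a perfect axis-aligned rectangle
    spread = pstdev([luminance[r][c] for r, c in cells])
    return fill >= min_fill and spread <= max_std


window = fits_rectangular_uniform_model([[1, 1], [1, 1]],
                                        [[50, 52], [51, 49]])
l_shape = fits_rectangular_uniform_model([[1, 1], [1, 0]],
                                         [[50, 52], [51, 49]])
```

A region that fails either test is the kind of case for which a more specialized detector would be needed.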
Fourth, the global illumination variation problem has not been solved
in the CTF algorithm. For rendering purposes, a better texture
representation may be required. This problem could be addressed using
the periodic pattern of the microstructures as a heuristic. Because
the microstructures share a common shape and period, they should
normally share the same illumination as well. An illumination
adjustment algorithm could be designed on this basis.
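One way such an adjustment could start is sketched below: if the microstructures in a periodic pattern are expected to look alike, the ratio between a reference luminance and each microstructure's mean luminance gives a local gain estimate. This is a speculative sketch of the heuristic, not a method from the paper; the function name and the choice of the median as the reference are assumptions.

```python
from statistics import median


def illumination_gains(mean_luminances):
    """Estimate a multiplicative gain per microstructure.

    mean_luminances: mean luminance measured inside each microstructure of
    one periodic pattern (assumed to be computed beforehand).
    The median is used as the shared reference level (an assumption).
    """
    target = median(mean_luminances)
    # Guard against division by zero for completely dark regions.
    return [target / m if m > 0 else 1.0 for m in mean_luminances]


gains = illumination_gains([50.0, 100.0, 200.0])
```

Multiplying the texture around each microstructure by its gain would bring the pattern to a common brightness; interpolating the gains between microstructures would be a further design choice left open here.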
Figure 7: Microstructure visualization.
ACKNOWLEDGMENTS
This work was funded in part by DARPA DACA76-97-K-0005,
ARPA/ARL DAAL02-91-K-0047, ARPA/ATEC DACA76-92-C-
0041, and ARO/ARL DAAG55-97-1-0026. The authors would like
to thank Eric Amram and Neel Master for their technical support. Xi-
aoguang Wang's contributions to the paper reflect his work at UMass
Amherst and MIT.