Gruen, Armin
Figure 1. A case which cannot be solved automatically.
Left: Aerial image of Avenches, central town, right: 3-D model, derived semi-automatically
In this paper we restrict ourselves to purely image-based approaches. Images contain a wealth of information which
is as yet unmatched by other sensor products. Map scanning, laser scanning, radar and other less common techniques will not
be covered. Furthermore, we focus on building extraction, because it has recently received the most attention.
A number of useful reviews and paper collections on building extraction techniques are already available (IAPRS,
1996; Foerstner, Pluemer, 1997; Gruen et al., 1997; CVIU, 1998; Foerstner, 1999; Wang, Gruen, 2000a). This paper is
complementary, and of course will put some emphasis on our own group’s work.
2 METHODOLOGY IN OBJECT EXTRACTION
Object extraction consists of three steps: detection, reconstruction and attributation. Detection refers to the process of
finding a particular object in as many images as are used for further processing. Detection does not necessarily require
the knowledge of the object outline, but, as a minimum requirement, should be able to produce image windows
containing the outline. Reconstruction generates the 3-D geometric description of the object at the required resolution.
Attributation assigns descriptive elements to the object: in the case of a building, the type of building (apartment house,
school, church, factory, etc.), a clear definition of the parts of the building (chimney, dormer window, balcony, door,
window, etc.), or other required information.
This sequence detection-reconstruction-attributation can itself define a processing strategy. In fact it usually
coincides with the processing sequence and very often represents a path of increasing complexity.
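The three steps can be sketched as a simple processing chain. The following is only an illustrative sketch, not an implementation of any published system: the names `BuildingObject`, `detect`, `reconstruct`, `attribute` and `extract_building` are hypothetical, the detector is a placeholder that copies one given window to every image, and the reconstruction is assumed to be a flat-roof box over a given footprint.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Window = Tuple[int, int, int, int]    # (row_min, col_min, row_max, col_max), pixels
Point3D = Tuple[float, float, float]  # object-space coordinates


@dataclass
class BuildingObject:
    """Hypothetical container that is filled step by step."""
    windows: Dict[str, Window]                               # detection result
    vertices: List[Point3D] = field(default_factory=list)    # reconstruction
    faces: List[List[int]] = field(default_factory=list)     # reconstruction
    attributes: Dict[str, str] = field(default_factory=dict) # attributation


def detect(image_ids, window):
    # Placeholder detector: a real one would locate the object in each image.
    # Only an image window containing the outline is required at this stage.
    return BuildingObject(windows={image_id: window for image_id in image_ids})


def reconstruct(obj, footprint, ground_z, roof_z):
    # Assumed simplest case: extrude the footprint to a flat-roof box.
    n = len(footprint)
    floor = [(x, y, ground_z) for x, y in footprint]
    roof = [(x, y, roof_z) for x, y in footprint]
    obj.vertices = floor + roof
    walls = [[i, (i + 1) % n, (i + 1) % n + n, i + n] for i in range(n)]
    obj.faces = [list(range(n)), list(range(n, 2 * n))] + walls
    return obj


def attribute(obj, building_type, parts=()):
    # Attach descriptive elements such as building type and named parts.
    obj.attributes["type"] = building_type
    if parts:
        obj.attributes["parts"] = ", ".join(parts)
    return obj


def extract_building(image_ids, window, footprint, ground_z, roof_z,
                     building_type, parts=()):
    # The processing sequence follows detection-reconstruction-attributation.
    obj = detect(image_ids, window)
    obj = reconstruct(obj, footprint, ground_z, roof_z)
    return attribute(obj, building_type, parts)
```

Each stage only adds information to the object produced by the previous one, which mirrors the increasing complexity along the sequence.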
At the detection level, cues like color and DSM data have proven to be particularly valuable (Sibiryakov, 1996). They
are used first to separate man-made objects from vegetation and other natural features, and then to distinguish
buildings from other anthropogenic objects, such as roads and bridges. Good success has been reported with isolated
houses. Complex urban structures, as they exist in European old towns, still largely resist this approach.
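As a rough illustration of how color and DSM cues might be combined, consider the sketch below. It is not the method of the cited work. It assumes co-registered rasters given as nested lists, a terrain model `dtm` in addition to the DSM so that height above ground can be formed, a simple normalized green-red difference as a stand-in vegetation cue, and two hypothetical thresholds `min_height` and `max_greenness`.

```python
def detect_building_candidates(red, green, dsm, dtm,
                               min_height=2.5, max_greenness=0.05):
    """Return a per-pixel boolean mask of building candidates.

    A pixel is kept if it is not vegetation-like in color (first
    separation) and stands clearly above the terrain (which removes
    ground-level man-made objects such as roads).
    All thresholds are illustrative assumptions.
    """
    mask = []
    for red_row, green_row, dsm_row, dtm_row in zip(red, green, dsm, dtm):
        mask_row = []
        for r, g, surface, terrain in zip(red_row, green_row, dsm_row, dtm_row):
            total = r + g
            # Normalized green-red difference; 0.0 for black pixels.
            greenness = (g - r) / total if total > 0 else 0.0
            is_vegetation = greenness > max_greenness
            is_elevated = (surface - terrain) >= min_height
            mask_row.append(is_elevated and not is_vegetation)
        mask.append(mask_row)
    return mask
```

Elevated man-made objects such as bridges would pass both tests, so a further step would be needed for them. Densely built areas, where roofs, courtyards and vegetation interleave, are where such per-pixel rules can be expected to fail.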
In reconstruction we encounter a great variety of methods, depending on the type of building, the level of detail required,
the number of images, the kinds of image cues and image primitives used, the external and a priori information utilized, and the level of
automation and operator intervention. Recently there has been a clear trend towards the use of multiple (>2) images, color
cues, early transition to 3-D processing, and the use of geometrical constraints. The complexity of buildings ranges from
cube-shaped flat-roof boxes to very complex structures that even include non-planar geometrical elements. The level of detail
is clearly application-dependent and ranges from the joint representation of full building blocks, through cube-type single-height
approximations of individual buildings, up to very detailed modeling with all ridge, gable and eaves points and
possibly even the inclusion of chimneys, dormer windows and the like. Image cues may involve texture, color, shadows, and
reflection properties. Image primitives include points, double- and triple-legged vertices, linear elements, and homogeneous
regions. External information may come from additional sensors, DTMs, DSMs, scanned maps and GIS-resident data.
A priori information includes prior knowledge about the building, its geometry and functionality (right angles,
parallel/straight lines, etc.), the sensor geometry (camera models), the sun position, and the like. The level of
automation in reconstruction extends from zero (complete operator measurements and structuring) to conceptually full
automation.
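To make the role of a priori geometric knowledge concrete, the right-angle assumption can be expressed as a simple test on a reconstructed footprint. The sketch below is an assumption-laden illustration only: the function names and the tolerance `tolerance_deg` are hypothetical, and the footprint is taken to be a closed 2-D polygon given as a list of corner coordinates.

```python
import math


def corner_angles(footprint):
    """Angle in degrees (0 to 180) between the two edges meeting at each corner."""
    n = len(footprint)
    angles = []
    for i in range(n):
        px, py = footprint[i - 1]          # previous corner (wraps around)
        cx, cy = footprint[i]              # current corner
        nx, ny = footprint[(i + 1) % n]    # next corner
        ax, ay = px - cx, py - cy
        bx, by = nx - cx, ny - cy
        dot = ax * bx + ay * by
        cross = ax * by - ay * bx
        angles.append(math.degrees(math.atan2(abs(cross), dot)))
    return angles


def satisfies_right_angles(footprint, tolerance_deg=5.0):
    # True if every corner deviates from 90 degrees by at most the tolerance.
    return all(abs(angle - 90.0) <= tolerance_deg
               for angle in corner_angles(footprint))
```

Because the angle between adjacent edges is used rather than the interior angle, L-shaped footprints with re-entrant corners also pass the test. In a reconstruction procedure such a check could serve either to reject hypotheses or to trigger an adjustment that enforces the constraint.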
The term “model-based extraction” is used very often. It is not very helpful terminology, since any kind of information
extraction from images requires the use of some sort of model of the feature or object. There are different kinds of
310 International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B5. Amsterdam 2000.