STRUCTURAL 3D-ANALYSIS OF URBAN SCENES FROM AERIAL IMAGES
U. Stilla, K. Jurkiewicz
Forschungsinstitut für Informationsverarbeitung und Mustererkennung (FGAN-FIM)
Eisenstockstr. 12, 76275 Ettlingen, GERMANY
E-mail: usti@gate.fim.fgan.de
ISPRS Commission III, Working Group 3
KEY WORDS:
Recognition, Urban, Aerial Image, Three-dimensional Structure Analysis, Production System, Blackboard System
ABSTRACT:
This paper presents a model-based method for the automatic analysis of structures in aerial images. The model of
the objects to be recognized is described in the form of a production net. The production net represents a hier-
archical organisation of subconcepts and production rules. The production rules are implemented in a blackboard
architecture as knowledge sources. The database of the blackboard system is stored in an associative memory.
The recognition of spatial objects from an image sequence is illustrated by an example of a simple model for the
geometric 3D-reconstruction of a roof. An ISPRS test dataset was used in order to evaluate the efficiency of the
analysis system.
1 INTRODUCTION
The automatic interpretation of aerial images by
knowledge-based systems has been a subject of re-
search for some time [Nagao & Matsuyama, 1980],
[McKeown et al., 1985], [Nicolin & Gabler, 1987].
Research on object recognition has received a particular
impetus from the increased demand for urban scene
descriptions, which can largely be attributed to the
development of geographical information systems. The
availability of additional information from digital maps,
spatial image sequences, and distance data has given
further impetus to research. 3D-reconstruction has
become increasingly important, especially for the
recognition of man-made objects from large-scale images.
In this paper we describe a system for image interpre-
tation and a method for 3D-recognition. The work is
part of a research project for map-aided image analysis
with two- and three-dimensional models [Stilla et al.,
1995b]. The title of this project¹ is: Analysis of aerial
and satellite images for automatic determination of
the ground sealing of urban areas.
In the field of automatic object recognition, knowledge-based
methods are increasingly used for the analysis
and description of aerial imagery. Of particular interest
are structure-oriented hierarchical methods, which build
up structure hierarchies by composing complex structures
from less complex ones. Using this approach,
the analysis proceeds step by step, with constant reference
to the patterns being analyzed, producing interim
results of increasing degrees of abstraction. Hence,
following Marr [Marr, 1980], the process of visual
recognition is interpreted as the active construction of a
symbolic scene description from images.
¹ This project is funded by the Deutsche Forschungsgemeinschaft
(DFG) and is carried out in cooperation with the Institut
für Photogrammetrie und Fernerkundung (IPF), University
of Karlsruhe.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
2 ANALYSIS STRATEGY
The subject of investigation is the recognition of
three-dimensional objects from only two images (Fig. 1).
We presuppose that the projection equations mapping
scene points into the images are known, which is
essential for stereotriangulation. In contrast to other
research approaches, epipolar geometry is not required.
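The role of the known projection equations can be illustrated with a minimal sketch. It assumes that the projection equations have already been inverted, so that each image point yields a viewing ray (origin `c`, direction `d`) in scene coordinates. The function name `triangulate_midpoint` and the midpoint criterion are illustrative assumptions, not the procedure used in this paper. Under these assumptions, the scene point closest to both rays can be computed directly, without any epipolar constraint:

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _point_on_ray(c: Vec3, d: Vec3, t: float) -> Vec3:
    return (c[0] + t * d[0], c[1] + t * d[1], c[2] + t * d[2])


def triangulate_midpoint(c1: Vec3, d1: Vec3, c2: Vec3, d2: Vec3) -> Vec3:
    """Midpoint of the shortest segment between the lines c1 + s*d1 and c2 + t*d2.

    Hypothetical helper: the rays are assumed to come from the known
    projection equations of the two images.
    """
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel; no unique scene point")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = _point_on_ray(c1, d1, s)
    p2 = _point_on_ray(c2, d2, t)
    return ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2, (p1[2] + p2[2]) / 2)


# Two assumed camera positions looking at the same roof corner.
corner = triangulate_midpoint(
    (0.0, 0.0, 100.0), (20.0, 10.0, -95.0),
    (50.0, 0.0, 100.0), (-30.0, 10.0, -95.0),
)
```

With noisy image measurements the two rays generally do not intersect, which is why the sketch returns the midpoint of their closest approach rather than an exact intersection.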
In order to test the efficiency of the approach, we
abstained from including external information such as
height information [Haala & Hahn, 1995] or digital map
data [Quint & Sties, 1995], [Stilla, 1995]. Because no
external information is available for establishing
hypotheses (e.g. number, position, and orientation of
objects), the analysis is carried out bottom-up. Since at
lower abstraction levels it often cannot be decided
whether an object is part of a target object, alternative
interpretations (competing interim results) are pursued
in parallel and independently.
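The idea of pursuing competing interim results in parallel can be sketched as follows. The object types `Line` and `Angle`, the attribute `angle_deg`, and the perpendicularity rule are hypothetical placeholders chosen for illustration; they are not taken from the model described in this paper. The point of the sketch is that a composition step does not consume its parts, so one primitive may take part in several alternative hypotheses:

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Line:
    name: str
    angle_deg: float  # orientation of the primitive (assumed attribute)


@dataclass(frozen=True)
class Angle:
    parts: Tuple[str, str]  # names of the two lines forming the hypothesis


def is_roughly_perpendicular(a: Line, b: Line, tol_deg: float = 10.0) -> bool:
    """Illustrative composition test: orientations differ by about 90 degrees."""
    diff = abs(a.angle_deg - b.angle_deg) % 180.0
    return abs(diff - 90.0) <= tol_deg


def compose_angles(lines: List[Line]) -> List[Angle]:
    """Bottom-up step: every admissible pair yields its own hypothesis.

    Parts are not removed after use, so competing interpretations
    that share a primitive coexist and can be pursued independently.
    """
    return [
        Angle((a.name, b.name))
        for a, b in combinations(lines, 2)
        if is_roughly_perpendicular(a, b)
    ]


lines = [Line("l1", 0.0), Line("l2", 92.0), Line("l3", 85.0), Line("l4", 40.0)]
angles = compose_angles(lines)
```

Here `l1` ends up in two hypotheses, one with `l2` and one with `l3`. Which of them, if any, belongs to a target object can only be decided at a higher abstraction level.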
The analysis is carried out symbolically by regarding
the primitive objects as elements of a set, i.e. prepro-
cessing results involving topological relations between
primitive objects are not used. Neighbouring objects
in the image are not necessarily neighbouring in space.
The actual recognition takes place at the scene level
(in space) instead of the sensor level (in the image).