PARSING SEGMENTED DIGITAL IMAGES
John Stokes, Dr
Department of Photogrammetry
Royal Institute of Technology
S-100 44 Stockholm, Sweden
E-mail: john@fmi.kth.se
ISPRS Commission III
ABSTRACT
Computer interpretation of images requires a decision
on what to look for and a strategy exploiting this
knowledge. A generic model for which objects make up
the image is designed, assuming low altitude aerial
images. The generic model is exploited first to choose a
representation of the image suitable for parsing, and
second to define a parsing procedure. As
representation, a segmentation is chosen consisting of
segment boundaries plus descriptions of segment
interiors. Segment boundaries are only of the kind
recognized by the parsers, in this case straight line and
smooth curve segments represented using strip trees.
Secondly, this segmentation is used as an input to a set
of parsers which use a list of properties of buildings in
order to interpret the input. Both line parsers and region
parsers are used. Each parser is successful for a limited
task. A set of parsers is scanned until the parse is
accepted in a consistency test. This strategy is chosen
as it is considered easier to test if a parse is acceptable
than to design a well performing general parser.
Key words: Image interpretation, image segmentation,
parsing, knowledge based interpretation.
1 INTRODUCTION
The production of large scale maps using aerial
images is one of the more important applications of
photogrammetry. Well established procedures for the
measurement of points in images and the estimation
of their location in object space have existed for a long
time. The use of computers for carrying out the
necessary photogrammetric procedures has also
introduced algorithms such as bundle
adjustment, which have brought the bulk of
photogrammetric know-how to a high degree of
completion. There is, however, a very important
exception: Working with digital images, all
photogrammetric work today rests on a continuous
interaction between the computer and an operator,
who is responsible for everything having to do with
interpretation. Attempts to use computers for
automating digital image interpretation have not yet
yielded any algorithms used in a production
line. (Förstner, 1989) calls the problem "the stepchild
of photogrammetric research", indicating an
extraordinarily low interest in the problem among
photogrammetrists. After the paper by (Fua and Hanson,
1988) and the advent of the decision procedures
designed by (Wallace, 1978) and (Rissanen, 1984), the
problem has, however, been approached by several
researchers, e.g. (McKeown, 1991), (Herman et
al., 1984) and (Förstner, 1988).
The success of an automated procedure for locating
and describing objects in digital images rests on a
rational procedure for the interpretation of image
contents. Several subtasks can be identified which
have to be performed in such a procedure: the image
must be preprocessed in order to obtain an input
suitable for parsing; the object types must be defined
using models in such a way that these models can be
explicitly used for parsing; the parsing should be
carried out qualitatively, without regard to
statistical bias; errors and contradictions must be
handled; etc. Several methods for parsing man-made
objects have been developed, since objects of this kind
have so much internal structure that a generic
model for them can easily be conceived. (Walz, 1972)
devised a parser for line drawings, a method which,
while intuitively appealing, strongly rests on the
correct identification of boundary lines. It is therefore
very sensitive to errors due to missing lines and is
probably not an appropriate method for parsing
segmentations of grey level images. (Dickinson et al,
1990) use the idea of aspects and perform the parse at a
high description level, where errors are comparatively
simple to trace. This method is also sensitive to
missing lines. The region segment parser suggested
below is, however, directly inspired by this approach.
The work presented here comprises a design of a
procedure for automated interpretation of digital
low-altitude aerial images. Section 2 gives an overall
presentation of a procedure for interpretation. Section
3 discusses the problem of obtaining image
representations suitable for parsing. In section 4, the
introduction of image features into 3-D object space using fully
oriented 2-D images is discussed. Section 5 presents
line and region parsers followed by an example in
section 6 and a closing discussion in section 7.
2 OVERALL STRATEGY
When interpreting a grey level image, the
representation of the image usually has no connection
to the expected information contained in the image. A
computer-based interpretation will in these cases start
in quite an arbitrary way. If, instead, an image
representation is based on the relation to object types
expected in the image, interpretation, i.e. parsing the
representation, can be made simpler. The image is
then represented in terms of the geometrical contents,
i.e. primitives, relevant for objects assumed to be
present. A suitable representation of aerial images
used for mapping purposes is a segmentation describing
grey level discontinuities in the form of straight
line and curve segments as well as segment interiors.
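As an illustration only, the representation described above can be sketched as a small data structure holding boundary primitives together with descriptions of segment interiors. The class names, the "line"/"curve" labels and the use of mean grey level as the interior descriptor are assumptions made for this sketch and are not taken from the procedure itself.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (row, column) in image coordinates


@dataclass
class BoundaryPrimitive:
    """One grey level discontinuity, stored as a straight line or a curve."""
    kind: str            # "line" or "curve" (hypothetical labels)
    points: List[Point]  # ordered points along the boundary

    def endpoints(self) -> Tuple[Point, Point]:
        return self.points[0], self.points[-1]


@dataclass
class RegionSegment:
    """Description of a segment interior."""
    label: int
    mean_grey: float  # assumed interior descriptor
    # Indices into Segmentation.boundaries
    boundary_ids: List[int] = field(default_factory=list)


@dataclass
class Segmentation:
    """Image representation: boundary primitives plus segment interiors."""
    boundaries: List[BoundaryPrimitive] = field(default_factory=list)
    regions: List[RegionSegment] = field(default_factory=list)

    def boundaries_of(self, label: int) -> List[BoundaryPrimitive]:
        """Return the boundary primitives enclosing the region with this label."""
        for region in self.regions:
            if region.label == label:
                return [self.boundaries[i] for i in region.boundary_ids]
        raise KeyError(label)


def example_segmentation() -> Segmentation:
    """A single rectangular region (e.g. a roof face) bounded by four lines."""
    corners = [(0.0, 0.0), (0.0, 10.0), (5.0, 10.0), (5.0, 0.0)]
    lines = [
        BoundaryPrimitive("line", [corners[i], corners[(i + 1) % 4]])
        for i in range(4)
    ]
    roof = RegionSegment(label=1, mean_grey=182.0, boundary_ids=[0, 1, 2, 3])
    return Segmentation(boundaries=lines, regions=[roof])
```

Keeping boundaries and interiors in separate lists, linked by indices, lets a line parser work on the boundary primitives alone while a region parser starts from the interiors and looks up their enclosing primitives.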
Having an image representation in terms of a limited
set of primitives, the parsing amounts to a listing of
those properties in the image that have a unique