of Artificial Intelligence 1985) the idea of abstraction is defined in
the context of "search" as "to at first ignore the low-level details
of the problem, concentrating on the essential features, and then
fill in the details later". These examples show that the notion of
abstraction is not generally clear.
In this paper a special notion of abstraction is used. It is defined
in the context of image understanding where symbols are mapped
to portions of images. The description by means of the symbols
has to be structured. Additionally, it has to be simplified: emphasis
has to be placed on important aspects while others are neglected.
Abstraction is therefore defined in this paper as the increase of
the degree of simplification and emphasis.
As has been pointed out in the introduction, this also has some-
thing to do with the parts which constitute the substructure of an
object. Because, as Brachman (1979) states, the notion of a term
has to be defined to enable sound reasoning, the part-of relation
is defined in this paper in terms of semantic networks following
(Niemann et al. 1990) or (Mayer 1994). A concept consists of a
name, extension, and intension. The extension is the set of all
objects which belong to the concept. The intension comprises
all properties and relations an object needs to have to belong to a
concept. Two concepts are linked by the specialization relation
if the extension of one concept is a proper subset of the extension
of the other concept. The specialization relation defines a partial order
among the concepts. More special concepts inherit the intension
of more general ones. The part-of relation means the construc-
tion of a concept from other concepts. Representations, like the
concept, or the specialization and the part-of relation, which are
independent of an application are called epistemological primi-
tives (Brachman 1979). In this paper other relations are used as
well, but note that it is useful to restrict an actual implementation
to the epistemological primitives, i.e. other relations have to be
transformed into these primitives.
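The concept structure just described can be sketched in code. The following Python fragment is an illustration only, not part of the paper: the class name `Concept` and the example concepts are invented. Specialization is modeled as a proper-subset test on extensions, inheritance as the union of intensions, and part-of as a list of constituent concepts.

```python
# Illustrative sketch (hypothetical names): concepts with name, extension,
# intension, and the specialization and part-of epistemological primitives.
class Concept:
    def __init__(self, name, extension, intension, parts=()):
        self.name = name                  # symbol naming the concept
        self.extension = set(extension)   # all objects belonging to the concept
        self.intension = set(intension)   # properties/relations an object needs
        self.parts = list(parts)          # part-of: constituent concepts

    def is_specialization_of(self, other):
        # specialization: extension is a proper subset of the other's extension
        return self.extension < other.extension

    def inherited_intension(self, other):
        # a more special concept inherits the intension of a more general one
        if self.is_specialization_of(other):
            return self.intension | other.intension
        return self.intension


vehicle = Concept("vehicle", {"car1", "car2", "truck1"}, {"movable"})
car = Concept("car", {"car1", "car2"}, {"four wheels"},
              parts=[Concept("wheel", set(), {"round"})])

assert car.is_specialization_of(vehicle)
assert "movable" in car.inherited_intension(vehicle)
```

Restricting the implementation to these two relations keeps reasoning over the network uniform; any further relation would be expressed through additional concepts linked by them.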
Simplification and emphasis are important characteristics of
models (Rapp 1995) which are used to achieve the mapping of
symbols and image data. This means that abstraction is also an
implicit but integral part of models, and models are the critical
basis of image understanding. They can be considered as the
“theory” part of the theoretical framework of Marr (1982) as well
as the conceptual level of the levels of knowledge representation
of Brachman (1979). Explicit models have to be the foundation
for every project in image understanding, because they make it
possible to explain the deficits of an approach. Without an explicit model,
i.e. if a system is only based on heuristics, no sound analysis
of errors is possible and therefore the further development is
hampered. The typical development starts with constructing a
model from experience. The model is implemented and tested,
and the model is improved according to the problems that arise.
This is done iteratively.
2.2 Events in Scale-Space
Images are analog representations, representing non- or subsym-
bolic information by means of a homomorphism: The represented
facts are contained in the representation. Relations between ob-
jects of the real world are transferred without loss of structure into
relations of the representation.
The things which can be seen in an image are dependent on the
scale (physical resolution). In a Landsat-TM image it is impossi-
ble to recognize a single human being on the ground whereas in
an aerial image of scale 1 : 4000 this is easy.
Recently, tools have been created for handling the concept of
scale in a formal manner. The main idea is the creation of a
multi-scale representation by a one-parameter family of derived
signals, where fine-scale information is successively suppressed
(Lindeberg 1994).
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
Data is systematically simplified and finer-
scale details, i.e. high-frequency information, are removed. The
scale parameter t \in \mathbb{R}_+ is intended to describe the current level
of scale.
The representations at coarser scales are given by a convolu-
tion of the given signal with Gaussian kernels of successively
increasing width

    L(x, t) = g(x, t) * f(x),

where g : \mathbb{R} \times \mathbb{R}_+ \setminus \{0\} \to \mathbb{R} is the (one-dimensional) Gaussian
kernel

    g(x, t) = \frac{1}{\sqrt{2 \pi t}} \, e^{-x^2 / (2t)}.

Another way to describe the evolution over scales is by means
of a solution to the (one-dimensional) diffusion equation

    \partial_t L = \tfrac{1}{2} \nabla^2 L = \tfrac{1}{2} L_{xx}.
For the utilization of scale-space in discrete images a discrete
scale-space theory has been developed (Lindeberg 1994).
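The one-parameter family defined above can be approximated numerically. The Python sketch below is illustrative only: it convolves a sampled signal with sampled, renormalized Gaussian kernels of increasing variance t (the function names and example signal are invented; a rigorous discrete theory would use Lindeberg's discrete Gaussian rather than a sampled one).

```python
import numpy as np

def gaussian_kernel(t, radius=None):
    # sampled one-dimensional Gaussian g(x, t) = exp(-x^2/(2t)) / sqrt(2*pi*t)
    if radius is None:
        radius = int(4 * np.sqrt(t)) + 1      # truncate at about 4 std. dev.
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * t)) / np.sqrt(2 * np.pi * t)
    return g / g.sum()                         # renormalize the truncated kernel

def scale_space(f, ts):
    # L(x, t) = (g(., t) * f)(x) for each scale t in ts
    return [np.convolve(f, gaussian_kernel(t), mode="same") for t in ts]

x = np.linspace(0.0, 1.0, 512)
# coarse structure plus a fine-scale oscillation
f = np.sin(2 * np.pi * 3 * x) + 0.3 * np.sin(2 * np.pi * 40 * x)
levels = scale_space(f, ts=[1.0, 16.0, 64.0])
# at coarser scales the fine oscillation is progressively suppressed
```

A sinusoid of angular frequency w is attenuated by roughly exp(-w^2 t / 2), so the fine component all but vanishes at t = 64 while the coarse component survives, which is exactly the systematic suppression of high-frequency information described above.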
One question which arises is whether it would not suffice to carry
out any kind of smoothing operation (e.g. a mean filter). This is
not the case, because one of the most important features of the
smoothing is that in the transformation from the fine to the coarse
scale no artifacts should be introduced, i.e. no new accidental
structure should be created. Only the Gaussian kernel fulfills this criterion.
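This non-creation property can be illustrated numerically in one dimension: counting strict local maxima of a noisy signal at increasing scales should give non-increasing counts. The sketch below is hypothetical (sampled Gaussian kernel, fixed random seed, maxima counted away from the boundary to avoid padding effects) and only approximates the continuous statement.

```python
import numpy as np

def smooth(f, t):
    # convolve with a sampled, renormalized Gaussian of variance t
    r = int(4 * np.sqrt(t)) + 1
    x = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-x**2 / (2 * t))
    g /= g.sum()
    return np.convolve(f, g, mode="same")

def local_maxima(f):
    # number of strict interior local maxima
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))

rng = np.random.default_rng(0)
f = rng.standard_normal(512)
# count maxima on the central part only, away from boundary artifacts
counts = [local_maxima(smooth(f, t)[64:-64]) for t in (1.0, 4.0, 16.0, 64.0)]
# Gaussian smoothing should not create new extrema, so the counts
# are expected to be non-increasing from fine to coarse scale
```

A mean (box) filter offers no such guarantee: it can introduce new accidental extrema, which is precisely the artifact the Gaussian kernel avoids.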
To describe the structure in an image, Lindeberg (1994) has
defined so-called blobs as the (zero order) scale-space features.
Blobs are closely linked to extrema in the image. They are smooth
regions which are brighter or darker than the background and stand
out from the surroundings.
In the process of smoothing the image there are four different
discrete events which can happen to a blob: annihilation, merge,
split, and creation. Whereas annihilation and creation are not too
likely to occur (examples are given in (Lindeberg 1994)), merge
and split of blobs are quite common. But blobs are only one means
to represent the information content of an image. More commonly
used representations are regions and edges (Haralick and Shapiro
1992). To a first approximation, most of the events which can
happen to blobs will happen to regions or their delimiting edges
as well. In Figure 1 a) (see Figure 1 b) for thresholded versions
of the normalized image) the image is gradually smoothed (from
left to right; from top to bottom). The big region (upper left)
is split into two regions (lower left) and these two regions are
then merged again into one simple-shaped region (lower right).
Other situations can be slightly more complicated. Imagine a
“staircase” edge consisting of two edges connected by a small
plateau between them. Edge extraction will result in two edges
which are located close to each other. After smoothing only one
edge will remain. Taking all this into account, the term scale-
space event is used in the remainder of this paper to refer to
events of regions and edges.
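A region-type scale-space event can be reproduced in a one-dimensional sketch: two nearby bumps that form two separate thresholded regions at fine scale merge into a single region at coarse scale, analogous to the merge of the regions described for Figure 1. The signal, threshold, and scales below are invented for illustration.

```python
import numpy as np

def smooth(f, t):
    # convolve with a sampled, renormalized Gaussian of variance t
    r = int(4 * np.sqrt(t)) + 1
    x = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-x**2 / (2 * t))
    g /= g.sum()
    return np.convolve(f, g, mode="same")

def n_regions(f, thresh):
    # number of connected runs of samples above the threshold
    above = f > thresh
    return int(np.sum(above[1:] & ~above[:-1]) + above[0])

x = np.arange(512, dtype=float)
# two nearby bright bumps on a dark background
f = np.exp(-(x - 200)**2 / 50) + np.exp(-(x - 260)**2 / 50)
counts = [n_regions(smooth(f, t), 0.15) for t in (1.0, 400.0)]
# fine scale: two separate regions; coarse scale: one merged region
```

At the fine scale the signal drops to almost zero between the bumps, giving two thresholded regions; at the coarse scale the blurred bumps overlap and their sum stays above the threshold, so the regions merge, i.e. a scale-space event occurs.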
That annihilation is unlikely to occur only holds for ideal im-
ages. Figure 2 (original image) shows a car on the road. The
image is gradually smoothed (from left to right; from top to bot-
tom) until the car cannot be recognized any more. The level of
smoothing where this happens depends on the level of noise in
the image as well as on the closeness to other objects. Linked to
this phenomenon are the inner scale and outer scale (Koenderink
1984). The outer scale is the (minimum) size of a window which
completely contains the object while the inner scale is the scale
at which substructures of an object begin to appear. For instance
the car on the road can only be seen in the images in the upper
half of Figure 2 (assuming good contrast, the inner scale corresponds
to approximately 1 m and the outer scale to 4 m resolution).
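The disappearance of a small object under smoothing can be imitated in one dimension: a hypothetical bright "car", four samples wide, on a dark background loses contrast once the smoothing scale exceeds its extent (all numbers below are invented for illustration).

```python
import numpy as np

def smooth(f, t):
    # convolve with a sampled, renormalized Gaussian of variance t
    r = int(4 * np.sqrt(t)) + 1
    x = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-x**2 / (2 * t))
    g /= g.sum()
    return np.convolve(f, g, mode="same")

scene = np.zeros(512)
scene[250:254] = 1.0                       # small, 4-sample-wide object ("car")
ts = [0.25, 1.0, 4.0, 16.0, 64.0]
peaks = [smooth(scene, t).max() for t in ts]
# the peak contrast of the object decreases monotonically with scale;
# beyond some scale it drops below any detection threshold and the
# object can no longer be recognized
```

The scale at which the peak falls below a given detection threshold plays the role of the annihilation scale; in a real image it additionally depends on noise and on the closeness of other objects, as noted above.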