same object (implying also same date, if the object or its
properties vary in time) are usually needed; map updating and change detection applications require the opposite. Note that it is not always necessary to perform the respective co-registration operations explicitly: to relate two images to each other, for example, they do not have to be geocoded, but the mathematical relations to transform from one to the other must be known.
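As a minimal sketch of this last point, assuming a simple affine relation between two overlapping images is adequate, the transformation can be estimated from a few tie points measured in both images and then used to map any position from one image into the other, without geocoding either image; the tie point coordinates below are invented for illustration.

```python
import numpy as np

def estimate_affine(src, dst):
    """Estimate a 2D affine transform (6 parameters) from tie points
    via linear least squares: dst ~ A @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    # Design matrix [x, y, 1] per tie point, solved separately per output axis
    G = np.hstack([src, np.ones((n, 1))])
    px, *_ = np.linalg.lstsq(G, dst[:, 0], rcond=None)
    py, *_ = np.linalg.lstsq(G, dst[:, 1], rcond=None)
    return np.vstack([px, py])          # 2 x 3 parameter matrix

def transform(params, pts):
    """Map pixel coordinates from image 1 into image 2."""
    pts = np.asarray(pts, dtype=float)
    G = np.hstack([pts, np.ones((len(pts), 1))])
    return G @ params.T

# Hypothetical tie points measured in both images (no geocoding involved)
src = [(10, 12), (200, 15), (190, 220), (15, 210)]
dst = [(35, 40), (228, 38), (225, 248), (42, 242)]
A = estimate_affine(src, dst)
print(transform(A, [(100, 100)]))       # same feature located in image 2
```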
Same abstraction level. If this prerequisite is not fulfilled, a
direct comparison becomes impossible, e.g. when comparing
road centerlines with road information in images.
Clear object definitions. This seems trivial, but in practice it is not, as it depends on definitions made by the data producer or user, and it may vary with application and country. As an example, terrain information on bridges is considered part of the DTM in Germany but not in Switzerland. Another example, from a project at ETH Zurich to update the road network of the 1:25,000 maps, concerns the definition of road centerlines. The centerline is not necessarily the middle strip of a two-direction road; it may or may not include tram lines and dedicated bicycle corridors, and the widening of the road before intersections by additional lanes for turning left and right should generally not be included in the road width, although in some cases it might be needed.
Need of metadata and quality indicators. Information on the data themselves and on how they were generated is clearly needed. Unfortunately, data are often delivered without this information, which is small in size but high in importance. This has to do, among other things, with weaknesses in data storage, management and transfer, and with a lack of interoperability among various systems. Quality indicators are the only way to decide how to combine and weight the different components. This information is often not provided, or only as very global measures: for the DTM of a whole map sheet, for example, a single RMS error is given, and for a classification map an accuracy percentage for all classes, or perhaps for each individual class, over the whole area. For a successful integration, accuracy indicators are needed for each data unit, e.g. for each node of a DTM, or each class object (or even better each pixel) of a classification map. In addition, appropriate theories and tools for the interpretation, evaluation and fusion of multiple partial results are needed.
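As a minimal sketch of why per-unit indicators matter, assume (purely for illustration) that each DTM node carries its own standard deviation; two height sources can then be combined node by node by inverse-variance weighting, whereas a single global RMS value would force one weight for the whole map sheet. All numbers below are invented.

```python
import numpy as np

# Hypothetical heights of the same DTM nodes from two sources,
# each with a per-node standard deviation as quality indicator.
z1 = np.array([412.3, 408.7, 415.1])   # e.g. photogrammetric DTM [m]
s1 = np.array([  0.5,   0.5,   2.0])   # per-node sigma [m]
z2 = np.array([411.8, 409.4, 413.9])   # e.g. laser scanning DTM [m]
s2 = np.array([  0.3,   1.5,   0.4])

# Inverse-variance weighting: the more precise source dominates node by node.
w1, w2 = 1.0 / s1**2, 1.0 / s2**2
z_fused     = (w1 * z1 + w2 * z2) / (w1 + w2)
sigma_fused = np.sqrt(1.0 / (w1 + w2))

print(z_fused, sigma_fused)
```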
Regarding the above prerequisites, some remarks will be made:
• The completeness and accuracy of the data to be combined
will almost always differ. Generally, GIS data are expected to
be more abstract.
• The differences between the data should be minimised right from the beginning. As an example, road intersections are often used as GCPs with airborne and spaceborne imagery. Instead of using vector information about road centerlines to detect them in the images, image chips of such intersections taken from similar imagery could be more easily detected and localised in the images to be processed (see the sketch after this list).
• Deep knowledge is needed about the advantages and disadvantages of the available data, in order to select the appropriate ones for a given application.
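The chip-based alternative mentioned in the second remark could, under simple assumptions (grey-value imagery, approximately known scale and orientation, exhaustive search), look like the following sketch based on normalised cross-correlation; in practice the search would be constrained by approximate orientation data and an image pyramid. The data here are synthetic.

```python
import numpy as np

def ncc(chip, window):
    """Normalised cross-correlation between an image chip and an
    equally sized image window (values in [-1, 1])."""
    c = chip - chip.mean()
    w = window - window.mean()
    denom = np.sqrt((c**2).sum() * (w**2).sum())
    return float((c * w).sum() / denom) if denom > 0 else 0.0

def locate_chip(image, chip):
    """Slide the chip over the image and return the position
    (row, col) of the best correlation and its score."""
    ch, cw = chip.shape
    best, best_pos = -1.0, (0, 0)
    for r in range(image.shape[0] - ch + 1):
        for c in range(image.shape[1] - cw + 1):
            score = ncc(chip, image[r:r+ch, c:c+cw])
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best

# Hypothetical use: 'chip' is an intersection extracted from earlier,
# similar imagery; 'image' is the new scene in which it is searched.
rng = np.random.default_rng(0)
image = rng.random((120, 120))
chip = image[40:61, 70:91].copy()        # simulate a known intersection chip
print(locate_chip(image, chip))          # expected near (40, 70)
```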
5. KNOWLEDGE-BASED IMAGE ANALYSIS
COMPONENTS AND ARCHITECTURE
We assume that in general a 3D description of a scene (site) is
aimed at. The scene consists of objects. Each object has
characteristics, properties, features, attributes (all these four
words are treated here as synonyms). The term structure has
been used to denote combinations of features (used now in the
sense of object components) or of objects, e.g. the combination
of edge segments might lead to the structure "closed contour",
or the combination of buildings to the structure "block". The
attributes of the objects can be very variable: geometric,
spectral, textural, material, physical, chemical, biological,
functional, temporal, etc. To describe the scene, raw (or derived) measurements are used. These measurements have a reference system (pixels, grid cells etc.) and provide information about some limited properties of the scene, either explicitly or implicitly; e.g. a high areal concentration of lights in night-time satellite imagery may be an indication of urban areas.
Furthermore, relations between objects and features exist
(topology, context) which should be modelled and appropriately
exploited. A priori information can exist in the form of rules (from very soft to very strict), models (e.g. roof models), or other knowledge. This a priori information encodes assumptions, constraints etc. and may relate to features, objects, or the
whole scene. Models, and their associated assumptions and
range of validity, are needed in various other aspects, e.g.
sensor models, image and noise models, terrain models,
atmospheric, illumination and reflectance models etc.
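One possible, purely illustrative way to hold such a scene description in software is sketched below; the class names, attribute values and rules are invented and only indicate how objects, attributes, relations and a priori knowledge might be kept together in one structure.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    """One object of the scene description with its attributes."""
    name: str
    attributes: dict = field(default_factory=dict)   # geometric, spectral, material, ...
    relations: list = field(default_factory=list)    # (relation, other object)

@dataclass
class Scene:
    objects: list = field(default_factory=list)
    rules: list = field(default_factory=list)         # a priori knowledge

    def add_relation(self, a, relation, b):
        a.relations.append((relation, b))

# Hypothetical fragment of a site description
roof = SceneObject("roof_1", {"type": "gable", "material": "tile"})
wall = SceneObject("wall_1", {"height_m": 6.2})
scene = Scene(objects=[roof, wall],
              rules=["roof edges meet at the ridge", "walls are vertical"])
scene.add_relation(roof, "adjacent_to", wall)
```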
Finally, important components of such an image analysis system are the knowledge modelling and representation, the system architecture and control (hierarchical, e.g. top-down or bottom-up, or heterarchical, e.g. a blackboard architecture; a sketch follows the questions below), and the strategy to solve a given problem. Critical questions, which should be answered by the above components, are:
• Which data, knowledge and processing units should be
combined, when and how?
• How should the processing flow be organised?
• How are the partial results combined?
• How much human interaction is needed and when?
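To make the control and processing-flow questions concrete, the following is a purely illustrative sketch of heterarchical (blackboard-style) control: processing units ("knowledge sources") fire whenever the shared blackboard contains the data they need, so the flow emerges from the data rather than from a fixed sequence. All knowledge sources and placeholder results are invented.

```python
# Shared blackboard holding partial results of the scene description
blackboard = {"edges": None, "contours": None, "buildings": None}

def edge_extractor(bb):
    if bb["edges"] is None:
        bb["edges"] = "edge segments"            # placeholder result
        return True
    return False

def contour_grouper(bb):
    if bb["edges"] is not None and bb["contours"] is None:
        bb["contours"] = "closed contours"
        return True
    return False

def building_hypothesiser(bb):
    if bb["contours"] is not None and bb["buildings"] is None:
        bb["buildings"] = "building hypotheses"
        return True
    return False

knowledge_sources = [building_hypothesiser, contour_grouper, edge_extractor]

# Control loop: keep activating sources as long as any of them can contribute
progress = True
while progress:
    progress = any(ks(blackboard) for ks in knowledge_sources)

print(blackboard)
```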
5.1. Knowledge Modelling and Representation for Data Fusion
There are various theories and approaches for knowledge modelling and representation in image analysis, and different system architectures. A good overview, although somewhat dated, is
given by Abidi and Gonzalez (1992). Some of the major
approaches include:
• Bayesian approaches (Miltonberger et al., 1988; Quint and
Landes, 1996)
• Mathematical approaches: least squares, Kalman filtering, robust estimation, regularisation
• Dempster-Shafer / belief (evidence) theory (Dempster, 1968;
Shafer, 1976)
• Frames (Hanson and Riseman, 1978)
• Rule-based systems (McKeown et al., 1985; McKeown and
Harvey, 1987)