UTILIZING HIGH-LEVEL KNOWLEDGE
IN MIDDLE-LEVEL IMAGE ANALYSIS
Yaonan Zhang
Faculty of Geodesy, Delft University of Technology
Thijsseweg 11, 2629 JA Delft, The Netherlands
Commission III
ABSTRACT:
This article proposes a unified architecture for image analysis, which enables us to: 1) integrate high-level
knowledge into image segmentation; 2) use structural information for stereo matching in image/object dual
spaces; 3) integrate image segmentation with stereo matching; 4) combine edge-based and region-based
segmentation. To design an integrated image analysis system, we must solve the theoretical problem
of how to combine knowledge from different sources (e.g. image intensity, object shape, structural
information), as well as implementation problems (e.g. how many layers or modules should be used, how
to negotiate objectives with each module, and how to control the system). Some of these questions are
answered in this paper, and proposals are made for solving the remaining problems.
KEYWORDS: Image Analysis, Image Processing, Machine Vision.
1. INTRODUCTION

The tasks of computational vision often rely on solving the following problem: from one or more images
of a scene, derive an accurate geometric description of the scene and quantitatively recover the
properties of the scene that are relevant to the given task. This problem (referred to as middle-level
image analysis) is hard for several reasons [Aloimonos]:

1) During the image formation process the three-dimensional world is mapped into two dimensions, and
one dimension is lost. This creates many problems when we try to solve the inverse (ill-posed) problem
of recovering the world from the image.
2) Even well-posed (or regularized) visual computations are often numerically unstable if noise is
present in both the scene and the image. As a result, many problems which theoretically have unique
solutions become very unstable in the presence of input noise.
3) Visual objects are hard to define. Object modelling techniques have been developed in artificial
intelligence and computer graphics to represent 3-D objects, but it is still very difficult to use
these techniques to describe a variety of natural objects.

Model-based vision allows high-level knowledge of the shape and appearance of specific objects to be
used during the process of visual interpretation. Reliable identification can be made by identifying
consistent partial matches between the models and features extracted from the image, thereby allowing
the system to make inferences about the scene that go beyond what is explicitly available from the
image. By providing this link between perception and high-level knowledge of the components of the
scene, model-based recognition is an essential component of most potential applications of vision.

This paper proposes an integrated architecture for image analysis and addresses the problem of how to
integrate information from different sources to improve the performance of objectives associated with
middle-level analysis. The scheme presented in this paper is the combination and extension of the work
described in the author's other papers [Zhang 91a,91b,92a,92b], which mainly focus on image
segmentation and stereo matching. The paradigm in this paper would allow us to integrate a variety of
knowledge sources and different kinds of techniques. Under such a scheme, we want to carry out the
following integrations:

1) integrate high-level knowledge into image segmentation.
2) use structural information for stereo matching.
3) integrate image segmentation and stereo matching.
4) combine edge-based and region-based segmentation.

In section 2 we first present an architecture for integrated image analysis. In section 3 we examine
several principles or criteria from probability and information theory as possible unified measures
for combining the information from different sources. Finally, in section 4 we point out