FEATURE EXTRACTION:
A NEURAL NETWORK ORIENTED APPROACH
Yong-Jian Zheng
Institute for Photogrammetry and Remote Sensing, University Karlsruhe
Englerstrasse 7, D-7500 Karlsruhe 1, Germany
Email: zheng@ipf.bau-verm.uni-karlsruhe.de
Abstract
Extracting features from digital images is the first
goal of almost all image understanding systems. It is
also a difficult problem because of the presence of
noise and various photometric anomalies. A further
difficulty lies in the fact that features and objects
are recognized by using not only the information
contained in the image data but also our a priori
knowledge about the semantics of the world. Thus, a
feature extraction system should be robust, to reduce
the influence of noise, and flexible, to integrate
different levels of knowledge for a wide range of
data. In this paper, a two-stage paradigm for feature
extraction is proposed, based on our conjectures about
the human vision ability. It consists of local feature
grouping and new feature description. Based on the
laws of perceptual grouping and on neural network
modeling, we develop a novel approach to feature
grouping which finds the partition of an image into
so-called feature-support regions. In order to give
abstract descriptions to these regions, one needs
a priori knowledge about their semantics to construct
models. We therefore also discuss model-driven methods
for feature description. To demonstrate our approach,
we present its application in the limited domain of
finding and describing straight lines in a digital
image. The approach can be extended to extract other,
more complex symbolic image events such as arcs,
polylines, and polygons.
1 Introduction
Human vision is the ability to extract a variety of
features and cues from images and to draw inferences
from them. Realizing this ability is technically
difficult even if we focus our attention only on the
problem of feature extraction from digital images. One
difficulty is the fact that the physical transformations
from objects to images are degenerate owing to a variety
of confounding factors, including complex uncontrolled
lighting, highlights and shadows, texture, occlusion,
complex 3D shapes, and digitization effects. All of
these cause feature extraction to behave in quite
unreliable and unpredictable ways.
Moreover, many features are perceived only through the
combination of weak evidence from several other
features. The evidence may be so weak that each
feature, viewed in isolation, would be uninterpretable.
The difficulty here is how to discover and group those
features at a lower level which may support the
extraction of a new feature at a higher level of
abstraction.
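As a concrete illustration of such evidence combination,
the following sketch (a hypothetical Python fragment, not
the grouping network developed in this paper) greedily
collects local edge elements, here assumed to be given as
(x, y, orientation) triples, into line hypotheses. No
single edgel is convincing on its own, but a set of
mutually consistent edgels supports a straight-line
feature.

    import numpy as np

    def group_collinear_edgels(edgels, dist_tol=1.5, angle_tol=np.radians(10)):
        # Greedy grouping of weak local edge elements ("edgels") into line
        # hypotheses.  Each edgel is an (x, y, theta) triple; on its own it
        # is weak evidence, but mutually consistent edgels support a line.
        groups = []
        for x, y, theta in edgels:
            placed = False
            for g in groups:
                gx, gy, gtheta = np.mean(g, axis=0)        # crude group centre/orientation
                normal = np.array([-np.sin(gtheta), np.cos(gtheta)])
                dist = abs(normal @ (np.array([x, y]) - np.array([gx, gy])))
                dtheta = abs((theta - gtheta + np.pi / 2) % np.pi - np.pi / 2)
                if dist < dist_tol and dtheta < angle_tol:  # consistent with the group line
                    g.append((x, y, theta))
                    placed = True
                    break
            if not placed:
                groups.append([(x, y, theta)])              # start a new hypothesis
        # Only groups that accumulate enough evidence are kept as line features.
        return [g for g in groups if len(g) >= 5]

The orientation averaging here is deliberately naive; the
fragment merely illustrates how weak local evidence can
accumulate into an interpretable higher-level feature.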
To undo some of the degeneracies in images and to make
feature extraction robust, we need knowledge about
image events, about objects in the world, and about the
imaging process. We need context information and a
priori models to guide feature search and description.
The difficulty, then, is when and how to integrate the
relevant knowledge and the context information during
feature extraction.
Owing to these difficulties, many feature extraction
methods fail to find the most relevant features. They
may find either too many or too few image events. They
may provide too little information about the extracted
features, namely the feature properties that are
required to support the drawing of inferences. And they
may tell us too little about the quality of the
extracted features, namely their accuracy and
reliability, which is very important information for
top-down control over the knowledge-based
interpretation process.
Feature extraction is a multi-level process of
abstraction and representation. At the lowest level of
abstraction, numerical arrays of direct sensory data
are given, including digital images and the results of
other processes which produce point/pixel data in
register with the sensory data. The goal is then to
discover image events which may carry symbolic-semantic
information and to describe them in a relevant way.
This is the first level of abstraction. At the second
level of abstraction, the image events extracted
earlier are used as building elements to form new image
events and structures of greater abstraction. This
discovery-description process can be repeated at higher
levels to hypothesize scene and object parts. There is
no doubt that two functions are required for building