XVIIIth Congress

Kraus, Karl; Waldhäusl, Peter

network types is not defined at one fixed level of the
segmentation hierarchy, rather it is determined as
appropriate for the given task, knowledge base, or
the current state of the analysis process. Given such
a hybrid knowledge base, different options are
available to recognize a modeled object in a model-driven
strategy. If a concept node with an attached ANN is to
be instantiated, the ANN can be activated and the object
is recognized in a fast and robust way without the need
to detect the parts of the object as modeled by the
semantic network. If no ANN has been attached to a
concept node, the analysis works in the usual manner,
pursuing the decomposition hierarchy.
In this mode of operation the semantic network is
mainly used to control the analysis process and to
focus the various ANNs attached to the semantic
network on different image regions. If, in a later phase
of the analysis process, information is required about
the parts and attributes of an object that was
holistically instantiated by an ANN, the knowledge
about the structure of objects modeled in the semantic
network can still be exploited. An example
of such a situation is the detection of gripping
positions to guide a robot hand after the object has
been detected holistically by a neural network. In a
data-driven analysis strategy the interaction works
in a similar way. After an object has been recognized
by an ANN, the corresponding concept can be
instantiated even if its parts are not (yet) detected.
In a mixed strategy the instantiated objects recognized
by ANNs can be used to select appropriate
goal concepts from more abstract levels of the
semantic network. In this way the number of competing
interpretations is drastically reduced, and the
analysis process can be restricted by propagating the
constraints from the estimated goal concepts and
the instantiated objects.
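
The model-driven case described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names Concept, Instance and instantiate, the confidence threshold, and the rule that a structural instance takes the score of its weakest part are all assumptions made for illustration. An attached ANN is represented by any callable that maps an image region to a confidence value.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Concept:
    """Concept node of a semantic network (hypothetical, simplified)."""
    name: str
    parts: List["Concept"] = field(default_factory=list)
    # Optional holistic recognizer standing in for an attached ANN.
    # It maps an image region to a confidence in [0, 1].
    ann: Optional[Callable[[object], float]] = None


@dataclass
class Instance:
    """Result of instantiating a concept on an image region."""
    concept: str
    score: float
    holistic: bool
    parts: List["Instance"] = field(default_factory=list)


def instantiate(concept: Concept, region: object,
                threshold: float = 0.5) -> Optional[Instance]:
    """Use the attached ANN if there is one, else follow the part hierarchy."""
    if concept.ann is not None:
        # Holistic path: the parts are not inspected at all.
        score = concept.ann(region)
        if score >= threshold:
            return Instance(concept.name, score, holistic=True)
        return None
    if not concept.parts:
        return None  # neither ANN nor parts: nothing to verify in this sketch
    part_instances = []
    for part in concept.parts:
        inst = instantiate(part, region, threshold)
        if inst is None:
            return None  # assumption: every modeled part is obligatory
        part_instances.append(inst)
    # Assumption: a structural instance is as good as its weakest part.
    score = min(p.score for p in part_instances)
    return Instance(concept.name, score, holistic=False, parts=part_instances)


# Hypothetical concepts; the lambdas stand in for trained networks.
head = Concept("bolt_head", ann=lambda region: 0.9)
thread = Concept("bolt_thread", ann=lambda region: 0.7)
structural = instantiate(Concept("bolt", parts=[head, thread]), region=None)
holistic = instantiate(
    Concept("bolt", parts=[head, thread], ann=lambda region: 0.8), region=None)
```

In the holistic case the part list of the instance stays empty, so the structural knowledge is still available if parts have to be located later, for example to find gripping positions.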
As indicated above, it is not necessary to attach
an ANN to each concept of the semantic network.
Rather, one might choose to first train and associate
ANNs for objects that occur frequently or that are
difficult to recognize by a semantic network. In cases
where sufficient training data are not available for a
successful training of an ANN, no ANN is bound
to the corresponding concept. On the other hand,
the hybrid approach gives the option not to fully
decompose some of the objects, which reduces the
effort to acquire and adapt the knowledge base of the
semantic network.
Further extensions of the hybrid approach include
the utilization of neural networks to compute
attributes and judgments during analysis as well as
to learn control information to guide the analysis.
This gives more possibilities to exploit the learning
capabilities and robustness of neural networks for
semantic nets. Another option is to explore
additional ways to adapt ANNs: As indicated above, it
is usually not feasible to train an ANN for each
object to be expected in a complex scene. However, the
results of the analysis of an image sequence can be
used to adapt ANNs to objects occurring frequently
in the sequence.

[...] sake of simplicity, however, we only refer to objects in the following.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
4 Semantic Models for Object Recognition
The work described in the following is embedded in a
special research project studying advanced human-
machine communication. The machine should be
able to process acoustic and visual input and react
meaningfully by producing speech output or by ma-
nipulating objects in the environment of the com-
municating partners. The domain was chosen to be
the cooperative construction of a toy-airplane with
parts from a wooden construction-kit for children.
Object recognition and 3D-scene reconstruction are
necessary prerequisites for a robot to grasp parts in
a scene. Fig. 7 shows the main part of the hybrid
knowledge base solving these tasks.
Currently, the network consists of three levels of
abstraction, namely the image level (indicated by the
prefix I_), the level of perception (indicated by the
prefix PE_), and the level of 3D-reconstruction
(indicated by the prefix RC_). The concept I_FOCUS
mainly allows the analysis to focus on certain areas in
the image to restrict the object recognition task. This
focus can be established by an utterance or a
gesture during the construction dialogue (not yet
considered at the moment) or by the objects detected so
far. This concept has two context-dependent parts,
namely I_REGION, representing a color-segmented
region, and I_OBJECT, representing an object
hypothesis. According to our hybrid approach, both
concepts are associated with a numerical classifier
performing a holistic instantiation of a colored region
or of an object, respectively. Region segmentation
is done by a polynomial classifier, whereas object
detection is done by a special form of neural network
called Local-Linear-Map (LLM) [23]. From a color
segmentation algorithm realized on a special
hardware platform, the neural network gets blob centers
as ‘focus points’. At each focus point, a feature vector
is extracted by 16 Gabor filter kernels from an
edge-enhanced intensity image. This is the
input for the LLM-network, which calculates up to three
competing object hypotheses [8]. For each competing
LLM-hypothesis an instance I_OBJECT(i) is created;
these instances are stored in competing search tree nodes.
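
The processing chain just described (Gabor features at a focus point, LLM response, at most three hypotheses) can be sketched as follows. This is an illustration under stated assumptions, not the system's actual code: the split of the 16 kernels into 4 wavelengths and 4 orientations, the kernel size, the class labels other than ‘bolt’, the score threshold, and the random weights are all hypothetical. The forward pass uses the standard Local Linear Map form, in which the node whose input weight vector is closest to the input answers with its output vector plus a local linear correction.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Real (cosine) part of a Gabor kernel on a size x size grid (size odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * x_rot / wavelength)


def gabor_features(image, point, kernels):
    """One filter response per kernel at a focus point (row, col).

    The point must lie at least half a kernel away from the image border.
    """
    half = kernels[0].shape[0] // 2
    r, c = point
    patch = image[r - half:r + half + 1, c - half:c + half + 1]
    return np.array([np.sum(patch * k) for k in kernels])


class LocalLinearMap:
    """Forward pass of a Local Linear Map; training is not shown."""

    def __init__(self, w_in, w_out, A):
        self.w_in = w_in    # (n_nodes, d_in)  input weight vectors
        self.w_out = w_out  # (n_nodes, d_out) output weight vectors
        self.A = A          # (n_nodes, d_out, d_in) local linear maps

    def forward(self, x):
        # Best-match node: closest input weight vector.
        k = np.argmin(np.linalg.norm(self.w_in - x, axis=1))
        return self.w_out[k] + self.A[k] @ (x - self.w_in[k])


def top_hypotheses(scores, labels, max_hyp=3, min_score=0.0):
    """Keep at most max_hyp labels, best first (assumed selection rule)."""
    order = np.argsort(scores)[::-1][:max_hyp]
    return [(labels[i], float(scores[i])) for i in order
            if scores[i] > min_score]


rng = np.random.default_rng(0)
image = rng.random((32, 32))  # stands in for the edge-enhanced image
kernels = [gabor_kernel(9, wl, th, sigma=3.0)
           for wl in (3.0, 4.0, 6.0, 8.0)        # assumed wavelengths
           for th in np.arange(4) * np.pi / 4]   # assumed orientations
x = gabor_features(image, (16, 16), kernels)

labels = ["bolt", "cube", "bar", "ring"]          # mostly hypothetical
llm = LocalLinearMap(w_in=rng.normal(size=(5, 16)),
                     w_out=rng.random((5, 4)),
                     A=np.zeros((5, 4, 16)))      # untrained placeholder
hypotheses = top_hypotheses(llm.forward(x), labels)
```

Each entry of hypotheses would correspond to one competing I_OBJECT instance in its own search tree node.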
Dependent on the object type detected by the
LLM-network, the corresponding concept in the
perceptual level is selected to verify the object
hypothesis according to the structural knowledge stored in
the semantic network. That means if an instance
I_OBJECT(i) with type ‘bolt’ exists, then a modified
concept PE_BOLT(M) is created with the concrete
I_OBJECT(i). This link is inherited from the concept
PE_OBJECT. In the next step the control algorithm
