Full text: XVIIIth Congress (Part B3)

hin each of 
. are social 
are physi- 
more than 
., in image 
objects, or 
belonging 
:stablished 
t and spe- 
t they are 
ystem. For 
; represent 
ecause bar 
le belongs 
at circle is 
e following 
scribed by 
nerical fea- 
cording to 
ns defining 
scified and 
lances con- 
2 semantic 
his process 
. of a com- 
parts as a 
terms only 
stantiation 
rection. In 
in instance 
created. In 
pressed by 
Context- 
tes are the 
and modi- 
1e opposite 
inferences. 
ion are not 
npleted by 
oe of corre- 
defined by 
e estimates 
| algorithm 
he network 
17, 9]. 
s of knowl- 
)pose a hy- 
1antic net- 
or attach 
ie semantic 
g the same 
he different 
| the semantic 
ike of simplic- 
network types is not defined at one fixed level of the
segmentation hierarchy; rather, it is determined as
appropriate for the given task, knowledge base, or
the current state of the analysis process. Given such
a hybrid knowledge base, different options are available
to recognize a modeled object in a model-driven
strategy. If a concept node is to be instantiated, the
associated ANN can be activated and the object is
recognized in a fast and robust way without the need
to detect the parts of the object as modeled
by the semantic network. If no ANN has been attached
to a concept node, the analysis works in the
usual manner, pursuing the decomposition hierarchy.
In this mode of operation the semantic network is
mainly used to control the analysis process and to focus
the various ANNs attached to the semantic network
on different image regions. If, in a later phase
of the analysis process, information is required about
the parts and attributes of an object that was holistically
instantiated by an ANN, the knowledge
about the structure of objects modeled in the semantic
network can still be exploited. An example
of such a situation is the detection of gripping positions
to guide a robot hand after the object has
been detected holistically by a neural network. In a
data-driven analysis strategy the interaction works
in a similar way. After an object has been recognized
by an ANN, the corresponding concept can be
instantiated even if its parts are not (yet) detected.
In a mixed strategy the instantiated objects recognized
by ANNs can be used to select appropriate
goal concepts from more abstract levels of the semantic
network. In this way the number of competing
interpretations is drastically reduced, and the
analysis process can be restricted by propagating the
constraints from the estimated goal concepts and
the instantiated objects.
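The model-driven interaction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Concept`, `Instance`, `instantiate`, and `expand_parts` are hypothetical, an ANN is stood in for by any callable that returns a score or `None`, and taking the minimum part score as the judgment of the whole is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration; none of these names come from the paper.
Region = Dict[str, float]
# Stand-in for an ANN: returns a score, or None if the region is rejected.
Classifier = Callable[[Region], Optional[float]]


@dataclass
class Concept:
    name: str
    parts: List["Concept"] = field(default_factory=list)
    ann: Optional[Classifier] = None  # holistic recognizer, if one was attached


@dataclass
class Instance:
    concept: str
    score: float
    holistic: bool
    parts: List["Instance"] = field(default_factory=list)


def instantiate(concept: Concept, region: Region) -> Optional[Instance]:
    """Model-driven instantiation of `concept` on `region`."""
    if concept.ann is not None:
        # Holistic path: the attached recognizer decides without detecting parts.
        score = concept.ann(region)
        return None if score is None else Instance(concept.name, score, True)
    if not concept.parts:
        # A primitive concept without a recognizer cannot be verified here.
        return None
    # Structural path: pursue the decomposition hierarchy.
    found = []
    for part in concept.parts:
        inst = instantiate(part, region)
        if inst is None:
            return None
        found.append(inst)
    # Assumed judgment: the weakest part bounds the score of the whole.
    return Instance(concept.name, min(p.score for p in found), False, found)


def expand_parts(concept: Concept, instance: Instance, region: Region) -> List[Instance]:
    """Recover parts later for an object that was instantiated holistically."""
    if instance.holistic and not instance.parts:
        candidates = (instantiate(c, region) for c in concept.parts)
        instance.parts = [p for p in candidates if p is not None]
    return instance.parts
```

In this sketch a concept with an attached recognizer is instantiated in one step, while `expand_parts` corresponds to the later phase in which structural knowledge is still needed, for example to find gripping positions on an object that was first detected holistically.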
As indicated above, it is not necessary to attach
an ANN to each concept of the semantic network.
Rather, one might choose to first train and associate
ANNs for objects that occur frequently or that are
difficult to recognize by a semantic network. If
sufficient training data are not available for the
successful training of an ANN, no ANN is bound
to the corresponding concept. On the other hand,
the hybrid approach gives the option not to fully
decompose some of the objects, which reduces the effort
needed to acquire and adapt the knowledge base of the
semantic network.
Further extensions of the hybrid approach include
the utilization of neural networks to compute attributes
and judgments during analysis as well as
to learn control information to guide the analysis.
This gives more possibilities to exploit the learning
capabilities and robustness of neural networks for
semantic nets. Another option is to explore additional
ways to adapt ANNs. As indicated above, it
is usually not feasible to train an ANN for each object
to be expected in a complex scene. However, the
results of the analysis of an image sequence can be
used to adapt ANNs to objects occurring frequently
in the sequence.
ity, however, we only refer to objects in the following.
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996
4 Semantic Models for Object Recognition 
The work described in the following is embedded in a
special research project studying advanced human-machine
communication. The machine should be
able to process acoustic and visual input and react
meaningfully by producing speech output or by manipulating
objects in the environment of the communicating
partners. The domain was chosen to be
the cooperative construction of a toy airplane with
parts from a wooden construction kit for children.
Object recognition and 3D-scene reconstruction are
necessary prerequisites for a robot to grasp parts in
a scene. Fig. 7 shows the main part of the hybrid
knowledge base solving these tasks.
Currently, the network consists of three levels of abstraction,
namely the image level (indicated by the
prefix I_), the level of perception (indicated by the
prefix PE_), and the level of 3D reconstruction (indicated
by the prefix RC_). The concept I_FOCUS
mainly serves to focus on certain areas in the image
in order to restrict the object recognition task. This
focus can be established by an utterance or a gesture
during the construction dialogue (not yet considered
at the moment) or by the objects detected so
far. This concept has two context-dependent parts,
namely I_REGION, representing a color-segmented region,
and I_OBJECT, representing an object hypothesis.
According to our hybrid approach, both concepts
are associated with a numerical classifier performing
a holistic instantiation of a colored region
or of an object, respectively. Region segmentation
is done by a polynomial classifier, whereas object detection
is done by a special form of neural network
called Local-Linear-Map (LLM) [23]. From a color
segmentation algorithm realized on a special hardware
platform, the neural network gets blob centers
as ‘focus points’. At each focus point, a feature vector
is extracted from an edge-enhanced intensity image
by 16 Gabor filter kernels. This is the
input for the LLM network, which calculates up to three
competing object hypotheses [8]. For each competing
LLM hypothesis an instance I_OBJECT(i) is created,
and these instances are stored in competing search tree nodes.
Depending on the object type detected by the LLM
network, the corresponding concept in the perceptual
level is selected to verify the object hypothesis
according to the structural knowledge stored in
the semantic network. That means, if an instance
I_OBJECT(i) with type ‘bolt’ exists, then a modified
concept PE_BOLT(M) is created with the concrete
I_OBJECT(i). This link is inherited from the concept
PE_OBJECT. In the next step the control algorithm
	        