XXII ISPRS Congress 2012: Technical Commission III

boundaries searching for structures that satisfy regularity and 
symmetry rules. In addition to that, they extract three- 
dimensional models of windows by searching for image 
features. Teboul et al. (2010) used shape grammars towards 
fixed tree representations which are able to capture a wide 
variety of building topologies for detailed facade segmentation. 
They obtained very high performance even for buildings which 
are partially occluded or which appear under different 
illumination conditions. Ripperda (2008) uses reversible jump 
Markov chain Monte Carlo (rjMCMC) for the estimation of 
optimal parameters for the windows and a formal grammar 
to describe their behaviour. Mayer & Reznik (2006) propose a 
combination of Markov Chain Monte Carlo with information 
from Implicit Shape Models and with Plane Sweeping. 
Thanks to this they achieve a 3D interpretation of building 
facades, determining windows and their 3D extent. In Mayer & 
Reznik (2008) the method is extended with self-diagnosis and 
model selection to choose the most appropriate model for the 
configuration of windows in terms of rows or columns. 
Most of these algorithms are computationally expensive and not 
suitable for real-time applications. Sirmacek (2011a) proposed a 
segmentation and graph theory based facade classification 
method with emphasis on real-time requirements. However, this 
method requires very uniform and also correctly ortho-rectified 
color images as input. 
In Europe there has been a joint research effort on facade 
classification called eTRIMS (Foerstner et al., 2009). The 
syntactic formulation of Gestalt laws plays a special role. 
Within eTRIMS such an approach has been formulated by Tylecek & 
Sara (2011), using stochastic grammars for the description and 
random sampling as the search method. 
A similar formulation uses production systems as declarative 
knowledge representation and special interpreters for search. 
This has the advantage of clear modularity and explicit 
declarative inclusion or exclusion of particular constraints or 
recursive principles. This makes it easier to compare their 
benefits and drawbacks. For example, Matsuyama & Hwang (1990) have proposed 
the SIGMA system for automatic understanding of aerial images 
of man-made objects. This system featured declarative 
knowledge coding using production rules and a special 
interpretation scheme quite similar to the one used here. 
Unfortunately this work has not been continued. Another such 
approach was called BPI (Stilla & Michaelsen, 1997) and this is 
being continued as the GESTALT system (Michaelsen et al., 
2010). 
In Sirmacek (2011b) the usage of L-shaped feature primitives is 
proposed for window and door detection from thermal facade 
images. Iwaszczuk et al. (2011) suggest using a local dynamic 
threshold and masked correlation for corner detection and 
ordering the detected window candidates into rows and columns. In 
this paper we merge the idea of detecting 
primitives (corners and L-shapes) with gestalt rules to find 
windows in thermal images robustly. 
This contribution is organized as follows: Section 2 presents 
production systems in general and two special systems that 
code the likely organization of windows on facades. 
Section 3 comparatively studies the behaviour of these systems 
on example data obtained in the city of Munich. The work 
closes with a discussion on the results and an outlook for future 
work in Section 4. 
2. PRODUCTION SYSTEMS 
Structural knowledge, e.g. about the part-of hierarchies of 
man-made objects, about geometric properties of their mutual 
arrangements, and about their appearance, can be coded in a 
declarative way using systems of production rules. 
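
To make the idea of declarative production rules more concrete, the following is a minimal sketch only. It is not the GESTALT, BPI, or SIGMA system, nor one of the production systems of this paper. The names `Instance`, `Production`, `apply_productions`, the object kind "top edge", and the 2-pixel row tolerance are all hypothetical and chosen purely for illustration. Each production states which kinds of parts it combines, under which geometric condition, and which kind of object results; a naive interpreter applies all productions until no new instances appear.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Instance:
    """An object instance with a kind and an image position (hypothetical layout)."""
    kind: str
    x: float
    y: float


@dataclass(frozen=True)
class Production:
    """Declarative rule: left + right --(condition)--> result."""
    name: str
    left: str
    right: str
    result: str
    condition: Callable[[Instance, Instance], bool]


def apply_productions(instances, productions):
    """Naive interpreter: test every ordered pair until nothing new is produced."""
    known = set(instances)
    changed = True
    while changed:
        changed = False
        for p in productions:
            for a in list(known):
                for b in list(known):
                    if a.kind == p.left and b.kind == p.right and p.condition(a, b):
                        # The new instance is placed at the midpoint of its parts.
                        new = Instance(p.result, (a.x + b.x) / 2, (a.y + b.y) / 2)
                        if new not in known:
                            known.add(new)
                            changed = True
    return known


# Illustrative rule (an assumption, not taken from the paper): an "upper left"
# corner and an "upper right" corner to its right, on nearly the same image
# row, form a "top edge".
top_edge = Production(
    name="top edge",
    left="upper left",
    right="upper right",
    result="top edge",
    condition=lambda a, b: b.x > a.x and abs(a.y - b.y) <= 2,
)

corners = [
    Instance("upper left", 10, 5),
    Instance("upper right", 30, 6),
    Instance("upper right", 30, 40),
]
result = apply_productions(corners, [top_edge])
```

Because the constraint lives in the `condition` of a single rule, it can be switched on or off without touching the interpreter, which is the modularity argument made for production systems above.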
2.1 Extraction of Primitives 
Prerequisite to all syntactic work on images is segmentation for 
primitive objects. Here a corner detector based on a masked 
correlation which consists of “on” and “off” fields and of “don’t 
care” areas is applied. Masked correlation was originally 
applied by Stilla (1993) to recognise stamped characters. The 
advantage of this method is that it can cope with blurred edges. 
We adapt the idea of masked correlation to search for the 
changes in the intensity between facade and window. We place 
a “don’t care” area between “on” and “off” fields, which helps 
to avoid blurring on the edges. The correlation coefficient c is 
calculated using 
$$c = \frac{\operatorname{sgn}(p_1 - p_0)\,\operatorname{sgn}(\bar{g}_1 - \bar{g}_0)}{\sqrt{1 + \dfrac{\frac{m}{m_0}\,\sigma_1^2 + \frac{m}{m_1}\,\sigma_0^2}{(\bar{g}_1 - \bar{g}_0)^2}}}$$

where $p_1$ is the value of the “on” mask, $p_0$ the value of the “off” mask, 
$\bar{g}_1$ the mean of the intensity values in the image covered by the “on” 
mask, $\bar{g}_0$ the mean of the intensity values in the image covered 
by the “off” mask, $m_1$ the number of “on” pixels in the mask, $m_0$ the 
number of “off” pixels in the mask, $m$ the number of “on” and 
“off” pixels in the mask, $\sigma_1$ the standard deviation of the intensity 
values covered by the “on” mask, and $\sigma_0$ the standard deviation of the 
intensity values covered by the “off” mask. 
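
As a hedged illustration of this computation, the sketch below evaluates the correlation coefficient of a two-valued mask with an image patch from exactly the statistics listed above (means, standard deviations, and pixel counts of the “on” and “off” fields), ignoring “don’t care” pixels. The function name `masked_correlation`, the encoding of the mask as +1 / -1 / 0, and the choice p_1 = 1, p_0 = -1 are assumptions made for this example. Written this way, the value equals the ordinary Pearson correlation between mask and image over the “on” and “off” pixels.

```python
import numpy as np


def masked_correlation(patch, mask):
    """Correlation of an image patch with a three-valued mask.

    mask: +1 marks "on" pixels, -1 marks "off" pixels, 0 marks "don't care"
    (this encoding is an assumption of the sketch).
    """
    on = patch[mask == 1].astype(float)
    off = patch[mask == -1].astype(float)
    m1, m0 = on.size, off.size          # numbers of "on" and "off" pixels
    m = m1 + m0
    g1, g0 = on.mean(), off.mean()      # mean intensities under each field
    s1, s0 = on.std(), off.std()        # population standard deviations
    if g1 == g0:
        return 0.0                      # no contrast between the two fields
    p1, p0 = 1.0, -1.0                  # mask values (assumed)
    ratio = (m / m0 * s1**2 + m / m1 * s0**2) / (g1 - g0) ** 2
    return float(np.sign(p1 - p0) * np.sign(g1 - g0) / np.sqrt(1.0 + ratio))


# Example: a mask with an "on" field, a "don't care" band, and an "off" field.
mask = np.zeros((6, 6), dtype=int)
mask[:2, :] = 1
mask[4:, :] = -1

# An ideal step between the fields; the band in between may hold any values.
step = np.full((6, 6), 50.0)
step[:2, :] = 200.0
step[2:4, :] = 125.0
c_step = masked_correlation(step, mask)
```

The “don’t care” band does not enter the statistics at all, so a blurred transition between facade and window that falls inside the band leaves the coefficient unchanged.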
Four corner types are assumed: upper left, upper right, lower 
right and lower left. Each type is correlated with the whole 
image and pixels which result in a correlation coefficient higher 
than our detection threshold are selected. The selected pixels are 
coded with the orientation attribute of primitive instances 
(“upper left”, “upper right”, “lower right” and “lower left”) and 
with their correlation coefficient c. Fig. 2 shows an exemplary corner 
detection result. For the red, green, blue and yellow 
pixels the correlation coefficient was higher than the detection 
threshold. Colours encode the orientation attribute. For a typical 
facade texture, in this case of 1024 x 524 pixels, some 20,000 such 
pixels remain. 
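
The detection step can be sketched as follows, under loudly stated assumptions: the mask layout (window interior as the “on” field, facade as the “off” field, an L-shaped “don’t care” band along the window edges), the mask size of 9 pixels, the threshold of 0.8, and all function names are hypothetical and not taken from the paper. The sketch slides one mask per corner type over the image and keeps every pixel whose correlation exceeds the threshold, together with its orientation attribute and coefficient.

```python
import numpy as np

KINDS = ("upper left", "upper right", "lower right", "lower left")


def corner_mask(kind, size=9, gap=1):
    """Hypothetical corner mask: +1 "on", -1 "off", 0 "don't care"."""
    h = size // 2
    mask = -np.ones((size, size), dtype=int)   # facade side: "off"
    mask[h - gap:, h - gap:] = 0               # band along the window edges
    mask[h + gap:, h + gap:] = 1               # window interior: "on"
    # The base layout is an upper-left window corner; flip for the others.
    if kind in ("upper right", "lower right"):
        mask = np.fliplr(mask)
    if kind in ("lower left", "lower right"):
        mask = np.flipud(mask)
    return mask


def correlation(patch, mask):
    """Pearson correlation over the "on" and "off" pixels only."""
    sel = mask != 0
    y = patch[sel].astype(float)
    if y.std() == 0:
        return 0.0                             # flat patch: no evidence
    return float(np.corrcoef(mask[sel].astype(float), y)[0, 1])


def detect_corners(image, threshold=0.8, size=9, gap=1):
    """Return (row, col, kind, c) for all pixels above the threshold."""
    h = size // 2
    hits = []
    for kind in KINDS:
        mask = corner_mask(kind, size, gap)
        for r in range(h, image.shape[0] - h):
            for c in range(h, image.shape[1] - h):
                patch = image[r - h:r + h + 1, c - h:c + h + 1]
                value = correlation(patch, mask)
                if value > threshold:
                    hits.append((r, c, kind, value))
    return hits


# Synthetic texture: one bright "window" on a darker "facade".
image = np.full((30, 30), 50.0)
image[10:20, 8:22] = 200.0
hits = detect_corners(image)
found = {(r, c, kind) for r, c, kind, _ in hits}
```

Even on this clean synthetic texture, several neighbouring pixels pass the threshold around each window corner. The explicit double loop is written for readability; a real texture of the size mentioned above would call for a vectorised or filter-based implementation.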
  
Figure 2. Corners extracted in a thermal facade texture: red - 
“upper left”, green - “upper right”, blue - “lower right” and 
yellow - “lower left” 
It is obvious that for each corner perceived by a human observer 
several such pixels cluster together. Following Marr’s principle 
of avoiding early decisions, these objects would be entered 
into the production system as primitives, using a corresponding 
clustering production as the first knowledge source. However, this 
  
  