XXII ISPRS Congress 2012: Technical Commission III

[SM 
1ina 
and more 
ition, and 
attention 
le. In our 
e placed. 
signs. In 
; saliency 
d into two 
ase, they 
transform 
on phase, 
rt vector 
eusen etc 
ey detect 
signs also 
tication'?l, 
to detect 
| scale of 
|a novel 
the 64- 
and 16- 
rm” Liu 
histogram 
ian visual 
tention at 
n we saw 
ency area 
according 
> saliency 
s such as 
earch etc. 
ience and 
people's 
| intensity 
computer 
lines like 
principle 
take into 
ally have 
the visual 
ection. In 
affic sign 
| to detect 
safety. In 
addition, a new method of generating a saliency map based on
visual contrast is also introduced.
2. METHODOLOGIES 
2.1 Two-way Integration Method of Target Detection
Modern physiological and psychological research on vision
shows that the visual process is an integrated process including
both bottom-up and top-down components, so our visual attention
model uses two-way integration. In the top-down phase, we choose
the Simple Vector Filter proposed by T. Asakura et al. as an a
priori guide. The filter extracts specific colours well, removes
the profile, and gives good segmentation results for red, blue and
yellow. In the bottom-up phase, we use a new saliency analysis
method based on visual contrast.
  
[Figure 1 (flowchart): input optical images -> visual attention
model (top-down analysis: color-based segmentation; bottom-up
attention: biologically [illegible] mechanism) -> candidate target
region area -> analysis and detection of task-related targets]
  
Fig.1. The detection flowchart of traffic sign 
The flowchart of our method is shown in Fig.1. Firstly, the
whole scene is analyzed by the visual attention model based
on two-way integration. We may acquire many salient areas,
i.e. candidate areas, because there are many objects in the scene
and they may have the same saliency as the task-related objects.
Secondly, these candidate areas are analyzed according to
the shape characteristics of the task-related objects to acquire the
needed target area.
2.2 Physiological background of saliency analysis
The retina responds strongly to high-contrast visual stimulation,
and both this response and the mechanism by which visual
information is generated in the primary visual cortex can be
simulated. Following cognitive neuroscience research, we propose
a method for generating a saliency map. The method has two
layers of computational units, which correspond to the simple
cells and complex cells in the primate primary visual cortex.
S Unit: The receptive field (RF) of the human retina responds
most strongly to high-contrast visual information, e.g., a light
center with a dark surround. This biological characteristic can
be simulated by a difference operation between a central
high-resolution layer and a surrounding low-resolution layer.
Primate primary visual cortex contains simple cells and
complex cells. Some studies suggest that the receptive field of
a simple cell covers only a small part of the visual field, so these
local units must be pooled together by the visual system in order
to perceive a target within the visual field. Complex cells are the
nonlinear spatiotemporal integration of simple cells. In this
research, we use contrast as the local saliency measure, i.e., the
contrast information provided by the retina is the local contrast
information generated by simple cells in the primary visual cortex.
This local contrast information is then integrated to form
the global information generated by complex cells.
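The S unit difference operation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 2x2 averaging used for down-sampling, the nearest-neighbour interpolation, the use of the absolute difference, and all function names are assumptions made for illustration only.

```python
def downsample2(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks (assumed scheme)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def upsample_to(img, height, width):
    """Nearest-neighbour interpolation of img to height x width (assumed scheme)."""
    h, w = len(img), len(img[0])
    return [[img[min(y * h // height, h - 1)][min(x * w // width, w - 1)]
             for x in range(width)] for y in range(height)]


def center_surround(center, surround):
    """Point-by-point absolute difference between the center layer and the
    surround layer interpolated to the size of the center layer."""
    h, w = len(center), len(center[0])
    up = upsample_to(surround, h, w)
    return [[abs(center[y][x] - up[y][x]) for x in range(w)] for y in range(h)]


# Hypothetical 4x4 test image: dark background with one bright pixel.
center = [[0.0] * 4 for _ in range(4)]
center[1][1] = 1.0
surround = downsample2(center)            # coarse, low-resolution layer
contrast = center_surround(center, surround)
```

In this toy input the bright pixel keeps the largest response in the local contrast map, while the uniform dark regions far from it give zero response.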
[Figure 2 (diagram): S units S_i pooled by Max{S_i}; example unit
values 0.2 and 0.4]
Fig.2. Flowchart of image attention analysis 
C Unit: C units are pooled from the S units. The pooling
model is the bridge between simple and complex cells in the
primate primary visual cortex. As shown in Fig.2, three basic
pooling models for integrating local units have been proposed
in previous work, i.e. the Maximum model, the Energy model and
the Half-wave model. Some experimental evidence in favor of the
max operation has already appeared, so we choose the
max model to pool from simple cells to complex cells.
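As a rough illustration of the three pooling models named above, the sketch below pools a list of S unit responses into a single C unit response. The text does not give formulas, so the energy model (square root of the sum of squares) and the half-wave model (sum of half-wave rectified responses) are written in commonly used forms that are assumptions here; only the max model is the one the method adopts.

```python
import math


def max_pool(responses):
    """Maximum model: the C unit takes the strongest S unit response."""
    return max(responses)


def energy_pool(responses):
    """Energy model (assumed form): square root of the sum of squared responses."""
    return math.sqrt(sum(r * r for r in responses))


def half_wave_pool(responses):
    """Half-wave model (assumed form): sum of half-wave rectified responses."""
    return sum(max(r, 0.0) for r in responses)


# Hypothetical S unit responses, using the values legible in Fig.2.
s_units = [0.2, 0.4, 0.2]
c_unit = max_pool(s_units)
```

With the max model, the pooled response depends only on the strongest local unit, so adding further weak responses does not change the result.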
  
[Figure 3 (flowchart): bottom-up visual attention: down-sampling ->
generating S1 unit (center-surround) -> generating C1 unit (max)]
  
Fig.3. Flowchart of image attention analysis 
Our approach is summarized in Fig.3. Within the workflow of image
attention analysis, an input image passes through two parts, the S unit
and the C unit, which correspond to simple and complex cells respectively.
After down-sampling and the center-surround operation, we obtain
the local contrast map in the S unit; the max model is then employed
to pool the local contrast map into the global contrast map in the C unit.
To further highlight the salient areas, the global contrast map is
smoothed with a Gaussian filter to obtain the saliency map.
Subsequently, generic threshold segmentation is used to detect the
object in the saliency map, where the threshold is three times the average
intensity of the saliency map.
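The last three stages of this workflow (max pooling, Gaussian smoothing, and thresholding at three times the mean) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the local contrast maps are assumed to be already resampled to a common size, the 3x3 binomial kernel stands in for a Gaussian filter whose size is not given in the text, border pixels are handled by replication, and the comparison is assumed to be strictly greater than the threshold. All names are hypothetical.

```python
# 3x3 binomial kernel as a small Gaussian approximation (assumed size); sums to 16.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def pool_max(maps):
    """Pixel-wise maximum over equally sized local contrast maps."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[max(m[y][x] for m in maps) for x in range(w)] for y in range(h)]


def smooth(img):
    """Convolve with KERNEL, replicating border pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def segment(saliency, factor=3.0):
    """Binary mask of pixels above factor times the mean saliency."""
    h, w = len(saliency), len(saliency[0])
    mean = sum(sum(row) for row in saliency) / (h * w)
    threshold = factor * mean
    return [[1 if v > threshold else 0 for v in row] for row in saliency]


def detect(maps):
    """Max pooling, smoothing, then threshold segmentation."""
    return segment(smooth(pool_max(maps)))


# Hypothetical input: two 8x8 contrast maps with a response at the same pixel.
map_a = [[0.0] * 8 for _ in range(8)]
map_b = [[0.0] * 8 for _ in range(8)]
map_a[3][3] = 1.0
map_b[3][3] = 0.5
mask = detect([map_a, map_b])
```

Because the threshold is tied to the mean intensity, a map with a few strong responses on a quiet background yields a compact mask around those responses.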
2.3 Computational step of saliency map 
To sum up, the proposed method of detecting salient objects in an
image has the following specific steps:
1) To generate an image pyramid. Down-sample the original
image I to create the Gaussian pyramid I(σ), where σ is the layer
of the image pyramid. The number of layers is set to 4, and thus σ ∈ [1..4].
The first layer is a quarter of the size of the original image and each
subsequent layer is half the size of the layer above it; for instance, the
size ratio of image I(4) to image I(1) is 1/8.
2) To generate S unit. We apply the center-surround operation to
the 4-layer images in the pyramid and use the result as the visual
information of the S unit. With a “center” fine scale c ∈ {1,2,3} and a
“surround” coarser scale s ∈ {2,3,4} (s = c + δ, δ = 1), the surround
layer s is interpolated to the scale of the central layer c, and then the
point-by-point difference operation is used to get three difference