Full text: Technical Commission III (B3)

and a 20% false alarm rate using HOG features. Future work
will explore different combinations of features, SVM parameters
and kernel functions to maximise the correct detection rate while
minimising false alarms.
2.5 Range image based features 
Each pixel of the high resolution intensity images has a range
value co-registered with it. This allows a range map of the
intensity image to be calculated, which is used as an additional
feature in the learning process described in section 2.4. The
range map is also used to segment the intensity image, as objects
of interest (e.g. street furniture) are expected to lie within a
certain distance of the camera (the position of the camera is
known a priori). The range map is thresholded to create a mask,
which is combined with the saliency response map created by any
of the other methods to further reduce the area of the image
passed to the next stage of processing. Figure 5 shows how the
range map is combined with an edge-based saliency map to produce
a final segmentation of the image.
Figure 5: Range based segmentation using simple thresholding on
range map. (a) Range map. (b) Mask applied to original image.
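The thresholding and masking step described in this section can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the use of plain nested lists for images, the 15 m cut-off and the sample values are all assumptions made for the example.

```python
def range_mask(range_map, max_range, min_range=0.0):
    """Return a binary mask: 1 where min_range <= range <= max_range, else 0.

    range_map is a list of rows of range values (assumed to be in metres)
    co-registered pixel-for-pixel with the intensity image.
    """
    return [[1 if min_range <= r <= max_range else 0 for r in row]
            for row in range_map]


def apply_mask(saliency_map, mask):
    """Zero out saliency responses that fall outside the range mask."""
    return [[s * m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(saliency_map, mask)]


# Hypothetical 2x3 range map and saliency response map.
range_map = [[2.0, 5.0, 40.0],
             [3.5, 12.0, 60.0]]
saliency = [[0.9, 0.1, 0.8],
            [0.4, 0.7, 0.6]]

mask = range_mask(range_map, max_range=15.0)  # assumed 15 m cut-off
segmented = apply_mask(saliency, mask)
print(mask)
print(segmented)
```

Pixels in the third column lie beyond the assumed cut-off, so their saliency responses are suppressed before the next stage of processing.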
3 OBJECT DETECTION 
The aim of an object detection system is to identify the
category/class and location in an image of one or more objects
of interest. The solution requires that the system internally
represents models of the categories of object to be identified,
so that it can compare these models to locations in previously
unseen images in order to identify when and if an instance of
that model (an object) is present. Ideally one model should
enable recognition of all objects in a category, and be robust
both to the great variation of objects possible within a given
category and to the great variation in how these objects may
appear in an image (different viewpoints, different scales,
varying lighting conditions). This means that the system must
minimise the false negative detection rate. In addition, each
model must be distinctive enough to preclude the possibility of
confusing an instance of one model/class for another (or the
null class representing no object). This is equivalent to
minimising the false positive detection rate.
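The two error rates named above can be stated concretely in terms of detection counts. The sketch below uses the standard definitions; the assumption that counts are made per candidate window, the function name and the example counts are illustrative and do not come from the paper.

```python
def detection_rates(tp, fp, fn, tn):
    """Return (false_negative_rate, false_positive_rate).

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives over the candidate windows.
    """
    positives = tp + fn   # windows that really contain the object
    negatives = fp + tn   # windows that really contain no object
    fnr = fn / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return fnr, fpr


# Hypothetical counts: 100 object windows and 100 background windows.
fnr, fpr = detection_rates(tp=80, fp=10, fn=20, tn=90)
print(fnr, fpr)
```

A model that generalises well across a category drives the first number down, while a distinctive model drives the second number down.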
The presence of range data with mobile mapping data should
theoretically allow for more accurate object identification
because of the extra information available. In this paper, range
information has been used to help segment the image into regions
more likely to contain high-level features of interest. This
range information can be further used to calculate geometric
properties of the images and their content.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August – 01 September 2012, Melbourne, Australia
3.1 Object geometry
The identification of object edges, lines and corners can be used
to infer the presence of straight lines or other geometric shapes
in the image using feature extractors such as the Hough
transform (Duda and Hart, 1972). If found together in non-random
configurations, line features may be combined to form perceptual
groups (Lowe, 1985). Once an object has been detected, such
perceptual groups become doubly useful for the problem of object
pose estimation. Figure 6 shows extraction of line information
from a 3-D Earthmine image. These lines are first detected in
the 2-D intensity image using a probabilistic version of the
Hough transform to find line segments. Each point along a
detected line is then queried against the co-registered range
data. A line found in the 2-D image is rejected if the range
along its length does not vary linearly. To allow for noise in
the range information, a parameter specifies the degree of
allowed range variation along the length of the line. The range
points are fitted to the 2-D lines using standard linear
regression and the end-points of the lines are determined. It is
possible to discriminate between edge-type lines and
intensity-based lines by querying the linearity (in range) of
short lines orthogonal to and crossing the detected line. Though
providing quite a coarse estimation of scene geometry, these
lines can later be used when comparing the geometric model of a
learned class with detected objects to better approximate their
locations in space.
Figure 6: Line segments detected via Hough transform projected
into 3-D space using linear regression in range. (a) Hough lines
in original intensity image. (b) Line segments projected into
space via linear regression in range.
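The range-linearity test applied to each detected line can be sketched as below. The paper does not say how the noise parameter is applied, so this sketch assumes it is a bound on the largest absolute residual of a least-squares fit, and that range samples are equally spaced along the 2-D line. All names and sample values are hypothetical.

```python
def fit_range_along_line(ranges):
    """Least-squares fit r = slope * t + intercept, with t = 0..n-1
    the sample index along the detected 2-D line."""
    n = len(ranges)
    t_mean = (n - 1) / 2
    r_mean = sum(ranges) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_tr = sum((t - t_mean) * (r - r_mean) for t, r in enumerate(ranges))
    slope = s_tr / s_tt
    intercept = r_mean - slope * t_mean
    return slope, intercept


def is_linear_in_range(ranges, tolerance):
    """Accept the line if no range sample deviates from the fitted
    line by more than tolerance (assumed interpretation of the
    allowed-variation parameter)."""
    if len(ranges) < 2:
        return False
    slope, intercept = fit_range_along_line(ranges)
    worst = max(abs(r - (slope * t + intercept))
                for t, r in enumerate(ranges))
    return worst <= tolerance


def fitted_end_ranges(ranges):
    """Range values of the two line end-points taken from the fit."""
    slope, intercept = fit_range_along_line(ranges)
    return intercept, slope * (len(ranges) - 1) + intercept


# Hypothetical samples: a physical edge receding smoothly from the
# camera, and a line that crosses a depth discontinuity.
edge = [10.0, 10.5, 11.1, 11.4, 12.0]
spurious = [10.0, 10.4, 25.0, 11.2, 11.6]
print(is_linear_in_range(edge, tolerance=0.2))
print(is_linear_in_range(spurious, tolerance=0.2))
print(fitted_end_ranges(edge))
```

The same test, applied to short lines orthogonal to a detected line, is one way the edge-type versus intensity-based distinction mentioned above might be made.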
3.2 Modelling Schema 
Many objects (such as people or animals) are highly articulated
and any model of their appearance or geometry must be able to
cater for the wide range of pose variation intrinsic to these
types of object. Non-natural objects often have fewer
individually movable components, and there is far less variation
in how adjacent parts of the same object appear in relation to
one another.
Methods based on pictorial structures (Felzenszwalb and
Huttenlocher, 2005) and deformable parts-based models
(Felzenszwalb et al., 2010) have demonstrated success in
detecting objects even when viewed from an unusual viewpoint, or
when their parts are obscured due to occlusion by other scene
elements or their location at the edge of an image. A model is a
hierarchy of parts in which each part is the child of a root
part whose features are computed at half the resolution of the
part features. The placement of each part in the model is
conditionally independent of its sibling parts given its root.
Figure 7 shows an example of a deformable parts model for the
side and front view of a car using HOG features.
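Because each part is placed independently of its siblings given the root, the best placement can be searched for one part at a time and the results summed. The sketch below illustrates this for a single root location. It assumes a quadratic displacement penalty in the spirit of Felzenszwalb et al. (2010) and ignores the resolution difference between root and parts. The function names, the `deform_weight` and `max_shift` parameters and the response values are hypothetical.

```python
def best_part_placement(response, anchor, deform_weight, max_shift=1):
    """Return (score, (row, col)) of the best placement of one part
    within max_shift cells of its anchor (anchor must lie inside the
    response map).

    score = filter response - deform_weight * squared displacement.
    """
    rows, cols = len(response), len(response[0])
    a_row, a_col = anchor
    best = None
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            r, c = a_row + d_row, a_col + d_col
            if 0 <= r < rows and 0 <= c < cols:
                cost = deform_weight * (d_row ** 2 + d_col ** 2)
                score = response[r][c] - cost
                if best is None or score > best[0]:
                    best = (score, (r, c))
    return best


def model_score(root_score, part_responses, anchors,
                deform_weight=0.5, max_shift=1):
    """Sum the root score and the independently optimised part scores."""
    total = root_score
    placements = []
    for response, anchor in zip(part_responses, anchors):
        score, position = best_part_placement(
            response, anchor, deform_weight, max_shift)
        total += score
        placements.append(position)
    return total, placements


# Hypothetical filter responses for two parts of a car model.
wheel = [[0.1, 0.2, 0.1],
         [0.3, 0.5, 2.0],
         [0.1, 0.2, 0.1]]
window = [[0.0, 0.1],
          [0.2, 1.0]]
total, placements = model_score(1.0, [wheel, window], [(1, 1), (1, 1)])
print(total, placements)
```

Here the wheel part shifts one cell to the right of its anchor because the stronger response there outweighs the displacement penalty, while the window part stays at its anchor.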
3.3 Detection Method 
Modelling of independent object parts using HOG features has
been used in this paper to detect cars in intensity images using
a variant of the approach described by Felzenszwalb et al. (2010).