Full text: Technical Commission III (B3)

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012 
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia 
INTENSITY AND RANGE IMAGE BASED FEATURES FOR OBJECT DETECTION IN 
MOBILE MAPPING DATA 
Richard Palmer¹, Michael Borck¹, Geoff West¹ and Tele Tan²
¹ Department of Spatial Sciences, ² Department of Computing
Curtin University, GPO Box U1987, Perth 6845, Western Australia 
{r.palmer, michael.borck}@postgrad.curtin.edu.au, {g.west, t.tan}@curtin.edu.au
Cooperative Research Centre for Spatial Information 
Commission III/4
KEY WORDS: low-level features, image processing, point clouds, mobile and terrestrial mapping, 3-D features, 2-D features 
ABSTRACT: 
Mobile mapping is used for asset management, change detection, surveying and dimensional analysis. There is a great desire to 
automate these processes given the very large amounts of data, especially when 3-D point cloud data is combined with co-registered 
imagery - termed “3-D images”. One approach requires low-level feature extraction from the images and point cloud data followed 
by pattern recognition and machine learning techniques to recognise the various high level features (or objects) in the images. This 
paper covers low-level feature analysis and investigates a number of different feature extraction methods for their usefulness. The 
features of interest include those based on the “bag of words” concept in which many low-level features are used e.g. histograms of 
gradients, as well as those describing the saliency (how unusual a region of the image is). These mainly image based features have 
been adapted to deal with 3-D images. The performance of the various features is discussed for typical mobile mapping scenarios and
recommendations are made as to the best features to use.
1 INTRODUCTION 
Laser scanning is currently the preferred method for the collection
of surveying/mapping data, but increasingly this is being augmented
by 2-D imaging cameras. Co-registration of 2-D colour intensity maps
collected from standard cameras with range measurements collected by
laser scanners results in the creation of 3-D images: 2-D images in
which every pixel has an associated range value. Recently, mapping
systems based on stereoscopic imaging techniques have been used to
produce similar 3-D images at the expense of reduced accuracy in
range. The increasing use of mobile mapping systems based around such
technology is resulting in the creation of very large amounts of
data; mobile mapping systems operating along roads in urban centres
typically collect full 360 degree panoramas every five or ten metres
along the vehicle track. These datasets are very useful for a range
of content analysis applications, but the speed of analysis is
severely limited by the amount of costly and impractical manual
processing needed to identify interesting features. There is a great
need to improve the automated detection of content that is of
interest to the user, so that a large proportion of time is not
wasted looking through irrelevant data.
Processing data for the automatic identification of features or
objects of interest is a core focus of computer vision research.
Research has focussed on the analysis of very large cohorts of images
because many people and organisations produce and share images, and
these must often be indexed and organised according to content.
Websites such as Flickr (http://www.flickr.com) and Picasa
(http://picasa.google.com), and the need to search the Web for images
having specific content, mean that many millions of images must be
processed. Mobile mapping imagery requires similar processing to
discover content for use in application areas such as asset
management, change detection, surveying and dimensionality analysis.
Mobile mapping data is distinct from regular 2-D imagery because of
the availability of co-registered range information. This extra
modality presents an interesting avenue for research because it
offers the possibility of significantly increasing the speed and
accuracy of existing 2-D image based feature detection methods.
Research into object detection has produced a large number of novel
approaches to feature detection. The performance of features
extracted from imagery is evaluated for a particular object detection
task. This requires a task driven approach to the evaluation of
features: first the type of object to be detected in the imagery is
identified, and then the accuracy of the object detection system that
uses these features is measured on those objects. Typically this
requires a large amount of imagery with ground-truthed bounds defined
around the objects to be detected. While much intensity imagery of
this kind exists (e.g. the PASCAL Visual Object Classes Challenge
(Everingham et al., 2010)), there are no similar commonly available
3-D or range image datasets.
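As an illustration of how a detection might be scored against a ground-truthed bound, the sketch below uses the intersection-over-union overlap measure, with the 0.5 threshold used by the PASCAL VOC detection task. The text above does not state which measure the authors use, so the measure, the function names and the (x_min, y_min, x_max, y_max) box layout are assumptions made for this example only.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an
    assumption of the sketch, not something the paper specifies.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(detected, truth, threshold=0.5):
    """Count a detection as correct when its overlap exceeds the threshold."""
    return box_iou(detected, truth) > threshold
```

Counting correct and incorrect detections over a ground-truthed image set in this way gives one possible accuracy figure for a feature and detector combination.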
For the purposes of this research, a dataset from Earthmine was used,
consisting of a sequence of panoramas taken approximately every ten
metres along the road within the Perth CBD, Western Australia. Each
panorama consists of eight images projected onto the inside of a cube
centred on the imaging camera array mounted on the mapping vehicle.
Within the high resolution images, each colour pixel has
co-registered against it the real world latitude, longitude and
elevation at that point. From these, 3-D images can be generated that
specify the colour and range of each pixel in the image.
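A minimal sketch of how such a 3-D image might be assembled from per-pixel coordinates is given below. It assumes the coordinates are WGS84 latitude and longitude in degrees with elevation as height above the ellipsoid in metres, and that the camera position is known in the same form. The Earthmine data format is not described here, so these assumptions, the function names and the array layout are illustrative only.

```python
import numpy as np

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                      # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563            # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)     # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic coordinates to Earth-centred Cartesian metres.

    Assumes height_m is height above the WGS84 ellipsoid.
    Inputs may be scalars or arrays of equal shape.
    """
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    n = WGS84_A / np.sqrt(1.0 - WGS84_E2 * np.sin(lat) ** 2)
    x = (n + height_m) * np.cos(lat) * np.cos(lon)
    y = (n + height_m) * np.cos(lat) * np.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * np.sin(lat)
    return np.stack([x, y, z], axis=-1)


def make_3d_image(rgb, lat_deg, lon_deg, height_m, camera_llh):
    """Build an H x W x 4 array holding colour plus range per pixel.

    rgb is H x W x 3; lat_deg, lon_deg and height_m are H x W;
    camera_llh is a hypothetical (lat, lon, height) tuple for the camera.
    """
    pixel_xyz = geodetic_to_ecef(lat_deg, lon_deg, height_m)   # H x W x 3
    camera_xyz = geodetic_to_ecef(*camera_llh)                 # 3
    rng = np.linalg.norm(pixel_xyz - camera_xyz, axis=-1)      # H x W, metres
    return np.dstack([rgb.astype(np.float64), rng])
```

Storing range as a fourth channel keeps colour and range aligned pixel for pixel, so 2-D image operators can be applied to either channel without resampling.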
Range image data has been used for object representation and to
establish correspondences between an object’s geometric model (e.g.
derived from a generic CAD model of the object) and the object’s
representation in the range imagery (Arman et al., 1993), (Lavva et
al., 2008), (Steder et al., 2009). However, due to the complexity and
slowness of matching spatial models in range imagery, and the wide
availability of intensity imagery, research has favoured extracting
the appearance of an object to encode its discriminative qualities.
In intensity images, keypoints or interest points have been proposed
such as Harris keypoints (Harris and Stephens, 1988), SIFT (Lowe,
2004), SURF (Bay et al., 2006), and FAST (Rosten, 2006); blob
detectors such as Maximally Stable Extremal Regions (MSER) (Matas et
al., 2002),
   
	        