Full text: Technical Commission III (B3)
ACCURACY TEST OF MICROSOFT KINECT FOR HUMAN MORPHOLOGIC 
MEASUREMENTS 
B. Molnár a, C. K. Toth b, A. Detrekói a 
a Department of Photogrammetry and Geoinformatics 
Budapest University of Technology and Economics, Műegyetem rkp. 3., Budapest, H-1111, Hungary - 
molnar.bence@fmt.bme.hu 
b The Center for Mapping, The Ohio State University 
470 Hitchcock Hall, 2070 Neil Avenue, Columbus, OH 43210 - toth@cfm.ohio-state.edu 
Commission TCs III and V 
KEY WORDS: Flash LiDAR, MS Kinect, point cloud, accuracy 
ABSTRACT: 
The Microsoft Kinect sensor, a popular game console accessory, is widely used in a large number of applications, including close-range 3D 
measurements. This low-end device is rather inexpensive compared to similar active imaging systems. The Kinect includes an 
RGB camera, an IR projector, an IR camera and an audio unit. Human morphologic measurements require high accuracy at a 
fast data acquisition rate. To achieve the highest accuracy, the depth sensor and the RGB camera should be calibrated and 
co-registered, yielding a high-quality 3D point cloud as well as optical imagery. Since this is a low-end sensor developed for a different 
purpose, its accuracy could be critical for 3D measurement-based applications. Therefore, two types of accuracy test are performed: 
(1) to describe the absolute accuracy, the ranging accuracy of the device in the range of 0.4 to 15 m is estimated, and (2) 
the relative accuracy of points as a function of range is characterized. For the accuracy investigation, a test field was 
created with two spheres; the relative accuracy is described by the sphere-fitting performance and by the estimated distance between 
the sphere center points. Other factors can also be considered, such as the angle of incidence or the material used in these tests. 
The non-ambiguity range of the sensor is from 0.3 to 4 m, but, based on our experience, it can be extended up to 20 m. 
Such an extension raises accuracy issues, which makes accuracy testing all the more important. 
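The relative accuracy measures named in the abstract (sphere-fitting performance and the distance between the two sphere centers) can be illustrated with a minimal sketch. It assumes an algebraic least-squares sphere fit using NumPy; the text does not state which fitting method the authors used, and the function names `fit_sphere` and `center_distance` are hypothetical.

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit (an assumed method, not
    necessarily the authors'). Returns (center, radius, rms_residual)."""
    pts = np.asarray(points, dtype=float)
    # |p|^2 = 2 c.p + (r^2 - |c|^2) is linear in the unknowns (c, k).
    A = np.hstack([2.0 * pts, np.ones((len(pts), 1))])
    b = np.sum(pts ** 2, axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    # Radial residuals describe how well the points fit the sphere.
    residuals = np.linalg.norm(pts - center, axis=1) - radius
    rms = np.sqrt(np.mean(residuals ** 2))
    return center, radius, rms

def center_distance(points_a, points_b):
    """Distance between the fitted centers of two sphere point sets."""
    center_a, _, _ = fit_sphere(points_a)
    center_b, _, _ = fit_sphere(points_b)
    return np.linalg.norm(center_a - center_b)
```

Comparing the RMS residual and the estimated center distance against reference values, at several ranges, would give one possible range-dependent accuracy figure.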
1. INTRODUCTION 
Superior performance and efficiency have made 
laser scanning systems the primary source for 3D measurements. 
The main LiDAR methods are well explained by Shan and Toth 
(Shan and Toth, 2008). The two typical LiDAR platforms are 
airborne and terrestrial (TLS) laser scanning (Vosselman, 2010), 
though mobile LiDAR (MLS) is gaining rapid acceptance. 
These methods use pulse-based technology with discrete 
return detection or, more recently, waveform recording. For 
close-range scanning, Flash LiDAR is increasingly used. This 
technology is based on a sensor array, which makes it possible 
to measure multiple ranges at the same time. The range of the 
captured depth image is mainly limited by the emitted 
pulse power. The frequency is also somewhat limited for 
eye-safety and technological reasons. For example, an early 
Flash LiDAR model, the SwissRanger SR-3000 (Kahlmann et al., 
2006), is based on the continuous-wave (CW) approach, offering 
an operating range of up to 7.5 m and a frame rate of 15 Hz. 
The newer PMD[vision] CamCube 2.0 has a range of 0.4 to 7 m 
and a frame rate of 25 fps (PMD). 
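For a CW range camera, the operating range is bounded by the non-ambiguity distance d = c / (2 f), where f is the modulation frequency. The sketch below evaluates this standard relation. The 20 MHz value in the usage note is an assumption chosen because it reproduces the quoted range of about 7.5 m; it is not given in the text.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def unambiguous_range(mod_freq_hz):
    """Non-ambiguity range of a CW range camera: d = c / (2 f)."""
    return SPEED_OF_LIGHT / (2.0 * mod_freq_hz)

def modulation_frequency(range_m):
    """Modulation frequency that gives the requested non-ambiguity range."""
    return SPEED_OF_LIGHT / (2.0 * range_m)
```

With an assumed modulation frequency of 20 MHz, `unambiguous_range(20e6)` evaluates to roughly 7.49 m.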
Successful facial reconstruction requires an appropriate model 
of the human face. Therefore, a wide range of data collection 
procedures have been developed, mostly based on 
photogrammetry (Schrott et al., 2008). Flash LiDAR is a good 
alternative for gathering surface points. In addition, it offers 
fast data acquisition. The post-processing and model creation, 
however, require some specific knowledge, as the human face 
has special surface conditions (Aoki et al., 2000). The 
developed model provides a good basis for plastic surgery. 
2. MICROSOFT KINECT SENSOR 
The Kinect sensor is a motion sensing input device for the 
Xbox 360 video game console, originally developed by 
PrimeSense (PrimeSense), and acquired by Microsoft. Its 
primary purpose is to enable users to control and interact with 
the Xbox 360 through a natural user interface using gestures 
and spoken commands, without the need to touch a game 
controller at all. The Kinect has three primary sensors: a Flash 
LiDAR (3D camera), a conventional optical RGB sensor (2D 
camera), and a microphone array. The device is 
USB-interfaced, similar to a webcam, and appears as a "black box" 
to the users. 
Very little is known of the sensors, internal components and 
processing methods stored in the firmware. The laser (IR) 
emitter projects a structured light pattern of random points to 
support 3D recovery. The 2D camera can acquire standard 
VGA (640x480) and SXGA (1280x1024) images at 30 Hz. 
Color is formed with a Bayer filter; images are transmitted in 
32-bit and formatted in the sRGB color space. The FOV of the 
2D camera is 57° x 43°. The 3D camera can work in two 
resolutions, with frame sizes of 640x480 and 320x240, 
respectively. The range data comes in 12-bit resolution. The 
sensors' spatial relationship is shown in Figure 1. The 
distance between the laser emitter and the detector, which 
form a stereo pair, is about 7.96 cm, and the baseline between the 
2D and 3D cameras is about 2.5 cm. 
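The quoted geometry can be turned into numbers with a minimal sketch. It derives a pinhole focal length in pixels from the stated field of view and image size, back-projects a depth pixel to a 3D point, and applies the textbook stereo relation z = f b / d using the stated emitter-detector baseline. Several things here are assumptions rather than facts from the text: an ideal pinhole model without lens distortion, a principal point at the image center, the depth camera sharing the 2D camera's FOV, and all function names. The firmware's actual depth computation is not documented.

```python
import math

def focal_length_px(size_px, fov_deg):
    """Pinhole focal length in pixels from image size and full FOV."""
    return (size_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

def backproject(u, v, z_m, width=640, height=480,
                fov_h_deg=57.0, fov_v_deg=43.0):
    """3D point (x, y, z) in meters for pixel (u, v) at depth z_m.
    Assumes an ideal pinhole camera with a centered principal point."""
    fx = focal_length_px(width, fov_h_deg)
    fy = focal_length_px(height, fov_v_deg)
    x = (u - width / 2.0) * z_m / fx
    y = (v - height / 2.0) * z_m / fy
    return x, y, z_m

def depth_from_disparity(disparity_px, focal_px, baseline_m=0.0796):
    """Textbook stereo triangulation z = f * b / d (illustrative only)."""
    return focal_px * baseline_m / disparity_px
```

For a 640-pixel-wide image and a 57° horizontal FOV, `focal_length_px(640, 57.0)` is roughly 589 pixels. The vertical value differs slightly because the quoted FOV figures are rounded.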