3. SENSOR CHARACTERISTICS
While Microsoft were the first to introduce a NUI sensor at a
consumer price to the mass market, it is important to understand
that they did not develop the sensor completely on their own.
The Kinect is a complex combination of software for gesture
recognition, sound processing for locating the user's voice and
a 3D sensor for capturing the user. The actual 3D sensor
contained in the Kinect is based on a system developed by
PrimeSense and implemented in a system on a chip (SoC), which
PrimeSense market under the name PS1080.
PrimeSense have licensed this technology to other manufacturers
as well, among them ASUS and Lenovo. PrimeSense-based sensors
are currently available in several products or packages: the
PrimeSense Developer Kit, the Microsoft Kinect and the ASUS
Xtion (see Figure 1).
PrimeSense describe their 3D sensor technology as
"LightCoding", in which the scene volume is coded by near
infrared light. Although internal details are not available,
the system can be characterized as an active triangulation
system using fixed pattern projection. The fixed pattern is a
speckle dot pattern generated by a near infrared laser diode.
The triangulation baseline between projector and camera is
approximately 75 mm.
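
To illustrate what the short baseline implies, the following is a
minimal sketch of the textbook triangulation relation Z = f * b / d
for a rectified projector-camera pair. It is a generic model, not
PrimeSense's actual processing, whose internals are not public. The
focal length is an assumption derived from the 58° horizontal field
of view and the 640-pixel image width for an ideal pinhole camera,
and the 0.1 pixel disparity error is a purely hypothetical value.

```python
import math

BASELINE_M = 0.075      # projector-camera baseline stated in the text
H_FOV_DEG = 58.0        # horizontal field of view (Table 1)
IMAGE_WIDTH_PX = 640    # VGA depth image width (Table 1)


def focal_length_px(h_fov_deg=H_FOV_DEG, width_px=IMAGE_WIDTH_PX):
    """Focal length in pixels, ASSUMING an ideal pinhole camera."""
    return (width_px / 2.0) / math.tan(math.radians(h_fov_deg) / 2.0)


def depth_from_disparity(disparity_px, baseline_m=BASELINE_M):
    """Depth Z = f * b / d of the generic triangulation model."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px() * baseline_m / disparity_px


def depth_error(depth_m, disparity_error_px, baseline_m=BASELINE_M):
    """First-order depth error: dZ = Z^2 / (f * b) * dd."""
    return depth_m ** 2 / (focal_length_px() * baseline_m) * disparity_error_px


if __name__ == "__main__":
    # Hypothetical disparity error of 0.1 pixel, for illustration only.
    for z in (0.8, 1.2, 2.0, 3.5):
        print(f"Z = {z:.1f} m -> dZ = {depth_error(z, 0.1) * 1000:.1f} mm")
```

In this model the depth error grows with the square of the distance
and inversely with the baseline, which is why a short base limits
the range over which depth can be measured reliably.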
PrimeSense give only a few specifications for their reference
sensor design; these are listed in Table 1. Notably, no accuracy
figure for the z direction is given. Such performance criteria
have to be established in dedicated tests, which we will come to
in the next section. One of the specifications given, however, is
quite striking. While the point sampling of a single depth frame
is quite low at only VGA resolution (640 x 480), this number has
to be seen in relation to the frame rate. If we multiply the
number of points in a single frame by the frame rate of 30 frames
per second, we obtain a sampling rate of 9,216,000 points per
second. This outperforms current terrestrial laser scanners by an
order of magnitude.

Figure 1. Overview of different products available using
PrimeSense's 3D NUI sensor technology.

Field of View (Horizontal, Vertical, Diagonal)        58° H, 45° V, 70° D
Depth Image Size                                      VGA (640x480)
Operation range                                       0.8 m - 3.5 m
Spatial x/y resolution (@ 2 m distance from sensor)   3 mm
Maximal image throughput (frame rate)                 [illegible]

Table 1. Specifications of the PrimeSense reference sensor
design as given by the manufacturer.
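
The sampling rate quoted above can be reproduced with a few lines of
arithmetic. The sketch below also estimates the lateral footprint of
one pixel at 2 m, assuming an ideal pinhole camera with the 58°
horizontal field of view spread evenly over 640 pixels. That
assumption gives roughly 3.5 mm, the same order as the 3 mm listed in
Table 1. The function names are illustrative only.

```python
import math

WIDTH_PX, HEIGHT_PX = 640, 480   # VGA depth image (Table 1)
FRAME_RATE_HZ = 30               # frame rate used in the text
H_FOV_DEG = 58.0                 # horizontal field of view (Table 1)


def points_per_second(width_px=WIDTH_PX, height_px=HEIGHT_PX,
                      frame_rate_hz=FRAME_RATE_HZ):
    """Number of depth samples delivered per second."""
    return width_px * height_px * frame_rate_hz


def pixel_footprint_m(distance_m, fov_deg=H_FOV_DEG, pixels=WIDTH_PX):
    """Approximate lateral size of one pixel at a given distance.

    ASSUMES an ideal pinhole camera without lens distortion.
    """
    covered = 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)
    return covered / pixels


if __name__ == "__main__":
    print(points_per_second())                      # 9216000
    print(f"{pixel_footprint_m(2.0) * 1000:.2f} mm")
```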
The maximum opening angle of the sensor, which determines the
field of view, is 58° in one direction. Simple geometry shows
that covering the full body of an average-sized person from one
side requires a stand-off distance of approximately 1.8 m. This
has two consequences. Firstly, if we want coverage from all
sides, for example from four sides, a multi-sensor system would
have a very large footprint. Secondly, we must also take a basic
photogrammetric rule of thumb into consideration.
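
The stand-off distance follows from the opening angle by elementary
trigonometry. The sketch below assumes that the 58° opening angle is
aligned with the person's height and that an extent of 2.0 m has to
be covered (body height plus some margin). Both the 2.0 m extent and
the function names are assumptions made for illustration.

```python
import math


def stand_off_distance(extent_m, fov_deg):
    """Distance at which an opening angle fov_deg covers extent_m."""
    return extent_m / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


def covered_extent(distance_m, fov_deg):
    """Extent covered by an opening angle fov_deg at distance_m."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


if __name__ == "__main__":
    # Assumed extent of 2.0 m: an average person plus margin.
    print(f"{stand_off_distance(2.0, 58.0):.2f} m")   # 1.80 m
    print(f"{covered_extent(1.8, 58.0):.2f} m")       # 2.00 m
```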
As mentioned above, the triangulation base is only 75 mm, which
naturally limits the distance that can be measured reliably. An
exact limit for the base-to-height ratio cannot be given in the
general case and needs to be established on a case-by-case
basis. As a rule of thumb, however, we can assume that the
base-to-height ratio should not fall below 1:16 (refer for
example to Waldhäusl and Ogleby, 1994 or Luhmann, 2000). With
the given base of 75 mm we should therefore not exceed a
distance of approximately 1.2 m.
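
The rule of thumb translates directly into a maximum working
distance. The short sketch below reproduces the 1.2 m limit and
shows the base-to-height ratio that results at the 1.8 m stand-off
distance discussed earlier. The function names are illustrative.

```python
BASE_M = 0.075              # triangulation base from the text
MIN_RATIO_DENOMINATOR = 16  # rule of thumb: base-to-height >= 1:16


def max_distance(base_m=BASE_M, denominator=MIN_RATIO_DENOMINATOR):
    """Largest distance that still satisfies the 1:denominator rule."""
    return base_m * denominator


def ratio_denominator(distance_m, base_m=BASE_M):
    """Denominator n of the base-to-height ratio 1:n at distance_m."""
    return distance_m / base_m


if __name__ == "__main__":
    print(f"max distance: {max_distance():.1f} m")           # 1.2 m
    print(f"ratio at 1.8 m: 1:{ratio_denominator(1.8):.0f}")  # 1:24
```

At 1.8 m the ratio is 1:24, well below the 1:16 rule of thumb.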
Figure 2. Graphical comparison of one sensor at a larger stand-
off distance (top) or two sensors on top of each other
at a shorter stand-off distance (bottom).
Figure 4. [Plot; axis label: Distance (mm)]