3D shape alignment, and ICP has been shown to be effective when the two point clouds are already nearly aligned. Since the two frames have been coarsely aligned by SIFT+RANSAC, the prerequisites of the ICP algorithm are satisfied. To generate more accurate alignments than point-to-point ICP, an ICP variant based on a point-to-plane error metric has been shown to improve convergence rates and is the preferred algorithm when surface normal measurements are available (Rusinkiewicz, 2002; Segal, 2009). In the previous section we mentioned that the virtual depth image is more accurate and less noisy than the raw depth image, so in this step point-to-plane ICP is applied between the virtual depth image and the current RGB-D frame.
In the first iteration of the ICP algorithm, (R, T) is initialized by the SIFT+RANSAC match. When the point-to-plane error metric is used, the objective of minimization is the sum of the squared distances between each source point and the tangent plane at its corresponding destination point. More specifically, if $s_i = (s_{ix}, s_{iy}, s_{iz}, 1)^T$ is a source point, $d_i = (d_{ix}, d_{iy}, d_{iz}, 1)^T$ is the corresponding destination point, and $n_i = (n_{ix}, n_{iy}, n_{iz}, 1)^T$ is the unit normal vector at $d_i$, then the goal of each ICP iteration is to find $(R_{opt}, T_{opt})$ such that (Low, 2004)

$$(R_{opt}, T_{opt}) = \arg\min_{(R, T)} \sum_i \big( ((R, T) \cdot s_i - d_i) \cdot n_i \big)^2 \qquad (4)$$
After the registration of the 3D point clouds, the final transformation $(\hat{R}, \hat{T})$ is computed.
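As a concrete illustration, this refinement step can be sketched with an off-the-shelf point cloud library. The snippet below uses Open3D, which is our assumption for illustration (the paper does not name an implementation); `source`, `target`, and `init_transform` stand for the current frame's cloud, the cloud from the virtual depth image, and the 4x4 SIFT+RANSAC initialization, respectively.

```python
import open3d as o3d

def refine_alignment(source, target, init_transform, max_dist=0.05):
    """Point-to-plane ICP refinement, initialized by the coarse match."""
    # Point-to-plane ICP needs surface normals on the target cloud.
    target.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        source, target, max_dist, init_transform,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation  # refined (R, T) as a 4x4 matrix
```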
4.3.3 Color Similarity Measurement 
The framework above utilizes only a small fraction of the pixels in the depth image, namely those corresponding to SIFT features. It is assumed that if $(\hat{R}, \hat{T})$ is applied to a frame pair, the common areas should overlap perfectly. However, the rigid transformation may be unreliable under difficult circumstances, so this does not always hold in practice. To compute color similarity, we choose a set of points from the RGB image including all SIFT feature points and some other visual features such as Harris corner features. SIFT features are often located at object edges, where point clouds are not sensitive, so we put larger weight on the pixels corresponding to SIFT features in the color similarity measurement.
Every feature point carries location, gradient magnitude, and orientation information. For each image sample L(x, y), the gradient magnitude m(x, y) is precomputed using pixel differences to produce the weight W(x, y):

$$m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^2 + \big(L(x, y+1) - L(x, y-1)\big)^2} \qquad (5)$$

$$W(x, y) = 1 / m(x, y) \qquad (6)$$
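A minimal NumPy sketch of Eqs. (5) and (6) follows; the epsilon guard against zero gradients in flat regions is our addition, not part of the paper.

```python
import numpy as np

def gradient_weight(L, eps=1e-8):
    """Gradient magnitude m(x, y), Eq. (5), and weight W(x, y) = 1/m(x, y),
    Eq. (6). L is a grayscale image as a 2-D float array indexed L[x, y];
    eps (our addition) avoids division by zero where the image is flat."""
    m = np.zeros_like(L, dtype=float)
    dx = L[2:, 1:-1] - L[:-2, 1:-1]   # L(x+1, y) - L(x-1, y)
    dy = L[1:-1, 2:] - L[1:-1, :-2]   # L(x, y+1) - L(x, y-1)
    m[1:-1, 1:-1] = np.sqrt(dx ** 2 + dy ** 2)
    W = 1.0 / (m + eps)
    return m, W
```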
To measure color similarity, a correlation coefficient method is used. First we set F as the master image and S as the slave image, and obtain the pixels corresponding to SIFT features from both RGB images. The difference is that the pixel window in F is 4×4 while the search window in S is a larger 16×16. The coefficient between the matching window and the target window of the stereo pair is calculated by the formula below; the final coefficient r for a feature point is the maximum value over the search window:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \qquad (7)$$
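The window matching can be sketched as below, under our reading of the text that the 4×4 master window is slid over the 16×16 slave search window and the maximum coefficient is kept.

```python
import numpy as np

def coefficient(X, Y):
    """Correlation coefficient of Eq. (7) between two equally sized windows."""
    X = X.ravel().astype(float)
    Y = Y.ravel().astype(float)
    Xc, Yc = X - X.mean(), Y - Y.mean()
    den = np.sqrt(np.sum(Xc ** 2) * np.sum(Yc ** 2))
    return np.sum(Xc * Yc) / den if den > 0 else 0.0

def best_coefficient(master, search):
    """Slide the 4x4 master window over the 16x16 search window in the
    slave image and keep the maximum coefficient r."""
    h, w = master.shape
    best = -1.0
    for i in range(search.shape[0] - h + 1):
        for j in range(search.shape[1] - w + 1):
            best = max(best, coefficient(master, search[i:i+h, j:j+w]))
    return best
```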
Combining these with the pre-defined weights, we sum all of the coefficients. For each sample frame obtained in section 1.1.2 there is one such coefficient, so to compare the color similarity S across frames the weighted sum must be normalized:

$$S = \frac{\sum_{i=1}^{M} W(x_i, y_i)\, r_i}{\sum_{i=1}^{M} W(x_i, y_i)} \qquad (8)$$
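Note that the numerator of Eq. (8) was reconstructed from the surrounding prose (summing the weighted coefficients before normalizing); under that weighted-average reading, the computation reduces to a few lines:

```python
import numpy as np

def color_similarity(r, W):
    """Normalized color similarity S of Eq. (8): a weighted average of the
    per-feature coefficients r_i using the weights W(x_i, y_i) of Eq. (6)."""
    r = np.asarray(r, dtype=float)
    W = np.asarray(W, dtype=float)
    return np.sum(W * r) / np.sum(W)
```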
5. RESULTS & DISCUSSION 
We have conducted a number of experiments to investigate the performance of our system. These and other aspects, such as the system's ability to keep tracking during very rapid motion and the performance of automatic relocalization, are tested. In our experiment an indoor space is reconstructed. Figure 2 shows an example frame observed with this RGB-D camera.
  
Figure 2: (left) RGB image and (right) depth information captured by an RGB-D camera. Black pixels in the right image have no depth value, mostly due to exceeding the maximum range or to surface material.
  
Figure 3: Demonstration of the reconstructed 3D model. The colored points in the middle, linked as a polygonal line, are representatives of the sample frames; they also mark the camera positions.
During mapping the camera was carried by a person; meanwhile, to test the performance of automatic relocalization, the camera was occasionally moved very rapidly. As shown in Figure 3, there are no obvious "holes" or "ghost images" in the reconstructed model. Some holes at object edges are caused by missing data where the camera cannot reach. In our experiment camera tracking failures did occur; however, the system took only a few milliseconds to re-initialize the camera position, demonstrating the efficiency of our camera relocalization method.
6. CONCLUSION 
Building accurate, dense models of indoor environments has many applications in robotics and gaming. In this paper we investigate how a potentially inexpensive depth camera, the Kinect, can be utilized to reconstruct 3D models using a voxel-based method. To maintain the stability of our system, a graph-based method together with SIFT matching and Color Similarity Measurement has been proposed, and we obtain promising results for camera relocalization in the 3D reconstruction process.
REFERENCES 
 
	        