Extended Kalman Filter (Smith and Cheeseman, 1986) using
maximum likelihood data association. EKF SLAM is subject to a
number of approximations and limiting assumptions. Maps are
feature-based, composed of point landmarks, and for computational
reasons the number of landmarks is usually kept small. EKF SLAM
is therefore strongly influenced by how reliably the point
landmarks are detected.
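The joint state-and-covariance update behind this sensitivity can be sketched with a minimal example; the linear motion and measurement models, the numbers, and all names below are illustrative assumptions, not the 1986 formulation:

```python
import numpy as np

# Minimal EKF-style joint update for one robot and one point landmark:
# the state is [robot_x, robot_y, landmark_x, landmark_y] with a joint
# covariance that correlates robot and landmark estimates.

def ekf_predict(x, P, u, Q):
    """Move the robot by control u = (dx, dy); the landmark is static."""
    F = np.eye(4)                       # motion Jacobian (linear toy model)
    x = x + np.array([u[0], u[1], 0.0, 0.0])
    P = F @ P @ F.T + Q                 # process noise acts on the robot only
    return x, P

def ekf_update(x, P, z, R):
    """Observe the landmark relative to the robot: z = l - r + noise."""
    H = np.array([[-1.0, 0.0, 1.0, 0.0],
                  [ 0.0, -1.0, 0.0, 1.0]])   # measurement Jacobian
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

x = np.array([0.0, 0.0, 2.0, 1.0])      # initial robot and landmark guess
P = np.eye(4)
Q = np.diag([0.1, 0.1, 0.0, 0.0])
R = 0.05 * np.eye(2)
x, P = ekf_predict(x, P, u=(1.0, 0.0), Q=Q)
x, P = ekf_update(x, P, z=np.array([1.2, 1.1]), R=R)
```

Because the landmark enters the state vector, every detected (or missed) landmark observation directly reshapes the joint covariance, which is why detection quality matters so much.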
The FastSLAM (A Factored Solution to SLAM) algorithm
applies a particle filter to the online SLAM problem (Montemerlo
et al., 2002). The particle filter is a stochastic method that
integrates observations and state transitions within the framework
of a general state space model (Isard and Blake, 1998; Doucet et
al., 2000). One advantage of FastSLAM is its lower computational
load compared with EKF SLAM. A more important advantage is
the following: EKF SLAM can essentially deal only with normal
(Gaussian) probability distributions, whereas FastSLAM (the
particle filter) can deal with arbitrary probability distributions
and nonlinear models.
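This difference can be illustrated with a minimal bootstrap particle filter for a one dimensional state. This is only a sketch of the generic filter, with illustrative numbers; FastSLAM additionally factors the map into small per-landmark filters conditioned on each particle's trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)

# Bootstrap particle filter for a 1D state: the belief is a set of
# samples, so it is not restricted to a Gaussian shape.
N = 500
particles = rng.normal(0.0, 1.0, N)       # samples from the initial belief

def step(particles, z, motion=1.0, q=0.2, r=0.5):
    # 1. propagate every particle through the (possibly nonlinear) motion model
    particles = particles + motion + rng.normal(0.0, q, particles.size)
    # 2. weight each particle by the measurement likelihood p(z | particle)
    w = np.exp(-0.5 * ((z - particles) / r) ** 2)
    w /= w.sum()
    # 3. resample to concentrate particles in likely regions of the state space
    return particles[rng.choice(particles.size, particles.size, p=w)]

for z in [1.1, 2.0, 2.9]:                 # noisy observations of the state
    particles = step(particles, z)

estimate = particles.mean()               # posterior mean estimate
```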
The augmented reality community has also attempted to deal with
the SLAM problem. One of the most popular SLAM methods in
this field is Parallel Tracking and Mapping (PTAM)
(Klein and Murray 2007). It consists of exterior orientation
based on feature point extraction and tracking in a sequential
image (the tracking process), and three dimensional coordinate
estimation of the feature points (the mapping process). The method
performs these two processes separately, in parallel threads on a
dual-core computer, in real time. Three dimensional models are
superimposed on the mapping result. For real time processing, a
plane in the scene (e.g. the ground plane) is estimated from
reliably mapped points, and the three dimensional models are
arranged on the plane. This paper focuses on this method; the
next section explains its outline and applicability.
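The split into a per-frame tracking thread and a batch mapping thread can be sketched as follows; the queue-based hand-off, the key-frame promotion rule, and all names are illustrative assumptions, not PTAM's implementation:

```python
import threading
import queue

# Sketch of a tracking/mapping thread split: tracking runs once per
# frame, while mapping refines a shared map from promoted key frames.
map_lock = threading.Lock()
shared_map = {"points": [], "keyframes": []}
keyframe_queue = queue.Queue()

def tracking(frames):
    for frame in frames:                 # real time: one pass per frame
        with map_lock:                   # read the current map under the lock
            _ = len(shared_map["points"])
        # ... pose estimation against the current map would happen here ...
        if frame % 5 == 0:               # toy rule: promote every 5th frame
            keyframe_queue.put(frame)

def mapping(stop):
    # Batch refinement thread; drains remaining key frames after stop.
    while not stop.is_set() or not keyframe_queue.empty():
        try:
            kf = keyframe_queue.get(timeout=0.05)
        except queue.Empty:
            continue
        with map_lock:                   # extend the map atomically
            shared_map["keyframes"].append(kf)
            shared_map["points"].extend([(kf, i) for i in range(3)])

stop = threading.Event()
t = threading.Thread(target=mapping, args=(stop,))
t.start()
tracking(range(20))
stop.set()
t.join()
```

The design point is that the expensive batch step never blocks the per-frame loop; they only synchronize briefly on the shared map.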
3. APPLICABILITY OF SLAM
3.1 Outline of the SLAM Method
The outline of the method is summarized by the following
points (Klein and Murray 2007):
(a) Tracking and mapping are separated;
(b) Mapping is based on key frames, which are processed using
batch techniques (bundle adjustment);
(c) The map is densely initialized from a stereo pair;
(d) New points are initialized with an epipolar search;
(e) Large numbers of points are mapped.
3.1.1 Camera Tracking: Camera calibration is conducted in
advance. For the calibration, a checkerboard is used (Figure 1).
The main process of the method starts with feature point
extraction and tracking. The map is represented by $M$ feature
points, each of which has coordinates $p_i = (X_i, Y_i, Z_i)$ and a
normal vector $n_i$ of its image patch in the world coordinate system.
For feature point extraction, the FAST (Features from Accelerated
Segment Test) corner detector (Rosten and Drummond, 2006) is
utilized. The FAST detector classifies a pixel as a corner when a
contiguous arc of pixels on a circle around it is uniformly brighter
(or uniformly darker) than the centre pixel by a threshold.
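The segment test can be sketched as follows; this is a simplified illustration rather than the optimized detector, and the circle, threshold, and arc length follow the common FAST-9 setting as an assumption:

```python
import numpy as np

# Simplified segment test in the spirit of FAST: a pixel is a corner if
# at least n contiguous pixels on a 16-pixel circle are all brighter
# (or all darker) than the centre by a threshold t.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_corner(img, y, x, t=20, n=9):
    c = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    for sign in (1, -1):                 # brighter arc, then darker arc
        flags = [sign * (p - c) > t for p in ring]
        doubled = flags + flags          # duplicate to handle wrap-around
        run = best = 0
        for f in doubled:
            run = run + 1 if f else 0
            best = max(best, min(run, len(flags)))
        if best >= n:
            return True
    return False

img = np.zeros((9, 9), dtype=np.uint8)
img[:5, :5] = 200                        # bright square: corner at (4, 4)
```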
The map also has $N$ key frames, which are snapshots of the sequential
image. Key frame $j$ has three dimensional coordinates
$q_j = (X_j, Y_j, Z_j)$ as the camera position. In key frame $j$, feature
point $i$ has image coordinates $(u_{ij}, v_{ij})$. The
transformation matrix $E_j$ between the camera and world
coordinate systems is expressed by the collinearity equations:
$$u_{ij} = \Delta u - c\,\frac{a_{11}(X_i - X_j) + a_{12}(Y_i - Y_j) + a_{13}(Z_i - Z_j)}{a_{31}(X_i - X_j) + a_{32}(Y_i - Y_j) + a_{33}(Z_i - Z_j)}$$

$$v_{ij} = \Delta v - c\,\frac{a_{21}(X_i - X_j) + a_{22}(Y_i - Y_j) + a_{23}(Z_i - Z_j)}{a_{31}(X_i - X_j) + a_{32}(Y_i - Y_j) + a_{33}(Z_i - Z_j)} \quad (1)$$

where $c$ = focal length
$\Delta u$, $\Delta v$ = factors of interior orientation
$a_{kl}$ = factors of the rotation matrix
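The collinearity equations (1) translate directly into a projection function. The sketch below follows the symbols of the text; the focal length, the identity rotation, and the zero interior-orientation offsets are illustrative assumptions:

```python
import numpy as np

# Projection of a map point p by the collinearity equations:
# c is the focal length, Rm the rotation matrix (rows a_k1..a_k3),
# q the key-frame camera position, du/dv the interior orientation offsets.
def project(p, q, Rm, c=800.0, du=0.0, dv=0.0):
    d = np.asarray(p, float) - np.asarray(q, float)   # (X_i - X_j, ...)
    den = Rm[2] @ d              # a31..a33 terms shared by both equations
    u = du - c * (Rm[0] @ d) / den
    v = dv - c * (Rm[1] @ d) / den
    return u, v

Rm = np.eye(3)                   # camera axes aligned with the world axes
u, v = project(p=(1.0, 2.0, 10.0), q=(0.0, 0.0, 0.0), Rm=Rm)
```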
The key frames are also converted to image pyramids.
Figure 1. Camera calibration using a checkerboard
The camera tracking process (estimation of camera position and
pose) performs the following two-stage tracking:
(a) A new frame is acquired from a camera;
(b) Initial position and pose of the camera are estimated by
camera transition model;
(c) Feature points in the map are projected into the image
according to the frame's prior position and pose estimates, using
the transformation matrix between the world and camera
coordinate systems together with the interior orientation factors;
(d) A small number of the coarsest-scale features are searched
for in the image;
(e) The camera position and pose are updated from these coarse
matches;
(f) A larger number of points is re-projected and searched for in
the image;
(g) Final position and pose estimates for the frame are
computed from all the matches found.
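The coarse-to-fine structure of steps (d)-(g) can be illustrated with a toy example in which the "pose" is reduced to a plain 2D image translation; everything below (the point counts, noise levels, and the averaging update) is an illustrative stand-in for the real pose update:

```python
import numpy as np

# Toy two-stage tracking: estimate an image translation coarsely from
# a few points, then refine it using many points.
rng = np.random.default_rng(1)
true_shift = np.array([3.0, -2.0])
map_pts = rng.uniform(0, 100, (1000, 2))           # projected map points
observed = map_pts + true_shift + rng.normal(0, 0.3, (1000, 2))

def track(prior):
    pose = prior
    for n in (50, 1000):                           # coarse pass, fine pass
        residual = observed[:n] - (map_pts[:n] + pose)
        pose = pose + residual.mean(axis=0)        # least-squares update
    return pose

pose = track(prior=np.zeros(2))
```

The coarse pass gets the estimate close enough that the fine pass only needs small search regions, which is what makes the two-stage scheme cheap.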
In order to search corresponding feature points between frames,
an affine warp characterized by a warping matrix $A$ is used.

$$A = \begin{bmatrix} \dfrac{\partial u_c}{\partial u_s} & \dfrac{\partial u_c}{\partial v_s} \\[6pt] \dfrac{\partial v_c}{\partial u_s} & \dfrac{\partial v_c}{\partial v_s} \end{bmatrix} \quad (2)$$

where $(u_s, v_s)$ correspond to horizontal and vertical pixel
displacements in the patch's source pyramid level, and $(u_c, v_c)$
correspond to the pixel displacements in the current camera frame.
Using this warp, a patch can be matched under viewpoint change.
The camera position and pose are then estimated by minimizing a
robust objective function of the reprojection errors, in which the
Tukey biweight estimator reduces the influence of outlier matches
on the estimation.
3.1.2 Mapping: The mapping process estimates the three
dimensional coordinates of the feature points. New map points are
added by triangulation with an epipolar search when a new key
frame is inserted. After a new key frame is added, the map is
refined by bundle adjustment (Triggs et al., 2000), which
iteratively minimizes the reprojection errors over the key frame
poses and the point coordinates. The bundle adjustment is applied
both locally, over the most recent key frames, and globally, over
the whole map.