Extended Kalman Filter (Smith and Cheeseman, 1986) using
maximum likelihood data association. EKF SLAM is subject to a
number of approximations and limiting assumptions. Maps are
feature-based, composed of point landmarks, and for computational
reasons the number of landmarks is usually kept small. EKF SLAM
is therefore strongly influenced by how reliably the point
landmarks are detected.
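The joint state-and-covariance update behind this sensitivity can be sketched with a minimal example; the linear motion and measurement models, the numbers, and all names below are illustrative assumptions, not the 1986 formulation:

```python
import numpy as np

# Minimal EKF-style joint update for one robot and one point landmark:
# the state is [robot_x, robot_y, landmark_x, landmark_y] with a joint
# covariance that correlates robot and landmark estimates.

def ekf_predict(x, P, u, Q):
    """Move the robot by control u = (dx, dy); the landmark is static."""
    F = np.eye(4)                       # motion Jacobian (linear toy model)
    x = x + np.array([u[0], u[1], 0.0, 0.0])
    P = F @ P @ F.T + Q                 # process noise acts on the robot only
    return x, P

def ekf_update(x, P, z, R):
    """Observe the landmark relative to the robot: z = l - r + noise."""
    H = np.array([[-1.0, 0.0, 1.0, 0.0],
                  [ 0.0, -1.0, 0.0, 1.0]])   # measurement Jacobian
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

x = np.array([0.0, 0.0, 2.0, 1.0])      # initial robot and landmark guess
P = np.eye(4)
Q = np.diag([0.1, 0.1, 0.0, 0.0])
R = 0.05 * np.eye(2)
x, P = ekf_predict(x, P, u=(1.0, 0.0), Q=Q)
x, P = ekf_update(x, P, z=np.array([1.2, 1.1]), R=R)
```

Because the landmark enters the state vector, every detected (or missed) landmark observation directly reshapes the joint covariance, which is why detection quality matters so much.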
The FastSLAM (A Factored Solution to SLAM) algorithm
applies a particle filter to the online SLAM problem (Montemerlo
et al., 2002). The particle filter is a stochastic method that
integrates observations and state transitions within the framework
of a general state space model (Isard and Blake, 1998; Doucet et
al., 2000). One advantage of FastSLAM is its lower computational
load compared with EKF SLAM. A more important advantage is
the following: EKF SLAM can essentially deal only with normal
(Gaussian) probability distributions, whereas FastSLAM (the
particle filter) can deal with arbitrary probability distributions
and nonlinear models.
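This difference can be illustrated with a minimal bootstrap particle filter for a one dimensional state. This is only a sketch of the generic filter, with illustrative numbers; FastSLAM additionally factors the map into small per-landmark filters conditioned on each particle's trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)

# Bootstrap particle filter for a 1D state: the belief is a set of
# samples, so it is not restricted to a Gaussian shape.
N = 500
particles = rng.normal(0.0, 1.0, N)       # samples from the initial belief

def step(particles, z, motion=1.0, q=0.2, r=0.5):
    # 1. propagate every particle through the (possibly nonlinear) motion model
    particles = particles + motion + rng.normal(0.0, q, particles.size)
    # 2. weight each particle by the measurement likelihood p(z | particle)
    w = np.exp(-0.5 * ((z - particles) / r) ** 2)
    w /= w.sum()
    # 3. resample to concentrate particles in likely regions of the state space
    return particles[rng.choice(particles.size, particles.size, p=w)]

for z in [1.1, 2.0, 2.9]:                 # noisy observations of the state
    particles = step(particles, z)

estimate = particles.mean()               # posterior mean estimate
```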
The augmented reality community has also attempted to deal with
the SLAM problem. One of the most popular SLAM methods in
this field is Parallel Tracking and Mapping (PTAM)
(Klein and Murray 2007). It consists of exterior orientation
based on feature point extraction and tracking in a sequential
image (the tracking process), and three dimensional coordinate
estimation of the feature points (the mapping process). The method
performs these two processes separately, in parallel threads on a
dual-core computer, in real time. Three dimensional models are
superimposed on the mapping result. For real time processing, a
plane in the scene (e.g. the ground plane) is estimated from
reliably mapped points, and the three dimensional models are
arranged on the plane. This paper focuses on this method; the
next section explains its outline and applicability.
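The split into a per-frame tracking thread and a batch mapping thread can be sketched as follows; the queue-based hand-off, the key-frame promotion rule, and all names are illustrative assumptions, not PTAM's implementation:

```python
import threading
import queue

# Sketch of a tracking/mapping thread split: tracking runs once per
# frame, while mapping refines a shared map from promoted key frames.
map_lock = threading.Lock()
shared_map = {"points": [], "keyframes": []}
keyframe_queue = queue.Queue()

def tracking(frames):
    for frame in frames:                 # real time: one pass per frame
        with map_lock:                   # read the current map under the lock
            _ = len(shared_map["points"])
        # ... pose estimation against the current map would happen here ...
        if frame % 5 == 0:               # toy rule: promote every 5th frame
            keyframe_queue.put(frame)

def mapping(stop):
    # Batch refinement thread; drains remaining key frames after stop.
    while not stop.is_set() or not keyframe_queue.empty():
        try:
            kf = keyframe_queue.get(timeout=0.05)
        except queue.Empty:
            continue
        with map_lock:                   # extend the map atomically
            shared_map["keyframes"].append(kf)
            shared_map["points"].extend([(kf, i) for i in range(3)])

stop = threading.Event()
t = threading.Thread(target=mapping, args=(stop,))
t.start()
tracking(range(20))
stop.set()
t.join()
```

The design point is that the expensive batch step never blocks the per-frame loop; they only synchronize briefly on the shared map.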
3. APPLICABILITY OF SLAM
3.1 Outline of the SLAM Method
The outline of the method is summarized by the following
points (Klein and Murray 2007):
(a) Tracking and mapping are separated;
(b) Mapping is based on key frames, which are processed using
batch techniques (bundle adjustment);
(c) The map is densely initialized from a stereo pair;
(d) New points are initialized with an epipolar search;
(e) Large numbers of points are mapped.
3.1.1 Camera Tracking: Camera calibration is conducted in
advance. For the calibration, a checkerboard is used (Figure 1).
The main process of the method starts with feature point
extraction and tracking. The map is represented by $M$ feature
points, each of which has coordinates $p_i = (X_i, Y_i, Z_i)$ and a
normal vector $n_i$ of its image patch in the world coordinate system.
For feature point extraction, the FAST (Features from Accelerated
Segment Test) corner detector (Rosten and Drummond, 2006) is
utilized. The FAST detector classifies a pixel as a corner when a
contiguous arc of pixels on a circle around it is uniformly brighter
(or uniformly darker) than the centre pixel by a threshold.
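The segment test can be sketched as follows; this is a simplified illustration rather than the optimized detector, and the circle, threshold, and arc length follow the common FAST-9 setting as an assumption:

```python
import numpy as np

# Simplified segment test in the spirit of FAST: a pixel is a corner if
# at least n contiguous pixels on a 16-pixel circle are all brighter
# (or all darker) than the centre by a threshold t.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_corner(img, y, x, t=20, n=9):
    c = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    for sign in (1, -1):                 # brighter arc, then darker arc
        flags = [sign * (p - c) > t for p in ring]
        doubled = flags + flags          # duplicate to handle wrap-around
        run = best = 0
        for f in doubled:
            run = run + 1 if f else 0
            best = max(best, min(run, len(flags)))
        if best >= n:
            return True
    return False

img = np.zeros((9, 9), dtype=np.uint8)
img[:5, :5] = 200                        # bright square: corner at (4, 4)
```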
The map also has $N$ key frames, which are snapshots of the sequential
image. Key frame $j$ has three dimensional coordinates
$q_j = (X_j, Y_j, Z_j)$ as the camera position. In key frame $j$, feature
point $i$ has image coordinates $(u_{ij}, v_{ij})$. The
transformation matrix $E_j$ between the camera and world
coordinate systems is expressed by the collinearity equations:
$$u_{ij} = \Delta u - c\,\frac{a_{11}(X_i - X_j) + a_{12}(Y_i - Y_j) + a_{13}(Z_i - Z_j)}{a_{31}(X_i - X_j) + a_{32}(Y_i - Y_j) + a_{33}(Z_i - Z_j)}$$

$$v_{ij} = \Delta v - c\,\frac{a_{21}(X_i - X_j) + a_{22}(Y_i - Y_j) + a_{23}(Z_i - Z_j)}{a_{31}(X_i - X_j) + a_{32}(Y_i - Y_j) + a_{33}(Z_i - Z_j)} \quad (1)$$

where $c$ = focal length
$\Delta u$, $\Delta v$ = factors of interior orientation
$a_{kl}$ = factors of the rotation matrix
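The collinearity equations (1) translate directly into a projection function. The sketch below follows the symbols of the text; the focal length, the identity rotation, and the zero interior-orientation offsets are illustrative assumptions:

```python
import numpy as np

# Projection of a map point p by the collinearity equations:
# c is the focal length, Rm the rotation matrix (rows a_k1..a_k3),
# q the key-frame camera position, du/dv the interior orientation offsets.
def project(p, q, Rm, c=800.0, du=0.0, dv=0.0):
    d = np.asarray(p, float) - np.asarray(q, float)   # (X_i - X_j, ...)
    den = Rm[2] @ d              # a31..a33 terms shared by both equations
    u = du - c * (Rm[0] @ d) / den
    v = dv - c * (Rm[1] @ d) / den
    return u, v

Rm = np.eye(3)                   # camera axes aligned with the world axes
u, v = project(p=(1.0, 2.0, 10.0), q=(0.0, 0.0, 0.0), Rm=Rm)
```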
The key frames are also converted to image pyramids.
Figure 1. Camera calibration using a checkerboard
The camera tracking process (estimation of camera position and
pose) performs the following two-stage tracking:
(a) A new frame is acquired from a camera;
(b) Initial position and pose of the camera are estimated by
camera transition model;
(c) Feature points in the map are projected into the image
according to the frame's prior position and pose estimates, using
the transformation matrix between the world and camera
coordinate systems together with the interior orientation factors;
(d) A small number of the coarsest-scale features are searched
for in the image;
(e) The camera position and pose are updated from these coarse
matches;
(f) A larger number of points is re-projected and searched for in
the image;
(g) Final position and pose estimates for the frame are
computed from all the matches found.
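The coarse-to-fine structure of steps (d)-(g) can be illustrated with a toy example in which the "pose" is reduced to a plain 2D image translation; everything below (the point counts, noise levels, and the averaging update) is an illustrative stand-in for the real pose update:

```python
import numpy as np

# Toy two-stage tracking: estimate an image translation coarsely from
# a few points, then refine it using many points.
rng = np.random.default_rng(1)
true_shift = np.array([3.0, -2.0])
map_pts = rng.uniform(0, 100, (1000, 2))           # projected map points
observed = map_pts + true_shift + rng.normal(0, 0.3, (1000, 2))

def track(prior):
    pose = prior
    for n in (50, 1000):                           # coarse pass, fine pass
        residual = observed[:n] - (map_pts[:n] + pose)
        pose = pose + residual.mean(axis=0)        # least-squares update
    return pose

pose = track(prior=np.zeros(2))
```

The coarse pass gets the estimate close enough that the fine pass only needs small search regions, which is what makes the two-stage scheme cheap.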
In order to search corresponding feature points between frames,
an affine warp characterized by a warping matrix $A$ is used.

$$A = \begin{bmatrix} \dfrac{\partial u_c}{\partial u_s} & \dfrac{\partial u_c}{\partial v_s} \\[6pt] \dfrac{\partial v_c}{\partial u_s} & \dfrac{\partial v_c}{\partial v_s} \end{bmatrix} \quad (2)$$

where $(u_s, v_s)$ correspond to horizontal and vertical pixel
displacements in the patch's source pyramid level, and $(u_c, v_c)$
correspond to the pixel displacements in the current camera frame.
Using this warp, a patch can be matched under viewpoint change.
The camera position and pose are then estimated by minimizing a
robust objective function of the reprojection errors, in which the
Tukey biweight estimator reduces the influence of outlier matches
on the estimation.
3.1.2 Mapping: The mapping process estimates the three
dimensional coordinates of the feature points. New map points are
added by triangulation with an epipolar search when a new key
frame is inserted. After a new key frame is added, the map is
refined by bundle adjustment (Triggs et al., 2000), which
iteratively minimizes the reprojection errors over the key frame
poses and the point coordinates. The bundle adjustment is applied
both locally, over the most recent key frames, and globally, over
the whole map.