of the study is the problem
vering these heavy vehicles
harvester operators strive to
for a second time. For this
GPS, IMU and/or odometry
anopy, drift of IMU without
ie problems, as it is largely
itergrund dieser Arbeit ist
se Fahrt über ungeschützten
die Fahrspur eines bereits
reich. Die Daten bestehen-
ehleranfällig. Eine Kamera,
ssen ist.
WORK
based only on GPS and de-
amberger, 2001), caused by
ased approaches might aid
MU, to navigate in spite of
ents for that kind of naviga-
s with overlapping regions.
e reconstructed by relative
Feature detectors such as
rm) (Lowe, 2004) or SURF
et al., 2006), are scale and
e point detection. Structure
's, like Bundler (Snavely et
iges and produce 3D recon-
scene geometry. SfM tools
t to compute the results and
ect. Another approach re-
time application for vision
napping (VSLAM). Hence,
h with a monocular device.
D DATA
have been captured with the
R 20mm wide-angle lens, as
table 1). Shutter time and
otion blur is avoided. The
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August – 01 September 2012, Melbourne, Australia
pictures have been taken at the full resolution of 4256 x 2832
pixels, but had to be scaled down to a smaller size (see section 4)
to comply with the requirements of the feature detection algorithms.
Figure 1: Camera and lens used: (a) NIKON D700, (b) NIKON NIKKOR 20mm
| parameter    | value                |
| resolution   | 4256 x 2832 pixel    |
| sensor       | 24 x 36 mm CMOS chip |
| focal length | 20 mm                |
Table 1: Data of camera and lens
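
As a rough illustration of how the values in Table 1 relate to each
other, the following Python sketch derives the pixel pitch, the focal
length in pixels and the field of view. It assumes an ideal pinhole
model without distortion, square pixels, a principal point at the
image centre, and that the 36 mm sensor side corresponds to the 4256
pixel image side. The function names and the scale factor of 0.25 are
illustrative only and are not taken from the experiments.

```python
import math


def pinhole_from_datasheet(width_px, height_px, sensor_w_mm, sensor_h_mm, focal_mm):
    """Derive simple pinhole quantities from nominal data-sheet values."""
    pitch_mm = sensor_w_mm / width_px   # size of one pixel (assumed square)
    focal_px = focal_mm / pitch_mm      # focal length expressed in pixels
    hfov_deg = math.degrees(2.0 * math.atan(sensor_w_mm / (2.0 * focal_mm)))
    vfov_deg = math.degrees(2.0 * math.atan(sensor_h_mm / (2.0 * focal_mm)))
    return {
        "pitch_mm": pitch_mm,
        "focal_px": focal_px,
        "hfov_deg": hfov_deg,
        "vfov_deg": vfov_deg,
        # Assumption: principal point at the image centre.
        "principal_point_px": (width_px / 2.0, height_px / 2.0),
    }


def scale_intrinsics(intrinsics, scale):
    """Focal length and principal point scale linearly with the image size."""
    cx, cy = intrinsics["principal_point_px"]
    scaled = dict(intrinsics)
    scaled["focal_px"] = intrinsics["focal_px"] * scale
    scaled["principal_point_px"] = (cx * scale, cy * scale)
    scaled["pitch_mm"] = intrinsics["pitch_mm"] / scale  # pixels get larger
    return scaled


full = pinhole_from_datasheet(4256, 2832, 36.0, 24.0, 20.0)
quarter = scale_intrinsics(full, 0.25)  # 0.25 is an arbitrary example factor
print(round(full["focal_px"], 1), round(full["hfov_deg"], 1), round(quarter["focal_px"], 1))
```

With the nominal values of Table 1 this gives a focal length of about
2364 pixels and a horizontal field of view of about 84 degrees at full
resolution. These are data-sheet approximations and do not replace a
camera calibration.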
4 METHODOLOGY
In these experiments, two different approaches to pose estimation
and 3D data acquisition in woodland with relative orientation
have been evaluated. Both rely on known structure from motion
algorithms, but here they are employed in the context of forest
environments, which previously did not receive adequate attention.
A REAL-TIME PROCESS estimates the orientation of two subsequent
images with SURF (Bay et al., 2006) or SIFT (Lowe, 2004) feature
points and calculates the parameters of motion (X, Y, Z, ω, φ, κ)
between the two images. As a second approach, POST-PROCESSING
calculates the position of the images with SIFT features in a
combined orientation and bundle adjustment (Bundler (Snavely et
al., 2007) and (Lourakis and Argyros, 2009)). This bundle
adjustment improves the position and orientation of each image in
the sequence through the higher redundancy of additional
observations and by distributing the residual errors evenly over
all viewpoints.
Figure 2: Example image of the captured stack of 36 frames overall
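
The six motion parameters mentioned above (three translations and
three rotation angles) describe a rigid transformation between two
camera positions. The following Python sketch shows one way to build
it. The rotation order R = Rx(omega) Ry(phi) Rz(kappa) is a common
photogrammetric convention and is assumed here; the paper does not
state which convention was used. The function names and the example
values are hypothetical.

```python
import math


def rotation_from_opk(omega, phi, kappa):
    """Rotation matrix R = Rx(omega) * Ry(phi) * Rz(kappa), angles in radians.

    The rotation order is an assumption, not taken from the paper.
    """
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    return [
        [cp * ck, -cp * sk, sp],
        [co * sk + so * sp * ck, co * ck - so * sp * sk, -so * cp],
        [so * sk - co * sp * ck, so * ck + co * sp * sk, co * cp],
    ]


def apply_motion(params, point):
    """Map a point from the second camera frame into the first: p1 = R * p2 + t."""
    x, y, z, omega, phi, kappa = params
    r = rotation_from_opk(omega, phi, kappa)
    t = (x, y, z)
    return tuple(
        sum(r[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


# Example: shift of 0.5 units along X combined with a 90 degree turn about Z.
motion = (0.5, 0.0, 0.0, 0.0, 0.0, math.radians(90.0))
moved = apply_motion(motion, (1.0, 0.0, 0.0))
print(moved)
```

In the example the point (1, 0, 0) is first rotated onto the Y axis
and then shifted, which gives approximately (0.5, 1, 0).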
4.1 Post Processing Approach
A post-processing algorithm can be employed to compute the camera
positions and orientations of an image set after the whole set has
been captured. In addition, further algorithms use these oriented
images and their positions to estimate a dense point cloud of the
surrounding environment. The post-processing algorithm uses
the Bundler software (Snavely et al., 2007) for image orientation.
It therefore requires a stack of images with overlapping areas.
Globally unique feature points inside the overlapping regions are
needed first. They are located with a SIFT feature detector, which
also provides a robust descriptor for each position. The SIFT
descriptor characterizes a globally unique feature and can be
compared with the feature points of other images of the stack.
Approximate nearest neighbor (ANN) matching between the features,
combined with a computed fundamental matrix and RANSAC (Random
Sample Consensus) (Fischler and Bolles, 1981), yields a robust
estimate of the relative orientation parameters. The bundle
adjustment then links all detected homologous feature points and
the approximated focal length and reduces the observation errors
with a least squares method based on the Gauss-Markov model. In
addition to camera positions and 3D points, the software computes
focal length and distortion, so the camera does not have to be
calibrated beforehand.
The results of Bundler can be imported into the PMVS2 (Patch-based
Multi-view Stereo) software (Furukawa and Ponce, 2010) to create
a dense point cloud of the environment. "PMVS is a multi-view
stereo software that takes a set of images and camera parameters,
then reconstructs 3D structure of an object or a scene visible in
the images." (Furukawa, 2010)
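
The descriptor matching step of this section can be illustrated with
a small Python sketch. It is not Bundler's implementation: an exact
brute-force search stands in for the approximate nearest neighbor
search, and ambiguous matches are rejected with a distance ratio test
in the style of Lowe (2004), which is an assumption made here for
illustration. The function name, the threshold of 0.6 and the toy
4-dimensional descriptors are hypothetical (real SIFT descriptors
have 128 dimensions).

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.6):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match is kept only if the nearest neighbour is clearly closer than
    the second nearest one (distance ratio test). Brute force is used here
    in place of an approximate nearest neighbour search.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distances to all candidates, sorted from nearest to farthest.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs two candidates
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches


# Toy descriptors: the third one in image A lies between two candidates
# in image B and is therefore rejected as ambiguous.
image_a = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (0.5, 0.5, 0.0, 0.0)]
image_b = [(0.9, 0.1, 0.0, 0.0), (0.0, 0.95, 0.05, 0.0), (0.0, 0.0, 1.0, 0.0)]
pairs = match_descriptors(image_a, image_b)
print(pairs)
```

The remaining point pairs would then be passed to the RANSAC-based
estimation of the fundamental matrix, which removes matches that are
inconsistent with a single relative orientation.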
4.2 Real Time Approach
Post-processed localization and mapping helps to estimate the
driven path and to compute a 3D point cloud of the environment
after image acquisition. In some cases, however, a real time
analysis is required, for instance to localize a forest vehicle
while it is driving. Such a method could help to keep the forest
machine on a given track and thus avoid further soil compaction.
In contrast to the approach of section 4.1, this real time
algorithm computes the orientation of the images sequentially.
That is, the pose is estimated step by step, from image to image,
and the current position is accumulated from the previous image
positions. First, feature points are detected and feature
descriptors are computed. The descriptors are then matched to
obtain homologous point pairs in an overlapping image pair. To
speed up the algorithm, SURF features are detected instead of
SIFT. RANSAC, based on a computed fundamental matrix, helps to
find the best solution for the orientation. Because the equation
system is overdetermined, an additional adjustment of the relative
orientation is performed, which uses all inliers determined by
RANSAC and increases the accuracy of the parameters by minimizing
the sum of squared residuals. The number of tie points between
two images is essential for computing an accurate relative
orientation. Tests with images that contain trees have shown that
the number of corresponding points strongly depends on the
resolution of the image (Fig. 3). The reason for this behavior
lies in the size of the computed feature descriptor and in the
rapidly changing background color of the features. It may
therefore be helpful to resample the image to a smaller size
(table 2). In this approach, image coordinates are direct
observations, so a calibration of the camera is necessary, and
feature point positions have to be corrected for lens distortion
and principal point shift.
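
The sequential accumulation of the position described in this section
can be sketched as follows. The Python code chains relative
orientations (R_rel, t_rel), each giving the pose of camera k+1 in
the frame of camera k, into poses relative to the first camera. It is
a minimal sketch under stated assumptions: the convention
X_k = R_rel * X_(k+1) + t_rel, the helper names and the example
motion are hypothetical. The translation lengths are assumed to be
known, although a relative orientation from two images of a single
camera fixes the translation only up to scale.

```python
import math


def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def rot_z(angle):
    """Rotation about the Z axis, angle in radians (used for the example only)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def accumulate_poses(relative_motions):
    """Chain relative motions into poses (R, t) relative to the first camera.

    Each pose maps camera coordinates into the frame of the first camera:
    X_first = R * X_cam + t.
    """
    rot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    pos = [0.0, 0.0, 0.0]
    poses = [(rot, pos)]
    for r_rel, t_rel in relative_motions:
        step = mat_vec(rot, t_rel)            # step expressed in the first frame
        pos = [pos[i] + step[i] for i in range(3)]
        rot = mat_mul(rot, r_rel)             # new heading
        poses.append((rot, pos))
    return poses


# Example: four steps of one unit forward, each followed by a 90 degree
# left turn, which should lead back to the starting point.
quarter_turn = rot_z(math.radians(90.0))
track = accumulate_poses([(quarter_turn, [1.0, 0.0, 0.0])] * 4)
positions = [pos for _, pos in track]
print(positions)
```

Because every pose is built on the previous one, an error in a single
relative orientation is carried forward into all later positions.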
5 RESULTS AND DISCUSSION
The results of post-processing (Sec. 4.1) and real time processing
(Sec. 4.2) differ considerably. The differences stem from the
different ways of adjustment. Bundler uses a sparse bundle
adjustment to enhance all extracted observations and approximated