Figure 1: Examples of aerial images captured from the kite-based
imaging platform (taken from approximately 15m altitude).
strated from experiments at an intertidal rock platform at Cape
Banks, Sydney, Australia in Section 3. Conclusions and future
work are discussed in Section 4.
2 METHODOLOGY
2.1 Kite-based Image Acquisition
A kite-based imaging system was built using a 2.7m wingspan
Conyne-delta kite to lift a fixed, downward-looking rig holding
a consumer-grade digital camera. The Conyne-delta kite was
chosen for its stability and lifting capacity across a wide range of
wind conditions. The camera was suspended from a Picavet rig, which
was attached to the kite line approximately 10m below the kite to
minimise wind gust-induced motion of the camera. The Picavet rig
kept the camera mechanically level as the flying angle of the kite
changed. The camera was a Sony NEX-3 with a 16mm pancake lens,
which provided a field of view of approximately 73° by 52° at a
resolution of 4592 by 3056 pixels. The camera was chosen as a
trade-off between the image quality of a full-frame digital SLR and
the low weight of a small-sensor compact digital camera. Figure 1
shows example images captured by the system at an altitude of
approximately 15m above the ground, with a coverage footprint of
approximately 22m by 15m.
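The quoted footprint can be checked from the field of view and altitude. The sketch below assumes flat ground, a camera pointing straight down and an ideal pinhole lens; the function name `ground_footprint` is illustrative and does not come from the system software.

```python
import math

def ground_footprint(altitude_m, fov_deg):
    """Ground distance covered along one image axis by a nadir-pointing
    camera with the given full field of view (flat-ground assumption)."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)

altitude_m = 15.0          # approximate altitude quoted in the text
fov_deg = (73.0, 52.0)     # approximate horizontal and vertical field of view
image_px = (4592, 3056)    # image resolution in pixels

# Footprint along each image axis, in metres.
footprint_m = tuple(ground_footprint(altitude_m, f) for f in fov_deg)

# Approximate ground sample distance (millimetres of ground per pixel).
gsd_mm = tuple(1000.0 * w / n for w, n in zip(footprint_m, image_px))

print("footprint (m): %.1f by %.1f" % footprint_m)
print("ground sample distance (mm/pixel): %.1f by %.1f" % gsd_mm)
```

Under these assumptions the footprint comes out at roughly 22m by 15m, consistent with the figure quoted above, and each pixel covers a little under 5mm of ground.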
During data collection, the kite was used to hoist the camera rig
over the area of interest, and the camera was programmed to capture
images at a rate of approximately one per second. The kite could be
flown at altitudes between approximately 10m and 100m, chosen
according to the desired area coverage and ground spatial resolution
and limited by the length of the kite line. The desired height was
achieved using distance markers on the line and by estimating the
flight angle of the kite. The kite was then walked slowly across the
terrain so that multiple overlapping images covering the entire area
of interest were captured.
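The height estimate and the resulting image overlap can be illustrated with simple geometry. This is a minimal sketch that assumes a straight kite line (no sag), a flight angle measured from the horizontal and flat terrain; the function names, the 30m line length and the 0.5 m/s walking speed are hypothetical values chosen for illustration, not figures from the experiments.

```python
import math

def camera_altitude(line_to_rig_m, flight_angle_deg):
    """Camera height above the operator, from the length of line paid out
    to the rig and the line's elevation angle (straight-line assumption)."""
    return line_to_rig_m * math.sin(math.radians(flight_angle_deg))

def forward_overlap(footprint_along_track_m, walking_speed_m_s, shot_interval_s=1.0):
    """Fraction of each image footprint shared with the next image."""
    advance_m = walking_speed_m_s * shot_interval_s
    return max(0.0, 1.0 - advance_m / footprint_along_track_m)

# Hypothetical example: 30 m of line to the rig, flown at 30 degrees elevation.
altitude_m = camera_altitude(30.0, 30.0)

# Hypothetical example: 14.6 m along-track footprint, walking at 0.5 m/s,
# with one image per second as described in the text.
overlap = forward_overlap(14.6, 0.5)

print("camera altitude: %.1f m" % altitude_m)
print("forward overlap: %.0f%%" % (100.0 * overlap))
```

At one image per second, even a brisk walking pace moves the footprint by only a small fraction of its length between frames, which is why slow walking yields heavily overlapping images.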
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B8, 2012
XXII ISPRS Congress, 25 August – 01 September 2012, Melbourne, Australia
Figure 2: Steps in Terrain Reconstruction: (a) 3D point cloud
generated from multi-view stereo, (b) surface triangulation, (c)
final photo-textured surface model.
2.2 Image Processing, Feature Extraction and Matching
After data collection, images were copied from the camera to
a desktop computer for processing. Images affected by motion blur
during wind gusts or by large occlusions of the terrain (for
example, people moving through the scene) were removed manually
before processing began. Scale-Invariant Feature Transform (SIFT)
features (Vedaldi and Fulkerson, 2008) were extracted from each image
and matched across all image pairs using a kd-tree (Beis and Lowe,
2003) of the features in each image. Feature match outliers were
robustly detected using epipolar constraints between images (Torr
and Murray, 1997). SIFT features correspond to distinctive points in
the texture of imaged surfaces and were well suited to the rocky
intertidal environment. Multi-core software implementations of these
methods were developed to process images in parallel and reduce
processing time.
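The kd-tree matching step can be sketched in a few lines. This is an illustrative toy, not the software used in the paper: it performs an exact two-nearest-neighbour search rather than the approximate best-bin-first search of Beis and Lowe, it uses 8-dimensional random vectors in place of 128-dimensional SIFT descriptors, and the nearest-to-second-nearest distance ratio test (threshold 0.8) is an assumption that the text does not state. All function names are hypothetical.

```python
import random

def build_kdtree(points, depth=0):
    """Build a kd-tree from (index, descriptor) pairs as nested tuples."""
    if not points:
        return None
    axis = depth % len(points[0][1])
    points = sorted(points, key=lambda p: p[1][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def two_nearest(node, query, best=None):
    """Return up to two (squared_distance, index) pairs nearest to query."""
    if best is None:
        best = []
    if node is None:
        return best
    (idx, desc), axis, left, right = node
    d2 = sum((a - b) ** 2 for a, b in zip(desc, query))
    best.append((d2, idx))
    best.sort()
    del best[2:]  # keep only the two closest candidates found so far
    diff = query[axis] - desc[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    two_nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # current second-best candidate (or fewer than two have been found).
    if len(best) < 2 or diff * diff < best[-1][0]:
        two_nearest(far, query, best)
    return best

def match_features(desc_a, desc_b, ratio=0.8):
    """Match descriptors in desc_a to desc_b, keeping distinctive matches."""
    tree = build_kdtree(list(enumerate(desc_b)))
    matches = []
    for i, query in enumerate(desc_a):
        nn = two_nearest(tree, query)
        # Ratio test on squared distances (assumed filter, see lead-in).
        if len(nn) == 2 and nn[0][0] < (ratio ** 2) * nn[1][0]:
            matches.append((i, nn[0][1]))
    return matches

# Synthetic stand-in for two images: the second set of descriptors is a
# slightly perturbed copy of the first.
random.seed(0)
desc_b = [[random.random() for _ in range(8)] for _ in range(50)]
desc_a = [[v + random.uniform(-0.01, 0.01) for v in d] for d in desc_b]
matches = match_features(desc_a, desc_b)
print(len(matches), "of", len(desc_a), "features matched")
```

Matches that pass this stage would still need the epipolar outlier check described above, since descriptor similarity alone does not guarantee geometric consistency between two views.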
2.3 3D Point Cloud Reconstruction
A structure-from-motion/bundle adjustment software package
(Snavely et al., 2008) was then used to incrementally construct a
3D point feature map from the matched image feature points while
simultaneously estimating the camera poses (extrinsic parameters)
and the intrinsic parameters of the camera. The reconstruction
provided a point feature map and relative camera poses with unknown
scale, absolute rotation and position within a geo-referenced
coordinate system. Ground control points corresponding to distinct
rock features that were identified in both the
Figure 3:
Botany E
of the are
of the roc
collectior
3D point
of the site
a transfor
dinates w
could ha:
held Glol
A multi-y
2010) bas
lapping i
correspoi
features :
timated d
tures and
The resul
on the lex
a small f:
feature fc
2.4 Ph
A triangi
3D point
For each
coverage
poses of
distance
centre (a
then app!
(Johnson
at the su
level-of-
(to captu
and was
fied phot
The 3D r
(a) show:
Struction
construci