International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B8, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
Figure 3: Study Site: (a) Cape Banks (shown in red box) at
Botany Bay, Sydney, Australia, (b) existing aerial photography
of the area (courtesy of Google Maps), (c) ground-based photo
of the rock platform at low tide (photo taken a week prior to
data collection).
3D point cloud and existing geo-referenced aerial photography
of the site were used via Horn's method (Horn, 1987) to compute
a transformation of the point cloud into 3D geo-referenced
coordinates with absolute scale. Alternatively, ground control
points could have been measured at the site, for example using
a hand-held Global Positioning System (GPS) receiver.
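
The geo-referencing step can be illustrated with a minimal sketch of a closed-form absolute orientation solve in the style of Horn (1987), assuming a set of matched 3D control points is already available in both coordinate frames. The function name, argument names and the use of NumPy are illustrative assumptions and are not taken from the processing pipeline described here.

```python
import numpy as np


def horn_absolute_orientation(src, dst):
    """Estimate scale s, rotation R, translation t so that dst ~ s * R @ src + t.

    src, dst: (n, 3) arrays of matched control points (hypothetical inputs),
    e.g. point-cloud coordinates and geo-referenced coordinates.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    a, b = src - src_mean, dst - dst_mean  # centred coordinates

    # Sums of products S_xy = sum(a_x * b_y), and so on.
    M = a.T @ b
    Sxx, Sxy, Sxz = M[0]
    Syx, Syy, Syz = M[1]
    Szx, Szy, Szz = M[2]

    # Symmetric 4x4 matrix whose dominant eigenvector is the unit quaternion.
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
        [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
        [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz],
    ])
    _, eigvecs = np.linalg.eigh(N)      # eigenvalues returned in ascending order
    q0, qx, qy, qz = eigvecs[:, -1]     # eigenvector of the largest eigenvalue

    # Rotation matrix from the unit quaternion (q0, qx, qy, qz).
    R = np.array([
        [q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
        [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
        [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz],
    ])

    # Least-squares scale for the recovered rotation, then the translation.
    s = np.sum(b * (a @ R.T)) / np.sum(a * a)
    t = dst_mean - s * (R @ src_mean)
    return s, R, t
```

At least three non-collinear control points are needed for a unique solution. A point `p` from the cloud would then map to geo-referenced coordinates as `s * R @ p + t`.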
A multi-view stereo reconstruction algorithm (Furukawa and
Ponce, 2010) based on the correlation score of dense patches in
the overlapping images was then used to produce a dense 3D
point cloud with a higher spatial resolution than is achievable
using SIFT features alone. This algorithm used the relative
camera poses estimated during bundle adjustment to triangulate
dense image features and robustly remove outliers from the
terrain point cloud. The resulting 3D point cloud had a spatial
density that depended on the level of texture in the
environment and was usually within a small factor of the image
pixel size (i.e. approximately one 3D feature for every 5-by-5
pixel patch on average).
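
As an illustration of a patch correlation score, the sketch below computes the normalised cross-correlation (NCC) between two equally sized grey-level patches. The text above states only that a correlation score of dense patches is used, so the choice of NCC, the function name and the list-of-rows patch format are assumptions made for this example.

```python
import math


def ncc(patch_a, patch_b):
    """Normalised cross-correlation of two equally sized grey-level patches.

    Patches are lists of rows (hypothetical format). The score lies in
    [-1, 1] and is unchanged by gain and offset changes in brightness.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("patches must be non-empty and the same size")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]

    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a textureless patch carries no correlation signal
    return sum(x * y for x, y in zip(da, db)) / denom
```

The zero-denominator case reflects the dependence on texture noted above: a uniform patch cannot be matched reliably, so fewer 3D points are recovered in such regions.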
2.4 Photo-textured Terrain Model and Visualisation
A triangulated terrain surface model was constructed from the
3D point cloud using Delaunay triangulation (Barber et al.,
1996). For each face of the surface, the images covering that
face were identified using the estimated relative poses of each
camera. The images were ranked by the distance between the
surface point and the camera centre (and thus by image
resolution at that point). A photo-texture was then applied to
each face using band-limited blending (Johnson-Roberson et al.,
2010) of the four closest image patches at the surface face.
The final 3D model was visualised using a level-of-detail
rendering system (Johnson-Roberson et al., 2010) (to capture
sub-centimetre details over the entire span of the map) and was
additionally used to construct an orthographically rectified
photo-mosaic of the area.
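
The distance-based view ranking can be sketched as follows. This is a minimal illustration that assumes the images covering the face have already been identified and that each is represented by its camera centre; the function name, the dictionary layout and the use of the face centroid as the reference point are assumptions. The blending step itself is not shown.

```python
import math


def rank_views_for_face(face_vertices, camera_centres, max_views=4):
    """Return the ids of the cameras closest to a mesh face.

    face_vertices: three (x, y, z) vertices of the face.
    camera_centres: dict of image id -> (x, y, z) camera centre, assumed
        to contain only images that actually cover the face.
    max_views: number of images kept for blending (four in the text).
    """
    n = len(face_vertices)
    centroid = tuple(sum(v[i] for v in face_vertices) / n for i in range(3))

    # Shorter camera-to-surface distance implies finer image resolution.
    ranked = sorted(
        camera_centres.items(),
        key=lambda item: math.dist(centroid, item[1]),
    )
    return [image_id for image_id, _ in ranked[:max_views]]
```

For example, with five candidate cameras at different ranges, only the four nearest would be passed on to the blending stage.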
The 3D model building process is illustrated in Figure 2.
Figure 2 (a) shows the initial 3D point cloud after multi-view
stereo reconstruction is applied. Figure 2 (b) illustrates the
3D surface mesh constructed from triangulation. Figure 2 (c)
shows the final 3D photo-textured model. The terrain
reconstruction algorithms benefited from the large degree of
overlap in the imagery; the view selection and band-limited
blending allowed only the best images of a given surface to be
used in the final model, providing leeway for images taken from
poor angles or with disturbances or occlusions such as
shadowing.
Figure 4: Photo-mosaic Reconstruction of an Intertidal Rockflat:
(a) Existing aerial photography of the area (courtesy of Google
Maps), (b) constructed photo-mosaic using kite-based images and
processing pipeline.
3 RESULTS AND DISCUSSION
3.1 Experimental Setup
Experiments were performed over an intertidal rock platform at
Cape Banks (34.000°S, 151.249°E) on the north edge of Botany
Bay, Sydney, Australia (see Figure 3). The site lies within a
national park aquatic reserve and is host to various intertidal
species such as micro- and macro-algae, gastropods, snails and
cunjevoi.
Data collection was performed during low tide on a clear sunny
day around midday to maximise image quality. Images were
captured continuously at an altitude of approximately 15-20 m
as the kite line was walked across a 100 m by 20 m section of
the rock platform. The time taken to acquire images across the
platform was approximately five minutes.
2770 of the collected images were processed using the photogram-
metric processing pipeline described above. The entire process-