The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B5. Beijing 2008
The motion analysis techniques were initially developed to
identify candidates for counting and sizing fish in aquaculture
(Harvey et al., 2004). Motion analysis is first used to identify
sections of the image sequences that contain features of interest,
effectively eliminating portions of the video that are devoid of
features and not of direct interest to habitat mapping. This
processing acts as a form of image compression that
dramatically reduces the number of video sequences requiring
inspection, and reduces digital video file sizes. The motion
analysis is then used to estimate the percentage cover of
selected regions within the video transects. The motion
detector can be tuned to detect featureless versus feature-rich
regions, or specific marine fauna or flora.
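As a minimal sketch of how percentage cover and the selection of feature-bearing frames could be computed, the following assumes each frame has already been reduced to a boolean detection mask (True where a pixel belongs to a detected feature). The function names and the `min_cover` cut-off are hypothetical and are not taken from the system described here.

```python
def percent_cover(mask):
    """Percentage of pixels flagged True in one frame's detection mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    flagged = sum(1 for row in mask for value in row if value)
    return 100.0 * flagged / total


def frames_of_interest(masks, min_cover=1.0):
    """Indices of frames whose detected cover reaches min_cover percent.

    min_cover is an assumed, operator-chosen cut-off used only for illustration.
    """
    return [i for i, mask in enumerate(masks) if percent_cover(mask) >= min_cover]
```

Frames that fall below the cut-off would be the ones dropped from inspection, which is where the reduction in video volume comes from.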
The fundamental algorithm of the motion detector uses
differences in intensity between consecutive frames. The most
common approach recognises differences in colours based on
thresholds and gains (Cheng et al., 2001; Ohta et al., 1980). A
pixel is detected as a change if the difference between
consecutive frames, multiplied by the gain, exceeds the
threshold. Gains are used to amplify subtle differences and
detect changes that would otherwise be below the threshold.
Specific locations in the colour space of the images are used to
identify the objects of interest. An operator will select these
depending on the feature or species to be detected. The
detected candidate regions are discriminated from noise using a
region size range specified by the operator.
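The detection rule and the size-based noise filter described above can be sketched as follows. This is an illustration under simplifying assumptions, not the detector used in the system: frames are single-channel intensity grids (lists of rows) rather than colour images, regions are 4-connected, and all function and parameter names are hypothetical.

```python
from collections import deque


def detect_changes(prev_frame, curr_frame, gain=1.0, threshold=30.0):
    """Flag pixels where the gained frame-to-frame difference exceeds the threshold."""
    return [
        [abs(curr - prev) * gain > threshold for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def connected_regions(mask):
    """Group flagged pixels into 4-connected regions, each a list of (row, col)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def filter_by_size(regions, min_size, max_size):
    """Keep regions whose pixel count lies in the operator-specified range."""
    return [region for region in regions if min_size <= len(region) <= max_size]
```

With a difference of 10 grey levels and a threshold of 30, a gain of 1 detects nothing, whereas a gain of 4 lifts the same difference above the threshold. This is the amplification of subtle changes that the gain is intended to provide.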
Region growing is subsequently used either to complete the
outlines of candidate features detected with motion analysis, or
to grow the outline of a feature manually selected
by an operator (Adams and Bischof, 1994). The region
growing algorithm can be configured to use colour, colour
statistics and texture, which are the most readily identified
visible signatures of benthic communities and sessile organisms.
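To illustrate the idea of growing a region from a seed, the sketch below accepts a neighbouring pixel when its intensity lies within a tolerance of the current region mean. It is a simplified variant, not the full seeded region growing algorithm of Adams and Bischof (1994), and it uses intensity only rather than colour statistics or texture. The `tolerance` parameter and function name are assumptions made for the example.

```python
from collections import deque


def grow_region(image, seed, tolerance=10.0):
    """Grow a 4-connected region from seed over a grid of intensities.

    A neighbour joins the region if its intensity is within `tolerance`
    of the running mean intensity of the pixels already in the region.
    """
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = float(image[seed[0]][seed[1]])
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in region:
                continue
            mean = total / len(region)
            if abs(image[nr][nc] - mean) <= tolerance:
                region.add((nr, nc))
                total += image[nr][nc]
                frontier.append((nr, nc))
    return region
```

The seed could come either from a candidate region found by motion analysis or from a point picked by the operator, matching the two uses described above.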
Figure 6. Example of a 3D measurement of surface area using a
triangulation mesh.
It is also possible to use stereo-image matching to determine
volumes and surface areas of complex structures such as
animals or physical features. This process is semi-automatic
with the region of interest in one of the images defined initially
by motion analysis processing. Operator selection of key points
followed by epipolar searching and image matching (Gruen and
Baltsavias, 1988) is then used to provide additional 3D
locations within the boundary on the left and right images. The
3D data points are used to define the surface based on a
Delaunay triangulation, from which surface area and volume
can be derived (see figure 6). An accumulation of such
measurements can be used as an estimator of the biomass of
particular features or species of interest within a transect. A
critical factor in the effectiveness and robustness of the
algorithms will be the improvement of image quality and
resolution to be provided by the digital progressive scan
cameras and direct-to-disk system. As can be seen from figures
7 and 8, the image quality of the standard video system and
the general reduction in image contrast caused by attenuation
through the multiple refractive interfaces and the water medium are
limiting factors.
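The derivation of surface area and volume from a triangulated set of 3D points can be sketched as follows. The example assumes the triangulation has already been computed and is supplied as triples of point indices; it does not perform the Delaunay triangulation itself. The volume is taken between the mesh and the reference plane z = 0, which is an assumption made for illustration and presumes a single-valued surface with non-negative heights.

```python
import math


def triangle_area(a, b, c):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    u = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    v = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return 0.5 * math.sqrt(sum(k * k for k in cross))


def mesh_surface_area(points, triangles):
    """Sum of triangle areas over the mesh."""
    return sum(triangle_area(points[i], points[j], points[k]) for i, j, k in triangles)


def volume_above_plane(points, triangles):
    """Volume between the mesh and z = 0, summing one prism per triangle.

    Each prism volume is the triangle's footprint area in the xy-plane
    multiplied by the mean height of its three vertices.
    """
    volume = 0.0
    for i, j, k in triangles:
        a, b, c = points[i], points[j], points[k]
        footprint = 0.5 * abs(
            (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
        )
        volume += footprint * (a[2] + b[2] + c[2]) / 3.0
    return volume
```

Summing such per-feature areas or volumes along a transect gives the kind of accumulated measurement that could serve as a biomass estimator.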
5. APPLICATIONS
The vast majority of the deep seabed is not mapped in detail,
although acoustic multi-beam technology and photographic
methods are increasingly providing data for key areas (Kloser et
al., 2007). A primary contribution of video data to multi-scale
surveys of the seabed is the definition of habitat at fine scales.
Video transects can be used to target contrasts in acoustic maps
to validate changes between habitats (see figure 7). Information
on the biological associations with physical components of
habitats enables mapped acoustic data, which have large coverage
and are relatively inexpensive to collect, to be used as a proxy for
the distribution of biodiversity (Kloser et al., 2007). Based on
analysis of the video sequences, abundance measures such as
density or cover can be made at a variety of scales of biological
resolution, and can be related to habitat types at a variety of
spatial scales (Williams et al., 2007). A key step in the use of
image data in deep water habitat mapping is the move from
qualitative to quantitative applications.
Figure 7. Fine scale habitat identification by video within
terrains defined by multi-beam acoustics.
The non-extractive nature of video sampling gives it a
significant advantage over conventional physical sampling with
an epibenthic sled or trawl, particularly for monitoring. While
biodiversity mapping relies on initial physical collection to
provide an inventory of fauna, sensitive environments such as
seamount coral communities (figure 8) benefit greatly from
subsequent monitoring that is non-extractive, especially in
conservation areas. Video surveys will never replicate the
species-level resolution possible from collections of benthic
fauna, but it is often possible to capture data for distinctive
species.
Where species have strong habitat associations and habitats
have high spatial heterogeneity at scales of tens to hundreds of
metres, video sampling will also provide more robust measures
of abundance because the data are continuous and do not
integrate across habitats. In contrast, samples from mobile
collecting devices such as sleds or trawls do integrate across
habitats, mixing the fauna and adding considerable uncertainty