International Archives of Photogrammetry and Remote Sensing, Vol. 32, Part 7-4-3 W6, Valladolid, Spain, 3-4 June, 1999
more different phenomena are described and need to be
explained.
2.3. Dataset
To illustrate the main steps of the proposed approach, we use
a dataset collected over Ocean City, Maryland, on April 25
and 30, 1997 (http://polestar.mps.ohio-state.edu/~csatho/wg35.htm).
Ocean City is located on a narrow peninsula, with sandy
beaches on the east coast and harbors and docks on the west.
High-rise buildings on the east side and residential areas on
the west flank the main road. The
dataset comprises aerial photography, multispectral scanner
imagery, and scanning laser data. As a part of the laser
altimetry data, precise GPS positions and INS attitude have
also been recorded. Csatho et al. (1998) describe the dataset
in more detail.
Digital elevation data were acquired by the Airborne
Topographic Mapper (ATM) laser system. The ATM is a
conical scanning laser altimeter developed by NASA to
measure the surface elevation of ice sheets and other natural
surfaces, such as beaches, with ca. 10 cm accuracy (Krabill et
al., 1995). The multispectral data were collected by the
Daedalus AADS-1260 airborne multispectral scanner from
the National Geodetic Survey (NGS). The AADS-1260 is a
multispectral line scanner with eleven spectral bands in the
visible, near infrared and thermal infrared providing 1.8-2.5
m pixels on the ground. Aerial photography was acquired
with an RC20 camera, also from NGS. The laser scanner and
the multispectral scanner were mounted on NASA's P-3B
aircraft. The aerial camera was operated independently by
NGS, but on the same day.
The original data have been preprocessed. For example, the
GPS, INS, and laser range data were converted into surface
elevations of laser footprints. The aerial photographs were
scanned with a pixel size of 28 microns and an aerial
triangulation with control points established by GPS provided
the exterior orientation parameters. However, due to the lack
of instrument calibration data, we could not transform the
multispectral dataset into ground reflectance to establish a
match with library spectra.
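The conversion of GPS, INS, and laser range measurements into footprint elevations mentioned above amounts to direct georeferencing: the measured range is rotated from the sensor frame into a local level frame using the INS attitude and added to the GPS antenna position. The sketch below illustrates this geometry for a single shot of a conical scanner; the function name, the frame conventions, and the omission of lever-arm and mounting-bias corrections are simplifying assumptions, not the actual ATM processing chain.

```python
import numpy as np

def footprint_position(gps_xyz, roll, pitch, yaw, scan_angle, slant_range):
    """Combine GPS position, INS attitude, and laser range into the
    coordinates of one laser footprint (simplified sketch).

    Angles are in radians; gps_xyz is the sensor position in a local
    level frame (x east, y north, z up)."""
    # A conical scanner fires at a fixed off-nadir angle; this is the
    # unit pointing vector of the laser in the sensor frame.
    p_sensor = np.array([np.sin(scan_angle), 0.0, -np.cos(scan_angle)])

    # Elementary rotations for roll (x-axis), pitch (y-axis), yaw (z-axis).
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])

    # Rotate the pointing vector into the level frame and extend it
    # by the measured slant range from the GPS position.
    p_level = Rz @ Ry @ Rx @ p_sensor
    return np.asarray(gps_xyz) + slant_range * p_level

# Level flight at 500 m, laser pointing straight down:
xyz = footprint_position([0.0, 0.0, 500.0], 0.0, 0.0, 0.0, 0.0, 500.0)
# the footprint lands at ground level directly below the sensor
```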
3. PROPOSED SYSTEM FOR MULTISENSOR OBJECT
RECOGNITION
Figure 1 depicts a schematic diagram of a data fusion
architecture that is tailored for combining aerial and
multispectral imagery, and laser scanning data for object
recognition in urban scenes. It includes modules that are
considered important in model based object recognition.
The overall goal is to extract and group those features from
the sensory input data that lead to a rich data model. This, in
turn, permits modeling the objects with additional attributes
and relationships so that the differences between data and
object model become smaller, resulting in a more stable and
robust recognition.
[Figure 1: schematic with aerial, laser, and multispectral input streams feeding the fusion modules]
Fig. 1. Combination of sensor fusion and object recognition.
A common approach to fuse multispectral imagery and laser
scanning data is to convert the latter into a range image, to
add it as a separate channel to the multispectral data and to
classify this combined set of data. This elegant approach is
based on the assumption that the classes correspond to
objects. Our approach is radically different, as we are
skeptical that object recognition can be satisfactorily
solved by classification alone. Rather, we advocate extracting
features from the individual sensors and fusing them at
appropriate levels, following the general rules described in
the previous section.
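The "common approach" criticized above can be made concrete in a few lines: the range image is appended to the multispectral bands as an extra channel, and each pixel's stacked vector is handed to an ordinary per-pixel classifier. The sketch below shows only the channel stacking; the image sizes, band count, and random data are illustrative placeholders, not the Ocean City dataset layout.

```python
import numpy as np

# Hypothetical inputs: an 11-band multispectral image and a laser
# range image resampled to the same pixel grid.
rng = np.random.default_rng(0)
multispectral = rng.random((64, 64, 11))  # 11 spectral bands
range_image = rng.random((64, 64, 1))     # elevation as one channel

# Append the range image as a 12th channel, then flatten the scene
# into per-pixel feature vectors for a conventional classifier.
stacked = np.concatenate([multispectral, range_image], axis=2)
pixels = stacked.reshape(-1, stacked.shape[2])  # shape (N, 12)
```

In this scheme the classifier sees elevation merely as one more spectral-like value per pixel, which is exactly why the implicit assumption that classes correspond to objects becomes critical.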
It has long been realized that surfaces play an important role
in object recognition. However, to be useful, surface
information must be explicit and suitably represented. This
entails extracting surface features from laser scanning data
and aerial imagery followed by fusing them, leading to the