Paolo Gamba
The algorithms presented in this paper were originally applied to the IFSAR DSM of a part of the Los Angeles urban
area composed of isolated financial buildings (Gamba and Houshmand, 1999), disregarding the problems due to
shadowing/layover effects. In this paper we focus on a more densely built area, with trees, roads and a more complex
topography. Moreover, we use LIDAR measurements, which provide a more accurate representation of the urban three-dimensional
geometry and an urban Digital Surface Model with cm-level accuracy.
2 THE DSM OBJECT EXTRACTION ALGORITHM
One of the open problems of remote sensing data interpretation, especially when dealing with 3D (also
called range) images, is undoubtedly how to segment the data into significant parts. This is usually accomplished by looking within
the data itself for some kind of common features (classification) or by comparing data subsets with known structure
models for the area (object recognition). The dilemma is between a rapid recognition process (which usually discards
minor details and retrieves simple geometrical characteristics) and precise classification (where every pixel is assigned to
a cluster).
However, different kinds of data often correspond to different problems (for instance, noise sources) and different
retrieval algorithms. In urban areas this is especially true, because of the strong interaction among the geometric,
dielectric and topographic characteristics of the objects involved. In this situation it is clear that the best results can be
obtained only by using additional information, such as that already available in a GIS of the same area or in another
kind of data (Haala, 1999). However, even in large populated areas it is difficult to have such data readily at hand, or
even available at all. This is why a number of approaches deal with the fusion of data coming from different sources,
possibly mounted on the same platform.
In this work we introduce an algorithm based on machine vision techniques for surface retrieval and detection,
relying only on the (a priori known) geometrical properties of the structures to be identified. With such an approach,
different sources of imprecision and noise may be treated in the same way, as errors in the range data acquisition. Even
if we do not exploit the noise characteristics to discard it, we remove its effects on the geometrical structures by
recovering from the distortion that it introduces. For instance, this algorithm may be very useful to enhance the 3D
geometry retrievable from interferometric SAR measurements. In this case, the so-called layover and shadowing effects,
due to multiple bounces of the backscattered electromagnetic pulse, usually produce relevant changes in the 3D
structure of the buildings in a crowded urban environment. Still, the algorithm introduced in Gamba and Houshmand,
1999 and Gamba et al., 2000 succeeds in retrieving to a large extent the main objects in such an environment. In the
following we give a quick outline of the procedure, trying to highlight the advantages of our proposal for LIDAR data
analysis as well as the drawbacks still to be overcome. The algorithm is based on the fast segmentation procedure outlined in
Jiang and Bunke, 1994, even if it provides a different way to aggregate range points and has been improved to cope with remotely
sensed data.
As a final comment before presenting the algorithm, we should note that it requires a regular data grid, and works on
lines (or columns) of this grid. While this allows a simple and intuitive way to aggregate pixels into segments (the first step of the
procedure), it has all the drawbacks of a discretization process, especially when looking for building footprints. Moreover, LIDAR
data, which are naturally sparse and not regularly spaced, must be pre-processed before entering the algorithm.
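The paper does not detail this pre-processing step; as a minimal sketch (not the authors' actual procedure), the scattered returns can be resampled onto a regular raster by averaging the heights of all points falling into each cell. The function name and the `cell_size` parameter are hypothetical.

```python
import numpy as np

def grid_lidar(x, y, z, cell_size):
    """Average scattered LIDAR returns (x, y, z) into a regular height grid.

    Cells receiving no return are left as NaN, to be filled by a later
    interpolation step if needed.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    # Map each point to a (row, col) cell of the raster.
    cols = ((x - x.min()) / cell_size).astype(int)
    rows = ((y - y.min()) / cell_size).astype(int)
    grid_sum = np.zeros((rows.max() + 1, cols.max() + 1))
    grid_cnt = np.zeros_like(grid_sum)
    # Unbuffered accumulation: correct even when several points share a cell.
    np.add.at(grid_sum, (rows, cols), z)
    np.add.at(grid_cnt, (rows, cols), 1.0)
    with np.errstate(invalid="ignore"):
        dsm = grid_sum / grid_cnt  # NaN where grid_cnt == 0
    return dsm
```

Simple cell averaging keeps the sketch short; in practice a nearest-neighbour or surface-interpolation scheme may be preferred to preserve sharp building edges.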
The data flow may be subdivided into five steps.
1. As noted above, the regular data grid is first subdivided into segments, working by rows or columns. To this aim, the algorithm
   introduces two parameters (see Table 1) which rule the process. The first one defines the maximum height difference
   allowed between successive points in the same segment. The second one, instead, provides the lower limit on the
   segment length. Both of them must be carefully chosen, in order to avoid excessive segmentation (and more difficult
   aggregation in the following steps). Moreover, from the opposite point of view, we should not low-pass filter the data too
   heavily (as happens with thresholds allowing long segments), because we may lose important information hidden in the data. As usual,
   this leads to a compromise, which must also take into account the data resolution and the mean dimension of the objects to be
   extracted.
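This first step can be sketched as follows; the code is an illustration under our reading of the text, not the authors' implementation, and the names `max_dz` and `min_len` stand in for the two thresholds of Table 1.

```python
def split_scan_line(heights, max_dz, min_len):
    """Split one grid row (or column) of heights into segments.

    A segment is broken wherever the height jump between successive
    points exceeds max_dz; segments shorter than min_len points are
    discarded. Returns (start, end) index pairs, end inclusive.
    """
    segments = []
    start = 0
    for i in range(1, len(heights)):
        if abs(heights[i] - heights[i - 1]) > max_dz:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = i  # begin a new candidate segment
    # Close the trailing segment.
    if len(heights) - start >= min_len:
        segments.append((start, len(heights) - 1))
    return segments
```

For example, on the row `[1, 1, 1, 5, 5, 5, 5]` with `max_dz=1` and `min_len=3` this yields two segments, the flat ground run and the roof run; raising `min_len` to 4 suppresses the first, illustrating how an over-strict length threshold discards information.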
2. Once the segments have been identified, they are compared two at a time, to detect areas where collinear couples are present.
   These provide the seeds of the planar patches that we want to define. Here the parameter ruling the process (see Table 1) is the minimum
   similarity value that must be reached to allow two segments to form a seed. The choice suggested by the original algorithm in Jiang
   and Bunke, 1994 is to take the similarity between the i-th and j-th segment as measured by
International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000. 313