Stefan Hinz
features at the lowest level of the road model, i.e., road markings and lanes are extracted separately in each fine scale
image. After a first validation, which also includes the detection of vehicles, the results of the different images are fused
into larger road segments (possibly consisting of several lanes). During the next step junctions and connection hypotheses
between the road segments are generated and validated in each image. The results are fused again, and the next iteration
starts. The extraction is complete when no more connections can be formed or verified.
The next section describes the existing approaches to road extraction that are most relevant to our work. After a short discussion
of the appearance of roads in urban areas, our road model is introduced in Sect. 3. In Sect. 4 the extraction strategy is
outlined in detail and results of the currently implemented modules are shown. Finally, the results are discussed and
conclusions for our future work are drawn (Sect. 5).
2 RELATED WORK
Compared to the intense research activities on the extraction and visualization of buildings, site models, or city models
(see (Mayer, 1999) and (Forstner, 1999) for an overview), only a few groups work on the automatic extraction of roads
in urban environments. One reason for this is that most past and current efforts in road extraction, including our
own previous work, rely on road models that describe the appearance of roads in rural terrain. Depending on the image
resolution, roads are modeled as line-like structures (Wiedemann and Hinz, 1999, Heller et al., 1998, Gruen and Li,
1997, Geman and Jedynak, 1996) or relatively homogeneous areas satisfying certain shape and size features (Harvey,
1999, Zhang and Baltsavias, 1999, Baumgartner et al., 1999, Mayer et al., 1998, Trinder and Wang, 1998). Throughout
all the different approaches some issues have proved to be of great importance: The fusion of different scales helps to
eliminate isolated disturbances on the road while the fundamental structures are emphasized (Mayer and Steger, 1998).
Exploiting the network characteristics adds global information and, thus, the selection of the correct hypotheses becomes
easier (Wiedemann, 1999, Heller et al., 1998). The integration of context helps to cope with the influence of background
objects like trees and buildings on the appearance of roads. In the following, we discuss three selected approaches, each
containing promising elements and each having significant influence on our concept:
(Vosselman and de Gunst, 1997) and (de Gunst, 1996) use a detailed and scale-dependent road model for updating
outdated road maps. According to road construction guidelines, they model roads and freeways as an aggregation of
different lanes. Bright markings separate the individual lanes. Intersecting roads form junctions of different types (crossings,
Y-junctions, and fly-overs). In order to detect changes in the road network, in a first step, the old information is validated
using road features extracted from medium or small scale aerial images (about 0.4 m / 1.6 m resolution). Then, in case
of inconsistencies, a change in the road width, e.g. an additional lane, or a junction where a new road branches off is
hypothesized. Again, extracted road features verify the hypotheses. With this strategy it is possible to detect both changes in
existing roads and completely new roads. They use, however, rather simple thresholding and profile matching techniques
for the extraction of road features. The approach is consequently very sensitive to disturbances like cars, shadows, or
occlusions. Hence, this system seems applicable only to roads and freeways in open and rural areas, though the concept
is more generic.
In the approach of (Ruskoné, 1996, Ruskoné et al., 1994, Airault et al., 1994), the interpretation of local context is used
to verify an automatically extracted road network. The extraction starts with detecting seed points in a medium-scale image
(about 0.5 m resolution), followed by low-level road tracking along homogeneous elongated areas. Then, hypotheses
for the connection of the extracted road parts are generated and checked based on geometric criteria like distance and
direction. The resulting road network is geometrically adjusted using snakes. For verification, the network is split into
smaller pieces. A supervised classification assigns one of the labels "road", "crossing", "shadow", "tree", or "field" to each
piece. These so-called local contexts are validated using several geometric and radiometric criteria. In urban areas, where
techniques based on road profiles or road sides may yield erratic results, the validation is done by detecting cars: a
neural network classifier extracts car-like patterns, which are then grouped into convoys (Ruskoné et al., 1996).
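The generation of connection hypotheses from geometric criteria can be illustrated with a minimal sketch. It is not the implementation of the cited authors: the text above names only distance and direction as criteria, so the end-point gap measure, the undirected angle comparison, the function names, and the thresholds `max_gap` and `max_angle` are all assumptions made for illustration.

```python
import math

def segment_direction(seg):
    """Undirected direction of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def direction_difference(a, b):
    """Smallest angle in degrees between two undirected directions."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def connection_hypotheses(segments, max_gap=30.0, max_angle=20.0):
    """Return (i, j, gap) for every pair of segments whose closest end
    points are at most max_gap apart and whose directions differ by at
    most max_angle degrees. Both thresholds are hypothetical values."""
    hypotheses = []
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            # Gap = shortest distance between any two end points.
            gap = min(math.dist(p, q)
                      for p in segments[i] for q in segments[j])
            angle = direction_difference(segment_direction(segments[i]),
                                         segment_direction(segments[j]))
            if gap <= max_gap and angle <= max_angle:
                hypotheses.append((i, j, gap))
    return hypotheses

# Hypothetical road parts given as pairs of end points (pixel coordinates).
segments = [
    ((0.0, 0.0), (100.0, 0.0)),
    ((120.0, 2.0), (200.0, 5.0)),     # nearly collinear continuation
    ((100.0, 50.0), (100.0, 150.0)),  # perpendicular and farther away
]
hypotheses = connection_hypotheses(segments)
```

In this example only the first two road parts are paired, since the third one fails both criteria. A real system would additionally check the hypotheses against image evidence, which the sketch omits.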
The approach of (Price, 1999) is designed specifically for the extraction of urban street grids from medium and high resolution
images (about 0.8 m to 0.2 m resolution). The road network is modeled as a combination of grids with a rather regular mesh
size, the size of a single building block. Junctions, the nodes of the grid, are connected by individual road segments
of approximately constant width and height. A human operator initializes the grid spacing and orientation by manual
selection of three points specifying the first mesh. Then, the grid is iteratively expanded by adding new meshes. In each
iteration, the new road segments are refined and evaluated by simultaneously matching their sides to image edges. Thus,
longer portions of the road sides must be visible in at least one of the overlapping images. During final verification,
height information and contextual knowledge are used to adjust the position of several consecutive road segments and to
remove short road portions. To do this, the adjusted segments are evaluated anew, now taking each segment's direct
neighbors into account.
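The geometry of such a grid initialization can be sketched as follows. This is only an illustration under stated assumptions, not the method of (Price, 1999): we assume that the three manually selected points are one junction and its two neighboring junctions, so that they define an origin and two spacing vectors, and we expand the grid purely combinatorially. The refinement and evaluation of each new road segment against image edges is left out entirely, and all function names are hypothetical.

```python
def grid_junction(p0, p1, p2, i, j):
    """Position of grid node (i, j), assuming p0 is the origin junction
    and p1, p2 are its neighbors along the two grid directions."""
    ux, uy = p1[0] - p0[0], p1[1] - p0[1]
    vx, vy = p2[0] - p0[0], p2[1] - p0[1]
    return (p0[0] + i * ux + j * vx, p0[1] + i * uy + j * vy)

def mesh_corners(p0, p1, p2, i, j):
    """The four junctions bounding mesh (i, j), in circular order."""
    offsets = ((0, 0), (1, 0), (1, 1), (0, 1))
    return [grid_junction(p0, p1, p2, i + di, j + dj) for di, dj in offsets]

def expand(meshes):
    """One expansion step: add the four edge neighbors of every mesh.
    A real system would keep only meshes supported by image evidence."""
    expanded = set(meshes)
    for i, j in meshes:
        expanded.update({(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)})
    return expanded

# Hypothetical operator input: three points specifying the first mesh.
p0, p1, p2 = (10.0, 20.0), (110.0, 20.0), (10.0, 100.0)
first_mesh = mesh_corners(p0, p1, p2, 0, 0)
meshes = expand({(0, 0)})
```

Here `first_mesh` holds the four corner junctions of the initial building block, and one call to `expand` yields the initial mesh plus its four neighbors as candidates for the next iteration.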
Each of the above approaches contributes promising elements of a road model and an extraction strategy for different types of
scenes. The varying appearance of roads in complex scenes can be captured by a detailed model for a road and its
406 International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000.