the vast majority of road signs. The group of relevant road signs
also excluded signs at crossroads that did not
face the mapped roads. The designed algorithms were
applied in the distance range from 4 to 14 m. The first test
with winter imagery yielded an automatic detection of 89% of
all relevant road signs and a correct automatic classification of
82% (see Table 1). Based on the summer imagery, 91% of all
relevant road signs could be detected automatically, and 89% could be
classified correctly. With an additional user-supported step, the
classification accuracy could be increased by another 5%. Due
to this user-supported approach and some further built-in
constraints, there were hardly any false positives.
There are different reasons for an incorrect detection or
classification of road signs. The detection process yields many
red segments close to construction areas due to safety fences
and warning devices. If the areas of these color segments are not
too big, they can lead to false positives. Although the road signs
in Switzerland are generally in good condition, a few of
them are yellowed, which results in very low values for the
saturation component. The same applies to road signs
located in shadows. Since the defined threshold for
this component is not reached, such road
signs cannot be detected. In addition, automatic detection is
difficult if the depth maps are poor or
incomplete. For several road signs, no predefined
template exists, which leads to a missing or wrong classification. A
suboptimal threshold for the search image binarization can
result in a correlation coefficient that is too low.
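The effect of the binarization threshold on the correlation coefficient can be illustrated with a minimal sketch. It assumes that classification compares a binarized search patch with a binary template using the Pearson correlation coefficient; the function names, the 4x4 template, the gray values and both thresholds are hypothetical and not taken from the described implementation.

```python
import math

def binarize(gray, threshold):
    """Map a grayscale patch (rows of 0-255 values) to 0/1 with a fixed threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]

def correlation_coefficient(a, b):
    """Pearson correlation coefficient between two equally sized 2D patches."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant patch carries no shape information
    return sxy / math.sqrt(sxx * syy)

# Hypothetical binary template: a bright bar on a dark background.
template = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

# Hypothetical grayscale search patch showing the same bar, dimly lit.
patch = [
    [40, 50, 45, 40],
    [120, 130, 125, 110],
    [115, 135, 120, 105],
    [35, 45, 50, 40],
]

# A threshold between background and bar reproduces the template.
good = correlation_coefficient(binarize(patch, 80), template)
# A threshold that is too high cuts away parts of the bar.
poor = correlation_coefficient(binarize(patch, 118), template)
print(f"well-chosen threshold: {good:.2f}, suboptimal threshold: {poor:.2f}")
```

With the higher threshold, three of the eight bar pixels fall below the cut, so the binarized patch no longer matches the template and the coefficient drops, although the sign itself is unchanged.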
Campaign  Signs     Quantity  Detection  Classification  False positives
Winter    all       152       55%        47%             2
          relevant   65       89%        82%
Summer    all        96       71%        64%             4
          relevant   46       91%        89%
Table 1. Detection and classification quality of the developed
algorithms for two test campaigns
For the evaluation of the geometric accuracy, 3D positions for
22 reference road signs were determined using precise
tachymetric observations. For the first test campaign, the
differences between the 3D positions which were automatically
derived by the described algorithms and the reference positions
were computed. The maximum residual for a single component is 16
cm; however, most differences are around 5 cm (see
Table 2). For the empirical standard deviation of the 3D
position difference, a value of 9.5 cm was calculated.
in mm      Δacross  Δalong  Δheight  Δ3D
Mean         -36      23      -36     86
Maximum      152     146      157    159
Std. dev.     46      64       53     95
Table 2. Mapping accuracy of the developed algorithms for a
winter test campaign
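The statistics in Table 2 can be reproduced in principle with a short sketch like the following. It assumes that mapped and reference positions are already expressed in a local frame with across, along and height components, and it computes the dispersion as the root mean square of the differences, which is only one possible reading of "empirical standard deviation". The function name and the two sample points are hypothetical and are not measurements from the test campaign.

```python
import math

def difference_stats(mapped, reference):
    """Per-component and 3D difference statistics in millimetres.

    mapped, reference: lists of (across, along, height) tuples in metres.
    """
    diffs = [
        tuple((m - r) * 1000.0 for m, r in zip(mp, rf))
        for mp, rf in zip(mapped, reference)
    ]
    # Append the 3D distance as a fourth column.
    rows = [d + (math.sqrt(sum(c * c for c in d)),) for d in diffs]
    stats = {}
    for i, name in enumerate(("across", "along", "height", "3D")):
        col = [row[i] for row in rows]
        stats[name] = {
            "mean": sum(col) / len(col),
            "max": max(abs(v) for v in col),
            # Assumption: dispersion taken as root mean square of the differences.
            "rms": math.sqrt(sum(v * v for v in col) / len(col)),
        }
    return stats

# Hypothetical sample: two signs, coordinates relative to their reference points.
mapped = [(0.03, 0.00, 0.04), (0.00, -0.04, 0.03)]
reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

stats = difference_stats(mapped, reference)
for name, s in stats.items():
    print(f"{name:>6}: mean {s['mean']:7.1f}  max {s['max']:6.1f}  rms {s['rms']:6.1f}")
```

Signed component means show systematic offsets, while the maximum and the RMS describe the spread; the 3D column is always non-negative because it is a distance.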
5. CONCLUSIONS AND OUTLOOK
The investigations demonstrate the potential, in terms of
automation and accuracy, offered by stereovision-based mobile
mapping, if dense depth information is exploited.
Approximately 90% of the relevant road signs with
predominantly red, blue and yellow colors in Switzerland can
be detected, and 85% can be classified correctly. By means of a
user-supported approach (Cavegn & Nebiker 2012), these rates
can be increased by another 5%. Therefore, only 5 to 10% of
the road signs have to be digitized either interactively in the
stereo imagery or on site. Moreover, due to various constraints
built into the algorithms, there are hardly any false positives.
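The share of signs that remains for manual digitization follows from simple arithmetic, assuming that "another 5%" denotes percentage points added to the automatic rates. The symbols r_det and r_cls are introduced here for illustration only:

```latex
\begin{align*}
r_{\text{det}} &\approx 90\% + 5\% = 95\% &\Rightarrow\quad 100\% - r_{\text{det}} &\approx 5\% \\
r_{\text{cls}} &\approx 85\% + 5\% = 90\% &\Rightarrow\quad 100\% - r_{\text{cls}} &\approx 10\%
\end{align*}
```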
The presented approach is robust in terms of scaling,
translations and small rotations. Although better results are
expected for nearby road signs, signs can be
detected over the whole predefined distance range. Road
signs can be positioned arbitrarily in the image, and small
rotations are tolerated. Furthermore, it is possible to detect
multiple road signs in the same image with the shapes
circle, rectangle, square, triangle and diamond.
Both depth maps of good quality and sufficient color
segmentation are crucial for successful detection. For this
purpose, appropriate thresholds have to be applied. For the
presented investigations, the interval for each component was
chosen to be quite large. However, this was only possible since
the search space could be reduced significantly using the depth
information and false positives could be rejected using certain
built-in constraints.
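How a wide color interval can be combined with a depth constraint is sketched below. The sketch assumes an HSV color model and per-pixel depth values in metres. Only the 4 to 14 m distance range comes from the text; the hue, saturation and value limits, the function names and the sample pixels are hypothetical.

```python
import colorsys

# Hypothetical, deliberately wide HSV interval for "red" (not values from the paper).
RED_HUE_MAX_DIST = 0.08      # circular hue distance from 0, hue in [0, 1)
MIN_SATURATION = 0.35
MIN_VALUE = 0.20
DEPTH_RANGE_M = (4.0, 14.0)  # distance range reported in the text

def is_red(rgb):
    """Check whether an 8-bit RGB pixel falls into the wide red interval."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_dist = min(h, 1.0 - h)
    return hue_dist <= RED_HUE_MAX_DIST and s >= MIN_SATURATION and v >= MIN_VALUE

def in_depth_range(depth_m):
    """Reject pixels without a depth value or outside the distance range."""
    return depth_m is not None and DEPTH_RANGE_M[0] <= depth_m <= DEPTH_RANGE_M[1]

def candidate_pixels(pixels):
    """pixels: list of (rgb, depth_m); returns indices passing both tests."""
    return [i for i, (rgb, d) in enumerate(pixels) if is_red(rgb) and in_depth_range(d)]

pixels = [
    ((200, 30, 30), 8.0),    # saturated red at 8 m: kept
    ((200, 30, 30), 35.0),   # red object far away: rejected by depth
    ((190, 160, 150), 8.0),  # faded, low-saturation red: rejected by color
    ((200, 30, 30), None),   # red pixel without depth value: rejected
    ((30, 60, 200), 6.0),    # blue pixel: not red
]

color_only = [i for i, (rgb, _) in enumerate(pixels) if is_red(rgb)]
print("color only:", color_only, "color and depth:", candidate_pixels(pixels))
```

In this example the color test alone accepts three pixels, while the additional depth test leaves a single candidate. The third pixel also shows why yellowed or shadowed signs with low saturation are missed, and the fourth why incomplete depth maps reduce the detection rate.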
Since a detection and classification quality of 100% is unlikely,
the automatically mapped road signs can be overlaid in
a georeferenced 3D video. The 3D videos can be viewed with a
stereovision client (e.g. Burkhard et al. 2011), the results
visually verified and the missing road signs quickly digitized. In
the future, a first implementation of the algorithms for white
and gray road signs, which uses the depth information in
combination with the Hough transform (Cavegn & Nebiker
2012), will be improved further. The detection of other complex
road signs and the identification of text (Wu et al. 2005) are
also planned. An increase in geometric accuracy and
reliability could be achieved by matching in stereo image
sequences (Huber et al. 2011). Tracking of road signs over
multiple stereo image pairs would in particular improve
the semantic quality.
The goal of related work in progress is to determine the impact
of different camera resolutions on the detection and
classification quality. First investigations with a stereo system
composed of industrial cameras with a higher resolution of
eleven megapixels show a slight improvement of the results. For
the identification of text, the higher geometric resolution is
mandatory. Current investigations also show that the depth map
quality can be increased significantly using both image sensors
with a higher resolution and adequate radiometric adjustments,
which in turn benefits the automated road sign mapping.