CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation
evaluates a threshold T(x) for a given pixel x, according to its neighborhood by:
$$T(x) = m(x) + k\,s(x) \qquad (6)$$
with m and s the mean and the standard deviation computed on the neighborhood and k ∈ ℝ a parameter.
• Sauvola binarization criterion (Sauvola et al., 1997), which evaluates a threshold T(x) by:
$$T(x) = m(x)\left(1 + k\left(\frac{s(x)}{R} - 1\right)\right) \qquad (7)$$
with R the dynamic range of the standard deviation s(x).
• the segmentation proposed by Retornaz (Retornaz and Marcotegui, 2007), based on the ultimate opening. This operator, introduced by Beucher (Beucher, 2007), is a non-parametric morphological operator that highlights the most contrasted areas in an image.
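As an illustration, the two local thresholding criteria can be sketched in a few lines of Python. This is a minimal, unoptimized sketch and not the implementation evaluated here: the function names, the clipping of the window at image borders, the convention that a pixel above its threshold is set to 1, and the default values of k and R (values commonly quoted in the literature, not taken from this paper) are all assumptions.

```python
import math

def local_stats(img, x, y, half):
    """Mean and standard deviation over a (2*half+1)^2 window centred on (x, y).

    The window is clipped at the image border (an assumption; border
    handling is not specified in the text).
    """
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - half), min(h, y + half + 1))
            for i in range(max(0, x - half), min(w, x + half + 1))]
    m = sum(vals) / len(vals)
    var = sum((v - m) ** 2 for v in vals) / len(vals)
    return m, math.sqrt(var)

def niblack_threshold(m, s, k=-0.2):
    # T(x) = m(x) + k s(x); k = -0.2 is an assumed default.
    return m + k * s

def sauvola_threshold(m, s, k=0.5, R=128.0):
    # T(x) = m(x) (1 + k (s(x)/R - 1)); k and R are assumed defaults.
    return m * (1.0 + k * (s / R - 1.0))

def binarize(img, criterion, half=1, **params):
    """Apply a local threshold criterion to every pixel of a grey-level image."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            m, s = local_stats(img, x, y, half)
            # Assumed convention: pixels brighter than the threshold map to 1.
            out[y][x] = 1 if img[y][x] > criterion(m, s, **params) else 0
    return out
```

For example, `binarize(img, sauvola_threshold, half=2, k=0.5, R=128.0)` would apply the Sauvola criterion with a 5x5 mask to `img`, a list of rows of grey levels.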
The evaluation image database contains 501 characters. The results of each method are given in the following table:

Method             % of properly segmented characters
Niblack            73.85
Sauvola            71.26
TMMS               74.85
Ultimate Opening   48.10
Our method gives the best results. Thresholding with the Sauvola criterion is less efficient on average (71.26% against 74.85%): it frequently fails on text that is correctly handled by the Niblack criterion or by our method, although in some situations it gives the best segmentation quality. The overall poor results are explained by the high difficulty of the environment. The ultimate opening gives surprisingly bad results. This may come from the fact that the images are taken by sensors mounted on a moving car: they may contain motion blur, which makes the ultimate opening fail. We therefore exclude it from the rest of the comparison.
The other aspect of our comparison is speed. We evaluate
all methods on the set of images and compute mean times.
Times are given in seconds for 1920x1080 image size and
according to the size of the mask of every method:
Mask size   3x3    5x5    7x7    9x9    11x11
Niblack     0.16   0.22   0.33   0.47   0.64
Sauvola     0.16   0.23   0.33   0.47   0.64
TMMS        0.11   0.18   0.27   0.44   0.55
All implementations follow the definitions directly, without any optimization. Our method always has the best execution times (note that Shafait et al. (Shafait et al., 2008) have recently proposed a faster way to compute the Sauvola criterion).
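To our understanding, the speed-up of Shafait et al. relies on integral images (summed-area tables), which make the cost of the local mean and standard deviation independent of the mask size. The following Python sketch illustrates the idea only; it is not their code, and the function names and the clipping of the window at image borders are assumptions.

```python
import math

def integral_images(img):
    """Summed-area tables of img and img**2, padded with a zero row and column."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    S2 = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row, row2 = 0, 0  # running sums along the current row
        for x in range(w):
            v = img[y][x]
            row += v
            row2 += v * v
            S[y + 1][x + 1] = S[y][x + 1] + row
            S2[y + 1][x + 1] = S2[y][x + 1] + row2
    return S, S2

def window_sum(T, x0, y0, x1, y1):
    """Sum over the half-open rectangle [x0, x1) x [y0, y1) in four lookups."""
    return T[y1][x1] - T[y0][x1] - T[y1][x0] + T[y0][x0]

def local_mean_std(S, S2, x, y, half):
    """Local mean and standard deviation around (x, y), in constant time."""
    h, w = len(S) - 1, len(S[0]) - 1
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    n = (x1 - x0) * (y1 - y0)
    m = window_sum(S, x0, y0, x1, y1) / n
    # Var = E[v^2] - E[v]^2, clamped at 0 to absorb rounding errors.
    var = max(0.0, window_sum(S2, x0, y0, x1, y1) / n - m * m)
    return m, math.sqrt(var)
```

The two tables are built once per image; each pixel then needs eight table lookups whatever the mask size, whereas a direct implementation visits every pixel of the mask.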
The speed of the algorithm is important, but its output is also a major aspect, as the execution time of a complete scheme usually depends on the number of regions provided by the segmentation steps. On our database, on average, binarization with the Niblack criterion generates 65177 regions, binarization with the Sauvola criterion generates 43075 regions, and our method generates 28992 regions. Reducing the number of regions in the output may save time when these regions are processed. The possibility, in our method, of setting the lowest allowed contrast prevents over-segmented regions. Moreover, many of these regions, marked as homogeneous, can be merged with neighbouring regions (end of section 3). This simple process may further decrease the number of regions. A low number of regions may also increase the localisation precision, as it can decrease false positives. This is further evidence that the segmentation provided by our method is more relevant.

Figure 8: Examples of text and non-text samples in the learning database.
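The relative reduction in the number of regions can be checked with a short computation. The sketch below only reuses the three average region counts quoted above; the dictionary keys and the helper name are illustrative.

```python
# Average number of regions per method, as quoted in the text.
regions = {"Niblack": 65177, "Sauvola": 43075, "our method": 28992}

def reduction_percent(baseline, ours):
    """Relative decrease, in percent, from `baseline` regions to `ours`."""
    return 100.0 * (baseline - ours) / baseline

for name in ("Niblack", "Sauvola"):
    gain = reduction_percent(regions[name], regions["our method"])
    print(f"{name}: {gain:.1f}% fewer regions with our method")
```

This gives about 55.5% fewer regions than the Niblack criterion and about 32.7% fewer than the Sauvola criterion.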
Letter Classification. To perform training and testing we have constituted (Fig. 8):
• a training database composed of 32400 examples, with 16200 characters from various sources (letters at different scales/points of view...) and 16200 other regions extracted from various urban images, and
• a testing base with 3600 examples.
Note that all training is performed with the tools provided by (Joachims, n.d.).
Different configurations of classifiers have been tested to get the highest classification accuracy. With the configuration we have chosen (Figure 7), the SVM classifier trained with pseudo-Zernike moments reaches 75.89% accuracy, the SVM classifier trained with our polar descriptors reaches 81.50%, and the SVM classifier trained with Fourier descriptors reaches 83.14%. This indicates that our descriptor is well defined, as its accuracy is comparable to that of Fourier descriptors and pseudo-Zernike moments.
To make the final decision we choose a late fusion architecture. Different tests are performed, from a simple vote of the three previous classifiers to the use of another classifier. The best result is reached with an SVM classifier, which achieves 87.83% accuracy with the following confusion matrix:
%            Letter   Background
Letter       91.56    8.44
Background   15.89    84.11
This unbalanced result is of interest to us, as our priority is not to lose any character.
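The confusion matrix can be related to the reported accuracy with a short check. The sketch below assumes that each row of the matrix sums to 100%, with the pairs (91.56, 8.44) and (15.89, 84.11), and that the testing base contains as many letters as background regions; the second point is an assumption, since the text only states the balance of the training database.

```python
# Row-normalised confusion matrix in percent (row/column reading is assumed).
confusion = {
    "Letter":     {"Letter": 91.56, "Background": 8.44},
    "Background": {"Letter": 15.89, "Background": 84.11},
}

def balanced_accuracy(cm):
    """Mean of the diagonal rates; equals the overall accuracy only when
    every class has the same number of test examples (assumed here)."""
    return sum(cm[c][c] for c in cm) / len(cm)

print(f"{balanced_accuracy(confusion):.3f}")
```

Under these assumptions the mean of the two diagonal rates is (91.56 + 84.11) / 2 = 87.835, in line with the reported 87.83% accuracy.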