CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation 
evaluates a threshold T(x) for a given pixel x, according
to its neighborhood, by:
T(x) = m(x) + k s(x)   (6)
with m and s the mean and the standard deviation
computed on the neighborhood and k ∈ ℝ a parameter.
• Sauvola binarization criterion (Sauvola et al., 1997),
which evaluates a threshold T(x) by:
T(x) = m(x) (1 + k (s(x)/R - 1))   (7)
with R the dynamic range of the standard deviation s(x).
• the segmentation proposed by Retornaz (Retornaz and
Marcotegui, 2007), based on the ultimate opening. This
operator, introduced by Beucher (Beucher, 2007), is
a non-parametric morphological operator that highlights
the most contrasted areas in an image.
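The two local thresholds above can be sketched as follows. This is a minimal illustration written from the definitions, not the implementation evaluated here: the function names, the border clipping, the use of the population standard deviation, the dark-text polarity and the default values k = -0.2 (Niblack) and k = 0.5, R = 128 (Sauvola) are all assumptions, the latter being values commonly quoted for these criteria rather than values taken from this text.

```python
from math import sqrt


def local_stats(image, row, col, radius):
    """Return (mean, standard deviation) of the square window centred on (row, col).

    The window is clipped at the image borders and the population standard
    deviation is used; both choices are assumptions of this sketch.
    """
    rows, cols = len(image), len(image[0])
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, sqrt(variance)


def niblack_threshold(mean, std, k=-0.2):
    """Niblack criterion: T(x) = m(x) + k s(x). The default k is an assumed value."""
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Sauvola criterion: T(x) = m(x) (1 + k (s(x)/R - 1)). Defaults are assumed values."""
    return mean * (1.0 + k * (std / R - 1.0))


def binarize(image, radius, threshold_fn):
    """Mark a pixel 1 when it is darker than its local threshold (dark-text assumption)."""
    return [
        [
            1 if image[r][c] < threshold_fn(*local_stats(image, r, c, radius)) else 0
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


# A single dark pixel on a bright background, with a 3x3 mask (radius 1).
sample = [[200, 200, 200], [200, 10, 200], [200, 200, 200]]
mask = binarize(sample, 1, niblack_threshold)
```

On a perfectly flat neighborhood (s = 0), the Niblack threshold equals the local mean, whereas the Sauvola threshold falls to m(1 - k). In this sketch a 3x3 mask corresponds to radius 1.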
The evaluation image database contains 501 characters. The
results of each method are given in the following table:

Method              % of properly segmented characters
Niblack             73.85
Sauvola             71.26
TMMS                74.85
Ultimate Opening    48.10
Our method gives the best results, closely followed by the
Niblack criterion. Thresholding with the Sauvola criterion
is less effective on average: it frequently fails on text
that is correctly handled by the Niblack criterion or by our
method, although in some situations it gives the best
segmentation quality. The overall low scores are explained
by the difficulty of the environment. The ultimate opening
gives surprisingly poor results. This may be because the
images are taken by sensors mounted on a moving car: they
may exhibit motion blur, which makes the ultimate opening
fail. We therefore exclude it from the rest of the comparison.
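As a rough check on the segmentation scores, the reported percentages can be converted back to character counts, given the 501 characters of the evaluation database. The counts below are inferred by rounding and are not reported in the text; the names used are illustrative.

```python
TOTAL_CHARACTERS = 501  # size of the evaluation database, as stated in the text

# Percentages of properly segmented characters, as reported in the table.
reported_percent = {
    "Niblack": 73.85,
    "Sauvola": 71.26,
    "TMMS": 74.85,
    "Ultimate Opening": 48.10,
}


def implied_count(percent, total):
    """Nearest whole number of characters corresponding to a percentage."""
    return round(percent * total / 100)


implied_counts = {
    name: implied_count(percent, TOTAL_CHARACTERS)
    for name, percent in reported_percent.items()
}

for name, count in implied_counts.items():
    print(f"{name}: about {count} of {TOTAL_CHARACTERS} characters")
```

Under this reading, the gap between TMMS and the Niblack criterion corresponds to about five characters out of 501.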
The other aspect of our comparison is speed. We run all
methods on the set of images and compute mean execution
times. Times are given in seconds for a 1920x1080 image,
according to the mask size of each method:

Mask size    3x3     5x5     7x7     9x9     11x11
Niblack      0.16    0.22    0.33    0.47    0.64
Sauvola      0.16    0.23    0.33    0.47    0.64
TMMS         0.11    0.18    0.27    0.44    0.55

All methods are implemented directly from their definitions,
without any optimization. Our method always has the lowest
execution time (note that Shafait et al. (Shafait et al.,
2008) have recently proposed a faster way to compute the
Sauvola criterion).
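Timings that grow with the mask size are typical of implementations that visit every pixel of the window. As an illustration of how this dependence can be removed, the sketch below computes the local mean and standard deviation from summed-area tables (integral images), so that each window costs a fixed number of lookups whatever its size. It is a generic sketch under our own assumptions (function names, border clipping, population standard deviation) and does not reproduce the code of Shafait et al. or of any method timed above.

```python
from math import sqrt


def integral_image(image, power=1):
    """Summed-area table of image**power, padded with a leading row and column of zeros."""
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c] ** power
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def window_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1, using four lookups."""
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


def fast_local_stats(sum_table, square_table, row, col, radius):
    """Mean and population standard deviation of the clipped window around (row, col)."""
    rows, cols = len(sum_table) - 1, len(sum_table[0]) - 1
    top, bottom = max(0, row - radius), min(rows, row + radius + 1)
    left, right = max(0, col - radius), min(cols, col + radius + 1)
    count = (bottom - top) * (right - left)
    mean = window_sum(sum_table, top, left, bottom, right) / count
    mean_of_squares = window_sum(square_table, top, left, bottom, right) / count
    variance = max(0.0, mean_of_squares - mean * mean)  # guard against rounding below zero
    return mean, sqrt(variance)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sum_table = integral_image(image)
square_table = integral_image(image, power=2)
centre_mean, centre_std = fast_local_stats(sum_table, square_table, 1, 1, 1)
```

The two tables are built once per image; the resulting mean and standard deviation can then feed either the Niblack or the Sauvola criterion.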
Figure 8: Examples of text and non-text samples in the
learning database.

The speed of the algorithm is important, but its output is
also a major aspect, as the execution time of a complete
scheme usually depends on the number of regions provided
by the segmentation steps. On our database, on average,
binarization with the Niblack criterion generates 65177
regions, binarization with the Sauvola criterion generates
43075 regions, and our method generates 28992 regions.
Reducing the number of regions in the output may save time
when these regions are processed. The possibility, in our
method, to set the lowest allowed contrast prevents
over-segmented regions. Moreover, many of these regions,
identified as homogeneous, can be merged with neighbouring
regions (end of section 3). This simple process may further
decrease the number of regions. A low number of regions
may also increase the localisation precision, as it can
decrease false positives. This is a further indication that
the segmentation provided by our method is more relevant.
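To put the region counts in proportion, the short computation below expresses the reduction obtained by our method relative to each thresholding criterion. Only the three mean region counts come from the text; the function name is illustrative.

```python
# Mean number of regions per image, as reported in the text.
mean_regions = {"Niblack": 65177, "Sauvola": 43075, "TMMS": 28992}


def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline


reduction_vs_niblack = relative_reduction(mean_regions["Niblack"], mean_regions["TMMS"])
reduction_vs_sauvola = relative_reduction(mean_regions["Sauvola"], mean_regions["TMMS"])

print(f"vs Niblack: {reduction_vs_niblack:.1f}% fewer regions")
print(f"vs Sauvola: {reduction_vs_sauvola:.1f}% fewer regions")
```

This corresponds to roughly 55% fewer regions than the Niblack criterion and roughly 33% fewer than the Sauvola criterion.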
Letter Classification To perform training and testing we
have constituted (Fig. 8):
• a training database composed of 32400 examples, with
16200 characters from various sources (letters at different
scales/points of view...) and 16200 other regions
extracted from various urban images, and
• a testing database with 3600 examples.
Note that all training is performed with tools provided
by (Joachims, n.d.).
Different classifier configurations have been tested to
get the highest classification accuracy. With the
configuration we have chosen (Figure 7), the SVM classifier
trained with pseudo-Zernike moments gives 75.89% accuracy,
the SVM classifier trained with our polar descriptors gives
81.50% accuracy, and the SVM classifier trained with
Fourier descriptors gives 83.14% accuracy. This indicates
that our descriptor is well defined, as its accuracy is at
the same level as that of Fourier descriptors and
pseudo-Zernike moments.
To make the final decision we choose a late fusion
architecture. Different schemes have been tested, from a
simple vote of the three previous classifiers to the use of
another classifier. The best result is reached with an SVM
classifier, which gets 87.83% accuracy with the following
confusion matrix:

%             Letter    Background
Letter        91.56      8.44
Background    15.89     84.11

This unbalanced result suits us, as what matters most to us
is not to lose a character.
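The overall accuracy can be reconciled with the diagonal of the confusion matrix under two assumptions that the text does not state: that the diagonal entries are per-class rates of correct classification, and that the 3600-example testing database is balanced (1800 letters and 1800 background regions). The sketch below is only a consistency check under these assumptions.

```python
# Diagonal of the confusion matrix (% of correctly classified examples per class).
correct_rate_percent = {"Letter": 91.56, "Background": 84.11}

# Assumption: the 3600 test examples are split evenly between the two classes.
ASSUMED_EXAMPLES_PER_CLASS = 1800


def mean_diagonal(rates):
    """Overall accuracy (%) when every class has the same number of examples."""
    return sum(rates.values()) / len(rates)


overall_percent = mean_diagonal(correct_rate_percent)

# Whole-example counts implied by the rates under the balanced assumption.
implied_correct = {
    label: round(rate * ASSUMED_EXAMPLES_PER_CLASS / 100)
    for label, rate in correct_rate_percent.items()
}
accuracy_from_counts = 100 * sum(implied_correct.values()) / (2 * ASSUMED_EXAMPLES_PER_CLASS)

print(f"mean of diagonal: {overall_percent:.3f}%")
print(f"from implied counts: {accuracy_from_counts:.2f}%")
```

Under these assumptions both figures agree with the reported 87.83% to within rounding, which suggests that the balanced reading is plausible.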