In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009
Figure 9: The system correctly localizes text in the image
(even rotated text), but it detects aligned windows as
text.
Figure 10: Text is correctly localized, but the classification
step fails on the end of the word courcmt in red, and a zebra
crossing sign is seen as text.
We also tested different combinations of classifiers and
descriptors. With an early fusion architecture, in which
all descriptors are given to a single SVM classifier, the result
does not even reach 74% accuracy. In contrast, if we
add a collection of simple geometric descriptors (compacity,
surface, concavity...) to the SVM classifier that
takes the final decision in our architecture, the overall
accuracy reaches 88.83%. These measures seem to help the
final classifier select which classifiers are the most reliable
depending on the situation.
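To make the role of the geometric descriptors concrete, here is a minimal sketch of how such measures could be computed for one segmented region and appended to the inputs of the final classifier. The formulas are common textbook choices and are assumptions (the text above names the descriptors but does not define them): surface is the pixel count, compacity is 4*pi*area / perimeter^2, and concavity is the share of the convex hull not covered by the region. The names geometric_descriptors and final_stage_input, and the classifier scores passed in, are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            # Drop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula."""
    total = 0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def geometric_descriptors(mask):
    """Return (surface, compacity, concavity) for a binary mask (rows of 0/1).

    Assumed definitions, not taken from the paper.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    surface = len(pixels)
    if surface == 0:
        return 0, 0.0, 0.0
    filled = set(pixels)
    # Perimeter: pixel edges that face the background (4-connectivity).
    perimeter = sum(
        (r + dr, c + dc) not in filled
        for r, c in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    compacity = 4 * math.pi * surface / perimeter ** 2
    # Hull of the pixel corners, so the hull area is never below the surface.
    corners = [(r + dr, c + dc) for r, c in pixels for dr in (0, 1) for dc in (0, 1)]
    concavity = 1 - surface / polygon_area(convex_hull(corners))
    return surface, compacity, concavity


def final_stage_input(classifier_scores, mask):
    """Concatenate per-descriptor classifier scores with geometric descriptors."""
    return list(classifier_scores) + list(geometric_descriptors(mask))
```

In this sketch the final classifier would receive the output of final_stage_input for each region, so that it sees both the decisions of the first-stage classifiers and a few cheap shape cues about the region they were computed on.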
The overall accuracy may seem a bit low, but the variability
of text in our context is so large that the practical
performance of the system remains acceptable.
8 TEXT LOCALIZATION IN CITY SCENES
Let us look at the application of the complete scheme. We start
from an initial image (Figure 12). Applying our segmentation
algorithm gives the result in Figure 13. All regions
of a reasonable size are kept; the others are dismissed
(Figure 14). The classifier selects the text regions among the
remaining regions (Figure 15). Text regions are then grouped to
form words and sentences (Figure 16).
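The size filtering and grouping steps above can be sketched as follows. This is only an illustration under stated assumptions: regions are reduced to axis-aligned bounding boxes, text is taken to be roughly horizontal (the actual system also handles rotated text), and every threshold (min_h, max_h, min_w, gap_ratio, overlap_ratio) is a hypothetical value, not one reported here. The classification step that sits between the two functions is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a segmented region (pixels)."""
    x: int
    y: int
    w: int
    h: int


def keep_reasonable_size(boxes, min_h=8, max_h=200, min_w=2):
    """Dismiss regions that are too small or too large (hypothetical thresholds)."""
    return [b for b in boxes if min_h <= b.h <= max_h and b.w >= min_w]


def are_neighbours(a, b, gap_ratio=1.0, overlap_ratio=0.5):
    """True if two boxes are horizontally close and vertically aligned."""
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)          # negative if they overlap
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)      # shared vertical extent
    ref_h = min(a.h, b.h)
    return gap <= gap_ratio * ref_h and overlap >= overlap_ratio * ref_h


def group_regions(boxes):
    """Group neighbouring boxes with a union-find; each group is sorted left to right."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if are_neighbours(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return [sorted(g, key=lambda b: b.x) for g in groups.values()]
```

A typical call would be group_regions(keep_reasonable_size(boxes)), applied in practice only to the boxes that the classifier has labelled as text. The pairwise comparison is quadratic in the number of regions, which is acceptable for a sketch but would need a spatial index on large images.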
The system is efficient: despite variations in orientation,
font and lighting conditions, it handles the majority
of text (Figures 9, 10 and 11). But it also generates
many false positives, especially aligned windows (Figure 9
top right and Figure 11). Other results can be seen in
Figures 9 and 10. The system must therefore be improved to reduce
false positives.
Figure 11: Various texts are correctly handled, but periodic
features are also interpreted as text.
9 CONCLUSION
We have presented a text localization process designed to
be efficient in the difficult context of the urban environment.
We combine an efficient segmentation
process based on a morphological operator with a configuration
of SVM classifiers using various descriptors to determine
which regions are text. The system is competitive
but generates many false positives. We are currently
working to enhance it (and to reduce false positives)
by improving the last two steps: we keep testing
various configurations of classifiers (and selecting kernels
for the SVM classifiers) to increase classification accuracy,
and we are working in particular on a variable selection
algorithm. We are also working on the grouping of
neighbouring text regions and its correction, so that properly
extracted text is sent to the O.C.R.
ACKNOWLEDGEMENTS
We are grateful for support from the French National
Research Agency (A.N.R.).
REFERENCES
Arth, C., Limberger, F. and Bischof, H., 2007. Real-time license
plate recognition on an embedded DSP-platform. IEEE International
Conference on Computer Vision and Pattern Recognition
(CVPR ’07), pp. 1-8.