In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009 
4 FILTERING
Once the image is segmented, the system must be able to
select which regions contain text (letters) and which do
not. Some of these regions are obviously not text (regions
that are too big, too small, too wide...). The aim of this
step is to dismiss most of these obviously non-text regions
without losing any good character. A small collection of fast
filters (criteria openings) eliminates some regions using
simple geometric criteria (based on area, width and height).
These simple filters save time because they rapidly
eliminate many regions, simplifying the rest of the process
(which is a bit slower).
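
As an illustration of this kind of geometric pre-filter, here is a minimal sketch in Python. It applies plain attribute tests to regions that have already been measured, not the morphological criteria openings themselves. The `Region` type, the function names and all threshold values are hypothetical, since the text does not give the values actually used.

```python
from dataclasses import dataclass


@dataclass
class Region:
    area: int    # number of pixels in the region
    width: int   # bounding-box width in pixels
    height: int  # bounding-box height in pixels


# Hypothetical thresholds, chosen only for illustration.
MIN_AREA, MAX_AREA = 20, 5000
MIN_HEIGHT, MAX_HEIGHT = 8, 200
MAX_WIDTH = 300


def passes_geometric_filters(region):
    """Return True if the region is not obviously non-text."""
    return (
        MIN_AREA <= region.area <= MAX_AREA
        and MIN_HEIGHT <= region.height <= MAX_HEIGHT
        and region.width <= MAX_WIDTH
    )


def filter_regions(regions):
    """Keep only the regions that pass every geometric test."""
    return [r for r in regions if passes_geometric_filters(r)]
```

Each test is a constant-time comparison per region, which is why such filters are cheap to run before the slower classification stage.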
5 PATTERN CLASSIFICATION
Some segmented regions are dismissed by the previous filters,
but many false positives remain. To go further, we use
classifiers with suitable descriptors.
Due to the variability of the analysed regions, descriptors
must (at least) be invariant to rotation and scale. The size
and variability of the examples in the training database
provide invariance to perspective deformations. We tested
many different shape descriptors (such as Hu moments, Fourier
moments...). Among them, we selected two families of
moments: Fourier moments and pseudo-Zernike moments. We
selected them empirically because, in our tests, they
achieved a better discrimination ratio than the others. We
also chose to work with a third family of descriptors: the
polar representation is known to be efficient (Szumilas,
2008), but the way this representation is used there does
not match our needs. We therefore define our own polar
descriptors: the analysed region is expressed in a polar
coordinate space centred on its centre of gravity
(Figure 6). The feature is then mapped into a normalized
rectangle (the representation is then invariant to scale).
To obtain rotation invariance, many people compute a
horizontal histogram within this rectangle, but this loses
too much information. Another way to obtain rotation
invariance when the representation itself is not rotation
invariant is to redefine the distance computed between
samples (Szumilas, 2008), but this leads to a higher
complexity. To be rotation invariant, we simply take the
magnitude of the Fourier transform spectrum of each line of
the normalized rectangle. The result carries much more
information than simple histograms and is simpler than
changing the distance used.
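
The polar descriptor described above can be sketched as follows. This is a simplified illustration, not the authors' implementation. It assumes the region is given as a list of pixel coordinates, that each "line" of the normalized rectangle runs along the angle axis at a fixed radius, and that the rectangle is a coarse grid whose size (`n_radius`, `n_angle`) is a hypothetical choice. Under these assumptions, a rotation of the region becomes a circular shift along each line, which leaves the magnitude of the discrete Fourier transform unchanged.

```python
import cmath
import math


def polar_map(pixels, n_radius=8, n_angle=16):
    """Map region pixels (x, y) into a normalized n_radius x n_angle grid."""
    # The centre of gravity is the origin of the polar coordinate space.
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    polar = [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
             for x, y in pixels]
    # Dividing by the largest radius makes the grid independent of scale.
    r_max = max(r for r, _ in polar) or 1.0
    grid = [[0.0] * n_angle for _ in range(n_radius)]
    for r, theta in polar:
        i = min(int(r / r_max * n_radius), n_radius - 1)
        j = int((theta + math.pi) / (2 * math.pi) * n_angle) % n_angle
        grid[i][j] += 1.0
    return grid


def dft_magnitude(row):
    """Magnitude of the discrete Fourier transform of one line."""
    n = len(row)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(row)))
        for k in range(n)
    ]


def polar_descriptor(pixels, n_radius=8, n_angle=16):
    """Concatenate the spectrum magnitudes of every line of the grid."""
    grid = polar_map(pixels, n_radius, n_angle)
    return [m for row in grid for m in dft_magnitude(row)]
```

Unlike a histogram, the magnitudes keep the frequency content of each line, so only the phase (the absolute orientation) is discarded.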
Once the descriptors are chosen, we train an SVM classifier
(Cortes and Vapnik, 1995) for each family of descriptors. To
reach a final decision, the outputs of these SVM classifiers
are processed by a further SVM classifier (Figure 7). We
tried adding more classifiers in the first stage of the
configuration (with other kinds of descriptors), but this
systematically decreased the overall accuracy.
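
The two-stage decision can be written compactly as follows. This is only a notational sketch: the symbols $d_k$, $f_k$, $s_k$ and $g$ are introduced here for illustration, and it assumes that each first-stage SVM passes a real-valued score (rather than a hard label) to the final classifier, which the text does not specify.

```latex
\begin{align*}
s_k  &= f_k\bigl(d_k(R)\bigr),
       \qquad k \in \{\mathrm{pzm},\ \mathrm{fourier},\ \mathrm{polar}\},\\
y(R) &= \operatorname{sign}\, g\bigl(s_{\mathrm{pzm}},\, s_{\mathrm{fourier}},\, s_{\mathrm{polar}}\bigr)
\end{align*}
```

Here $R$ is a candidate region, $d_k(R)$ its descriptor from family $k$, $f_k$ the SVM trained on that family, and $g$ the final SVM. A value $y(R) = +1$ is read as "character".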
6 GROUPING
Figure 6: The region is expressed in a polar coordinate
space (one axis of the normalized rectangle is the angle);
to obtain a rotation-invariant descriptor, we take the
spectrum of the Fourier transform of every line.
Figure 7: Our classifier is composed of three SVM
classifiers, each using one family of descriptors (pzm,
Fourier, polar), and an SVM that takes the final decision.
We are able to analyse the main regions in the image and
extract characters. Once these characters are selected, they
are grouped together with their neighbours to recover text
regions. The conditions for linking two characters to each
other are those given in (Retornaz and Marcotegui, 2007).
They are based on the distance between the two regions
relative to their height. This step will soon be improved
to handle text in every direction, as the current approach
is restricted to nearly horizontal text. During this
process, isolated text regions (a single character or a
couple of letters) are dismissed. This aggregation is
mandatory to generate words and sentences to use as input
to an O.C.R., but it also suppresses many false positive
detections.
7 LETTER DETECTION EXPERIMENTS
In this section, we evaluate the segmentation and
classification steps.
Segmentation. The evaluation of segmentation is always
difficult as it is, in part, subjective. Most of the time,
it is impossible to have a ground truth that can be used
with a representative measure. To evaluate segmentation as
objectively as possible for our application, we built a
test image database by randomly taking a subset of the image
database provided by I.G.N. (Institut Géographique
National, n.d.) to the project (iTowns ANR project, 2008).
We segment all images from this database and count the
properly segmented characters. We define as clearly as
possible what properly segmented means: the character must
be readable, and it must not be split or linked with other
features around it. The thickness may vary a little provided
that the shape remains correct. We compare the result with
3 other segmentation methods:
	        