Full text: Proceedings, XXth congress (Part 3)

  
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B3. Istanbul 2004 
  
A number of pattern recognition applications shows some 
distortion (due to noise). Also, patterns of a certain class are 
more frequent than others inside the same class. Quite the same 
as for natural languages, ambiguity should be kept into account: 
a sentence may actually bear different meanings, due to possible 
differences at recognition or interpretative level. 
2. PARSING STRATEGIES FOR PATTERN 
RECOGNITION 
2.1 The Theory of Formal Languages 
A statistic approach to language started as the scholars realized 
the value of statistical methods in the recognition of speech. 
Hidden Markov's model was first used by the Author for 
modelling the language of *Evgenij Onegin" (Markov, 1913). 
In the theory of formal languages, a language is defined as a set 
of strings: a string is a finite sequence of symbols chosen from 
some finite vocabulary. In natural languages, a string is a 
sentence, and the sentences are sequences of words chosen from 
some vocabulary of possible words. A grammar is defined as a 
four-tuple: 
G-(N,T, P, S) 
where N and T are the non terminal and terminal vocabularies 
of G, P is the set of production rules, and S is the start symbol. 
A formal language is indeed defined by: 
= A terminal vocabulary of symbols (the words of the 
natural language) 
* A non terminal vocabulary of symbols (the syntactic 
categories, e.g. noun, verb) 
= A set of productions (the phrase structure rules of the 
language) 
=» The so called start symbol 
We start from a special non terminal S, and S is replaced by the 
string on the right side of a chosen production rule. The process 
of rewriting a substring according to one of the rewriting 
production rules continues until the string consists only of 
terminal symbols: 
S —aS|bS|e 
where the symbol | indicates “or” and € is the null string. The 
succession of strings that result from the process is a derivation 
from the grammar: to find a derivation (a parse) for a given 
sentence (sequence) is called parsing. 
As to the latter approach, let's remind that in the Theory of 
Formal languages, Chomsky divides languages in classes, thus 
forming a hierarchy of languages, based upon different 
grammars: 
=»  Unrestricted grammars (0-type grammars) 
"  Context-sensitive grammars (1-type grammars) 
=  Context-free grammars (2-type grammars) 
» Regular grammars or finite state grammars (3-type 
grammars) 
The most general grammar is obviously the 0-type, which bears 
no limits for rewriting rules: for the other types, such 
restrictions are regularly increasing. Types 0 and 1 are able to 
describe natural languages as the other two, much simpler to 
manage from a computational viewpoint, are more suited into 
limited backgrounds and have been hitherto used for artificial 
languages. 
684 
In the 1-type grammar, rewriting rules restraints bear that the 
right side of a rule should have at least as many symbols as the 
left side. 
For the 2-type grammar, all rules should have just one non- 
terminal symbol on the left side 
For 3-type grammar, the right side has only one terminal 
symbol, or one terminal symbol and a non-terminal one for 
every production rule. 
An attraction of the use of a syntactic method for Pattern 
Recognition is the possibility of description and recognition of 
an object, a scene, a texture: but noise may complicate the 
process of computing the string structure: extraneous primitives 
may be generated by spurious edges, or actual primitives of 
some shape may be undetected due to the poor quality of the 
image. 
Another problem is the ambiguity of the primitives that are 
extracted to represent the patterns, how this ambiguity can be 
accounted for in the model or in the analysis (in the 
construction of the grammar or in the parsing). 
A grammar normally divides strings in just two classes: the 
grammatically correct ones, and the others. In any case, 
ambiguity is very frequent, and it is even deliberately pursued 
as sometimes happens in poetry. 
The following different approaches have been proposed: 
" approximation 
= transformational grammars 
" stochastic grammars 
= similarity and error-correcting parsing 
The use of approximation reduces the effect of noise and 
distortion at the preprocessing and primitive extraction stage. 
The second approach defines the relations between the noisy 
pattern and the corresponding noise-free pattern by a 
transformational grammar. 
Many efforts have been devoted to construct more sophisticated 
grammars, like stochastic and fuzzy grammars. Although 
context free grammars or transformational grammars can 
represent the phrase structure of a language, they tell nothing 
about the relative frequency or likelihood of a given sentence. It 
is usual in context free grammar to use recursive productions to 
represent repetition, however one can generate sentences which 
are technically grammatical, but not always acceptable. 
Stochastic approach supplies a solution to this problem: each 
production in a stochastic grammar is assigned a probability of 
selection, a number between zero and one. During the 
derivation process, rules are selected for rewriting according to 
their assigned probabilities. Each string has a probability of 
occurrence computed as the product of the probabilities of the 
rules in its derivation. 
When a string has two or more parses, we can use the more 
probable parse as a description of the string (pattern): the most 
probable parse is the one according to which the generated 
string has the highest probability. 
However, what we already know about probable occurrence 
plays a meaningful role. The parsers are made up according to a 
likelihood criterion. However, parsers may also be built 
according to a further criterion, i.e. the Bayes’s theorem. 
In this case, some utter a priori information is required about 
the starting probability to deal with one class of patterns or 
another one. 
When a distorted pattern cannot be accepted by any grammar, 
an error-correcting parsing, based upon a similarity criterion, 
can be used: a way to handle noisy patterns is the use of 
similarity measures instead of stochastic grammars: a similarity 
or distance measure is defined between a sentence representing 
a known pattern and sentence representing an unknown pattern. 
Interna 
The dis 
of error 
2.2 Pa 
In acco 
number 
Parsing 
starts fi 
termina 
starts fr 
Besides 
related : 
0-type | 
1-type | 
2-type | 
3-type 1 
Chart al 
althougi 
In the 
gramma 
algorith: 
between 
Models 
Kasami’ 
are trans 
To get ( 
all non 
terminal 
A > € 
A-—E 
Be w- a 
to be tes 
The alg 
element: 
a value | 
inside t; 
A — a, 
The tabl 
The strir 
be found 
Stochast 
Models. 
A Hidde 
series oi 
inference 
A Hidde 
paramete 
One can «
	        
Waiting...

Note to user

Dear user,

In response to current developments in the web technology used by the Goobi viewer, the software no longer supports your browser.

Please use one of the following browsers to display this page correctly.

Thank you.