2) If the pixel has no classified neighbors, continue with
step 1 for the next pixel.
3) If the pixel already has classified neighbors, the most
frequent class among these neighbors is assigned to it.
4) If several histogram entries have the same frequency,
the class of the neighbor located in the north-west
direction is assigned.
Steps 1 to 4 are repeated until all image pixels are
classified.
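The neighbor voting described in steps 2 to 4 can be sketched as follows. This is a minimal illustration, not the original implementation: the marker UNCLASSIFIED, the function name classify_remaining, the use of the 8-neighborhood and the fallback used when the north-west neighbor cannot break a tie are all assumptions made for this example.

```python
from collections import Counter

UNCLASSIFIED = -1  # hypothetical marker for pixels without a class

# 8-neighborhood offsets as (row, col); the north-west neighbor is (-1, -1).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def classify_remaining(image):
    """Assign classes to UNCLASSIFIED pixels from their neighbors, in place."""
    rows, cols = len(image), len(image[0])
    changed = True
    while changed:  # repeat the passes until nothing changes any more
        changed = False
        for r in range(rows):
            for c in range(cols):
                if image[r][c] != UNCLASSIFIED:
                    continue
                # Histogram of the classes of already classified neighbors.
                hist = Counter()
                for dr, dc in NEIGHBORS:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        if image[rr][cc] != UNCLASSIFIED:
                            hist[image[rr][cc]] += 1
                if not hist:
                    continue  # step 2: no classified neighbors yet
                top = max(hist.values())
                tied = sorted(k for k, v in hist.items() if v == top)
                if len(tied) == 1:
                    image[r][c] = tied[0]  # step 3: unique majority class
                else:
                    # Step 4: the north-west neighbor breaks the tie.
                    # Assumption: if it is missing or not among the tied
                    # classes, the smallest tied label is used instead.
                    in_image = r > 0 and c > 0
                    nw = image[r - 1][c - 1] if in_image else UNCLASSIFIED
                    image[r][c] = nw if nw in tied else tied[0]
                changed = True
    return image
```

Because the image is updated in place, a pixel classified early in a pass already counts as a classified neighbor for the pixels visited after it.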
From the classified image mp-1 binary color layers may be
separated. Some of these layers still include textured re-
gions or they have some defects caused by overprinting
with other layers. If for example a tree symbol (black) is
printed over a wood region (light green), the assignment
of the symbol to the black layer will result in an equally
shaped defect in the light green layer. These defects may
be corrected using region growing techniques with a defined
set of rules, for example:
Set a 0-pixel in the light green layer to 1 if it belongs
to a closed 0-region and there is a 1-pixel either in
the black or brown layer.
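A rule of this kind could be implemented roughly as shown below. The sketch assumes that a "closed 0-region" is a 4-connected region of 0-pixels that does not touch the image border, and that the 1-pixel in the black or brown layer is looked up at the same position; both readings, as well as the function name fill_overprint_defects, are assumptions for illustration.

```python
from collections import deque


def fill_overprint_defects(green, black, brown):
    """Return a copy of the green layer with overprinting defects filled."""
    rows, cols = len(green), len(green[0])
    # Mark all 0-pixels connected to the image border as "open".
    open_zero = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and green[r][c] == 0:
                open_zero[r][c] = True
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                if green[rr][cc] == 0 and not open_zero[rr][cc]:
                    open_zero[rr][cc] = True
                    queue.append((rr, cc))
    # Fill 0-pixels of closed regions that are covered by black or brown.
    result = [row[:] for row in green]
    for r in range(rows):
        for c in range(cols):
            closed = green[r][c] == 0 and not open_zero[r][c]
            if closed and (black[r][c] == 1 or brown[r][c] == 1):
                result[r][c] = 1
    return result
```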
A textured region like a lake area, which is printed using
blue raster dots, may be filled using structural texture
analysis methods in combination with a texture element
grouping algorithm (Fumiaki et al., 1990). With this step
the separation of color layers is completed.
4.2 Recognition of raster symbols
Rotation- and size-invariant recognition of separate,
non-overlapping raster symbols and objects (e.g. tree symbols,
characters) can be obtained using a neural network based
technique (Lauterbach et al., 1991). The main algorithm
extracts rotation- and size-invariant feature vectors based
on polar distance measures. Several types of these measures
may be combined for the classification of a single
raster symbol or object, for example
* the distance from the center of gravity (CG) of the
raster object to its outermost border,
* the distance from the CG to the first change of pixel
value,
* the sum of the raster object pixels counted from the
CG.
All these measurements are determined for a predefined
number of directions depending on the object size. The
direction for the polar measurements starts from the main
axis of inertia of the object. Additional contour or
diameter measurements are necessary to distinguish
between object rotations of ±180°.
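As an illustration of the first measure in the list above (distance from the CG to the outermost border), the following sketch casts rays from the CG, starting at the principal axis computed from second-order central moments. The function name polar_distance_features, the fixed number of directions, the ray step of 0.5 pixels and the normalization by the longest ray are assumptions of this example. The additional measurements needed to resolve the half-turn ambiguity of the axis are not included.

```python
import math


def polar_distance_features(obj, n_dirs=8):
    """obj: rows of 0/1 values. Returns n_dirs normalized border distances."""
    pts = [(x, y) for y, row in enumerate(obj) for x, v in enumerate(row) if v]
    n = len(pts)
    cx = sum(x for x, _ in pts) / n  # center of gravity
    cy = sum(y for _, y in pts) / n
    # Second-order central moments give the main axis of inertia.
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    max_r = math.hypot(len(obj), len(obj[0]))
    feats = []
    for k in range(n_dirs):
        ang = theta + 2 * math.pi * k / n_dirs
        dx, dy = math.cos(ang), math.sin(ang)
        outer, r = 0.0, 0.0
        while r <= max_r:
            x, y = int(round(cx + r * dx)), int(round(cy + r * dy))
            if 0 <= y < len(obj) and 0 <= x < len(obj[0]) and obj[y][x]:
                outer = r  # farthest object pixel seen along this ray
            r += 0.5
        feats.append(outer)
    longest = max(feats) or 1.0  # normalize for size invariance
    return [f / longest for f in feats]
```

With four directions, a horizontal bar and the same bar rotated by 90 degrees yield the same feature vector, because the measurement always starts at the object's own main axis.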
The feature vectors are evaluated using a hierarchical
structure of multi-layer perceptrons. There is one perceptron
for the direct evaluation of each feature vector (stage-1
network). The number of network inputs corresponds to
the vector size, the number of outputs corresponds to the
size of the object set. The outputs of the stage-1 networks
are combined using the following equation
$$o_n = \sum_{m=1}^{n_f} i_{mn}\, w_m, \quad \text{with} \quad w_m = \frac{ls_{min}}{ls_m} \cdot \frac{1}{o_{max}} \tag{14}$$
where $n$ is the index of the output or input unit, $m$ is the
index of the stage-1 network and $n_f$ is the overall number
of the stage-1 networks. $o_{max}$ is the output with the
maximum activity. $w_m$ is a weight factor, $ls_m$ is the number
of learning steps necessary to train the stage-1 network $m$ and
$ls_{min}$ is the minimum number of learning steps that has
occurred.
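The combination stage can be sketched in a few lines. The sketch reads equation (14) as a weighted sum in which each stage-1 network m is weighted by the ratio of the minimum number of learning steps to its own number of learning steps, divided by its maximum output. Taking the maximum output per network is an interpretation, and the function and parameter names are hypothetical.

```python
def combine_stage1_outputs(outputs, learning_steps):
    """outputs[m][n]: output n of stage-1 network m.
    learning_steps[m]: learning steps needed to train network m."""
    ls_min = min(learning_steps)
    combined = [0.0] * len(outputs[0])
    for out_m, ls_m in zip(outputs, learning_steps):
        o_max = max(out_m)  # assumption: maximum activity of network m
        w_m = (ls_min / ls_m) / o_max
        for n, i_mn in enumerate(out_m):
            combined[n] += i_mn * w_m
    return combined
```

Under this reading, a network that needed fewer learning steps contributes more strongly, and dividing by the maximum output brings the networks to a comparable scale before they are summed.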
The output of this combination stage is fed into a further
perceptron, which makes the final decision about the raster
object classification. After recognizing a raster object, it
is deleted from the layer and the recognition result is put
into a temporary data base, where it is available for further
interpretation.
4.3 Separation of region-based and line-based layers
The region data and the line data included in a layer have to
be processed in different ways. Region data must be con-
tourized while line data must be vectorized. Thus, for
every layer it is necessary to detect whether it contains
mainly region or line structures.
This task is performed using a distance histogram based on
a medial axis transformation (Pavlidis, 1987). The histo-
gram values Di are calculated using the equation
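$$D_i = \sum_{y=1}^{n} \sum_{x=1}^{m} \delta\bigl(f_{med}(p_j(x,y)) - i\bigr) \tag{15}$$
$$\delta(x) = \begin{cases} 1 & \text{for } x = 0 \\ 0 & \text{otherwise,} \end{cases}$$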
where m and n are the image dimensions and pj(x,y) is the
value of the pixel represented by the coordinates x and y
in the layer j. Function fmed yields the minimum distance
of the pixel at position (x,y) from the raster object border.
The histogram of a line-based layer has a tall, narrow shape
concentrated at small distances, whereas a region-based layer
yields a wide histogram.
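The distance histogram can be illustrated with a small sketch. It assumes the city-block metric for the distance to the object border, counts only object pixels, and treats everything outside the image as background; the paper does not specify these details, and the function name distance_histogram is hypothetical.

```python
from collections import Counter, deque


def distance_histogram(layer):
    """Histogram of border distances over the object pixels of a layer."""
    rows, cols = len(layer), len(layer[0])
    # Pad with a background frame so the image exterior counts as border.
    padded = [[0] * (cols + 2)]
    padded += [[0] + list(row) + [0] for row in layer]
    padded += [[0] * (cols + 2)]
    dist = [[0 if v == 0 else None for v in row] for row in padded]
    queue = deque(
        (r, c)
        for r in range(rows + 2)
        for c in range(cols + 2)
        if padded[r][c] == 0
    )
    # Multi-source breadth-first search starting from all background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows + 2 and 0 <= cc < cols + 2:
                if dist[rr][cc] is None:
                    dist[rr][cc] = dist[r][c] + 1
                    queue.append((rr, cc))
    # Background pixels have distance 0 and are left out of the histogram.
    return Counter(d for row in dist for d in row if d)
```

For a one-pixel-wide line all counts fall into the bin for distance 1, while a filled 5 x 5 square spreads over the bins 1 to 3.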
4.4 Vectorization
Vectorization is performed on images of line structures that
are one pixel wide. Therefore, the region-based layers have to
be contourized. This is done using a contour tracing
algorithm described by Pavlidis (1987). The line-based layers
have to be thinned before they are vectorized. Most line
thinning algorithms are problematic, because they produce
a number of short line fragments connected to the skeleton
which do not exist in the line image. Therefore, we use an
algorithm which is not very fast but produces a clean medial
line of the raster objects in the input image. This algorithm
is based on a smoothing and stripping technique with a
skeleton adjustment to the medial line of the pattern
(Chu et al., 1986).
The vector data is based on nodes and vertices. In the first
vectorization step the nodes are extracted from the line
image. A node is represented by a pixel that has either fewer
or more than two neighboring pixels belonging to a line
segment. The second step is the conversion of the line
segments connecting the nodes into Freeman chain codes.
Some line structures like circles cannot be converted to
nodes and segments because they consist only of pixels
with two neighbors. These line structures are converted in
the third vectorization step. The vertices connecting the
nodes are created using a split and merge technique on the
Freeman coded line segments. Nodes that are directly
adjacent in the raster image have to be coalesced and
the vertices connected to them have to be either corrected
or deleted. Finally, attribute data like color, line width or
variance of the line width is extracted from the raster image
for each vertex.
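The first two vectorization steps can be sketched as follows, assuming a clean one-pixel-wide skeleton with 8-connectivity. The direction numbering (0 = east, counted counter-clockwise, with rows growing downward) is one common Freeman convention and is an assumption here, as are all function names. The split and merge step and the treatment of closed loops are not covered.

```python
# Freeman codes for (row step, column step); code 0 points east.
FREEMAN = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
           (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}


def neighbors(skel, pixel):
    """Return the 8-neighbors of pixel that belong to the skeleton."""
    r, c = pixel
    found = []
    for dr, dc in FREEMAN:
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(skel) and 0 <= cc < len(skel[0]) and skel[rr][cc]:
            found.append((rr, cc))
    return found


def find_nodes(skel):
    """Nodes are skeleton pixels with fewer or more than two neighbors."""
    return [(r, c)
            for r, row in enumerate(skel)
            for c, v in enumerate(row)
            if v and len(neighbors(skel, (r, c))) != 2]


def trace_segment(skel, start, nodes):
    """Follow the line from a node until the next node is reached.
    Simplification: only the first unvisited branch is followed."""
    node_set, path, visited = set(nodes), [start], {start}
    current = start
    while True:
        unvisited = [p for p in neighbors(skel, current) if p not in visited]
        if not unvisited:
            return path
        current = unvisited[0]
        path.append(current)
        visited.add(current)
        if current in node_set:
            return path


def freeman_chain(path):
    """Convert an ordered pixel path into a list of Freeman codes."""
    return [FREEMAN[(r2 - r1, c2 - c1)]
            for (r1, c1), (r2, c2) in zip(path, path[1:])]
```

A closed ring of pixels produces an empty node list in this sketch, which is why circle-like structures need the separate treatment mentioned in the text.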
4.5 Refinement of vector data
Although the skeleton created by the line thinning algorithm
of Chu et al. (1986) is of high quality, there may be
some unnecessary lines and nodes in the thinned image.
These lines will also be vectorized. They may be removed