egion number 415
location of the
415
228
,4 249.4
119.7
313.0
: 173.3
3.5
0.5
100
i 164
3 273
69
4
.1 50.9
.0 49.5
.8 9.0
ire another measure
le number of
be divided by its
:suyama(1980).
by deviding the
:onvex hull,
iphs can be created
mcy matrix is
:oordinates and the
tself.
:s of the segmented
converted to a
ilygons can be
consists only of
le 2.
Tories, where
d of the category
roperty list of the
lation. The results
ns are listed in
o real correlation
ategories. Only by
ricultural landuse
or rotation of the
ese factors are
including pure
ill only be
ined on a higher
suyama(1980) i.e.
ings, woods,
ly there will be
reen elongatedness
issification first
Table 3. Correlation tests on fields against their
agricultural class. Explanation: CATEGORY (CAT) versus
Perimeter (PERIM), Size (SIZE), Rotation angle as
described above (ROT1), Elongatedness (ELONG), Fit in
MBR (FIT), a second rotation measurement (ROT2), and
the means of Channels 1 to 4 (CH1 to CH4).
Upper coefficient = Pearson correlation coefficient,
lower coefficient = significance probability of the
correlation.

        PERIM    SIZE     ROT1     ELONG    FIT
CAT     .1308    .1179    .0252    .0699    .0142
        .0353    .0580    .6861    .2625    .8197

        CH1      CH2      CH3      CH4      ROT2
CAT     .2286    .3271    .3164    .3268    .0760
        .0002    .0001    .0001    .0001    .2229
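The kind of test summarised in Table 3 can be sketched as follows. This is a minimal illustration and not the procedure used for the table: the Pearson coefficient is computed directly from its definition, and the significance probability is estimated with a permutation test, which is an assumption standing in for whatever parametric test produced the published values. The function names and the sample data are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided significance estimate: the share of random shufflings of ys
    whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible exact zero.
    return (hits + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Hypothetical data: a numeric category code and a channel mean per field.
    category = [1, 1, 2, 2, 3, 3, 4, 4, 5, 6]
    channel_mean = [41.0, 44.5, 47.2, 45.1, 52.3, 50.8, 55.0, 57.4, 58.1, 63.9]
    print(pearson_r(category, channel_mean))
    print(permutation_p_value(category, channel_mean))
```

A small coefficient with a large significance probability, as for the form parameters in Table 3, indicates that the attribute carries little information about the category.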
an expert has to be consulted.
It is also almost impossible to formulate a spatial
'logic' rule of neighbourhood for these categories in
this area. A bridge will be connected to a river, and a
car will be adjacent to a road, a house or a parking
lot, but what will be the neighbours of a wine field,
moorland or alfalfa?
Leaving spatial 'reasoning' aside, only the
multispectral properties of the segments remain.
5 THE CLASSIFICATION
Previous attempts by Megier (1984) at pixel-by-pixel
classification with a maximum likelihood classifier
showed that an unweighted average of 51.3 percent of
the pixels in the test areas could be classified
correctly. With the 'contours' of each field available
together with all the image statistical parameters, a
per-field classification can be applied.
Each field is represented by a vector in a
multi-dimensional space spanned by the attributes
from the property list.
A clustering algorithm using the "mutual nearest
neighbour" (1974) is applied to 1232 regions. Tests
with different variables have been carried out,
clustering down to 40 nodes.
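The idea of clustering field vectors down to a fixed number of nodes can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm used in this study: clusters are represented by their centroids, two clusters are merged when each is the nearest neighbour of the other, and distances are Euclidean. The function names and the sample vectors are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equally long vectors."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def mutual_nn_cluster(vectors, n_clusters):
    """Merge mutually nearest clusters until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[tuple(v)] for v in vectors]
    while len(clusters) > n_clusters:
        cents = [centroid(c) for c in clusters]
        # Nearest neighbour (by centroid distance) of every cluster.
        nearest = [
            min((j for j in range(len(cents)) if j != i),
                key=lambda j: math.dist(cents[i], cents[j]))
            for i in range(len(cents))
        ]
        # Mutual pairs: i points to j and j points back to i.
        pairs = sorted(
            (math.dist(cents[i], cents[j]), i, j)
            for i, j in enumerate(nearest)
            if i < j and nearest[j] == i
        )
        # Merge the closest mutual pairs first, never going below n_clusters.
        budget = len(clusters) - n_clusters
        absorbed = set()
        for _, i, j in pairs[:budget]:
            clusters[i] = clusters[i] + clusters[j]
            absorbed.add(j)
        clusters = [c for k, c in enumerate(clusters) if k not in absorbed]
    return clusters

if __name__ == "__main__":
    # Hypothetical field vectors, e.g. (median channel 2, median channel 3).
    fields = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
    for cluster in mutual_nn_cluster(fields, 3):
        print(cluster)
```

Because the nearest neighbour of a cluster is unique in this sketch, mutual pairs never overlap, so all of them can be merged in one pass.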
The training polygons are used to assign the
cluster labels to one of the six classes. Furthermore,
the ability of the clustering procedure to separate
the training regions themselves is tested and given as
a percentage. Only when the training regions reach an
'acceptable' level of identification do the results
for the test regions have a meaning. Although the
correlation test showed little relation between the
form parameters and the categories, these parameters
were available, so some clusterings have been made
with them to confirm that these attributes are
meaningless for agricultural classification in this
area. Table 4 gives the results.
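Assigning cluster labels to classes by means of the training regions, and scoring the result as a percentage, can be sketched as follows. The sketch assumes a simple majority vote per cluster, which is only one possible assignment rule. The function names, class names and data are hypothetical.

```python
from collections import Counter, defaultdict

def label_clusters(cluster_ids, classes):
    """Map each cluster id to the most frequent class among the
    training regions that fell into that cluster."""
    votes = defaultdict(Counter)
    for cid, cls in zip(cluster_ids, classes):
        votes[cid][cls] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}

def percent_correct(cluster_ids, classes, cluster_to_class):
    """Percentage of regions whose cluster label matches their known class.
    Regions in a cluster without a label count as incorrect."""
    hits = sum(
        1 for cid, cls in zip(cluster_ids, classes)
        if cluster_to_class.get(cid) == cls
    )
    return 100.0 * hits / len(classes)

if __name__ == "__main__":
    # Hypothetical training regions: cluster id and known class per region.
    train_clusters = [0, 0, 0, 1, 1, 2]
    train_classes = ["maize", "maize", "wine", "wine", "wine", "alfalfa"]
    mapping = label_clusters(train_clusters, train_classes)

    # Hypothetical test regions; cluster 3 holds no training region.
    test_clusters = [0, 1, 2, 3]
    test_classes = ["maize", "maize", "alfalfa", "wine"]

    print(percent_correct(train_clusters, train_classes, mapping))  # TR
    print(percent_correct(test_clusters, test_classes, mapping))    # TE
```

The first figure corresponds to the separability of the training regions (TR), the second to the performance on the test regions (TE).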
Being on the slippery path of statistics, a few
things can be noticed. An ideal situation is
simulated, that is, starting with only a few training
areas and testing the performance on many known
fields. Therefore the most important results in
Table 4 are the columns TR and TE. Column TR shows
how well the training fields are chosen for the
classification in this scheme. The higher TR, the more
meaning TE has. The results show that the median of
channel 1 or 2 together with channel 3, optionally
combined with FIT (probably accidental), gives the
best performance and increases the accuracy compared
to the per-pixel classification by 15 percent. To this
result must be added that the medians of channels 2
and 3 can describe 80.1 percent of the test regions.
Problems arise when clusters must be assigned to a
ground truth class. Almost every cluster will appear
in more than one class. With a simple Bayesian
decision rule a cluster is relabelled with a class.
This leads to the result that the biggest class gets
the best performance, which is not satisfying. Other
(non-parametric and parametric) classifiers must be
applied to the data. Together with a more precise
choice of the ground truth classes and a better
segmentation procedure, the results will probably
improve.

Table 4. Results of clustering the segmented image,
where
ME = mean over all classes of the percentage of
     correct classification
OV = percentage of correct classification of training
     and test regions together
TR = percentage of correct classification of training
     regions
TE = percentage of correct classification of test
     regions
LO = number of ground truth classes to which no
     cluster could be assigned (lost classes)
CL = number of clusters that could not be assigned to
     a ground truth class
IN = variables used to create the vectors
-  = not calculated, because ground truth classes are
     lost after clustering

ME     OV     TR     TE     LO   CL   IN
-      -      62.8   28.0   1    5    ROTATION
-      -      55.7   37.1   1    13   ELONGATEDNESS
20.6   45.2   57.7   42.1   0    7    FIT
20.1   40.0   60.4   32.8   0    14   ROT,ELONG,FIT
40.7   66.2   76.2   63.7   0    10   FIT, MED. CH 2,3
32.7   60.0   67.5   57.0   0    16   ROT,ELONG,FIT
43.4   66.0   77.0   62.7   0    10   MEDIANS CHAN. 2,3
42.1   67.4   76.3   64.0   0    9    MEDIANS CHAN. 1,3
40.3   63.4   78.1   58.6   0    9    MEANS CHAN. 2,3
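The difference between the OV and ME figures of Table 4 can be illustrated with a small calculation. The sketch below assumes that OV is the share of all regions classified correctly and that ME is the unweighted mean of the per-class shares, following the table legend. The function name and the data are hypothetical.

```python
from collections import Counter

def overall_and_mean_accuracy(true_classes, predicted_classes):
    """Return (OV, ME) in percent.

    OV: correctly classified regions over all regions.
    ME: mean of the per-class percentages of correct classification.
    """
    totals = Counter(true_classes)
    correct = Counter(
        t for t, p in zip(true_classes, predicted_classes) if t == p
    )
    ov = 100.0 * sum(correct.values()) / len(true_classes)
    me = sum(100.0 * correct[c] / totals[c] for c in totals) / len(totals)
    return ov, me

if __name__ == "__main__":
    # Hypothetical case: one big class A and one small class B, with every
    # cluster relabelled as the big class.
    true_classes = ["A"] * 8 + ["B"] * 2
    predicted = ["A"] * 10
    ov, me = overall_and_mean_accuracy(true_classes, predicted)
    print(ov, me)
```

When the decision rule favours the biggest class, OV stays high while ME drops, which matches the gap between the two columns in the table.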
6 EXECUTION TIME
All programs were developed on a VAX 785 mini-computer
in VAX Pascal with some calls to Fortran subroutines
(mainly for image file I/O). System management allows
tasks with a size of 4 Mbyte, leaving 4 Mbyte for the
system and overhead. The image consists of 232 by 216
pixels. All times mentioned are execution times
measured with a single user on the system.
The edge preserving smoothing takes 40 seconds per
iteration.
Edge extraction, connecting and cleaning take 5
minutes.
The edge tracking and region storage program works
with a 3 by 3 subimage that follows the inside of a
region. This requires a buffer of 3 adjacent lines.
While the edge is followed, this buffer has to be
updated, depending on the direction of movement,
either by renumbering the buffer indices or by reading
a new line into an unused part of the buffer. The
number of I/O operations the program has to perform
depends on the number of intersections of edges with
each scanline. This is very time consuming. Therefore
two versions of this program have been written. The
first version actually performs the I/O from disk n
times. The second version reads the edge map into
memory and copies the required line into the wanted
place of the buffer (memory to memory). The advantage
of the first version is that it can be run on
virtually every computer. The second version needs
much more memory but reduces the edge tracking time
from around 50 minutes to 10 minutes. The number of
tracked fields is about 3000.
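The three-line buffer can be sketched as below. This is an illustration only and not the Pascal implementation: lines that are still needed after a move are kept (the counterpart of renumbering the buffer indices), and only missing lines are fetched through a `read_line` callback, which may read from disk or copy from an edge map held in memory. The class name, its methods and the read counter are hypothetical.

```python
class ThreeLineBuffer:
    """Keeps image lines y-1, y and y+1 around the current tracking row."""

    def __init__(self, read_line, n_lines, y=1):
        self.read_line = read_line  # callback: row index -> sequence of pixels
        self.n_lines = n_lines
        self.reads = 0              # number of line fetches performed so far
        self.lines = {}             # row index -> pixels, at most 3 entries
        self.y = y
        self.move_to(y)

    def _fetch(self, row):
        self.reads += 1
        return self.read_line(row)

    def move_to(self, y):
        """Centre the buffer on row y, reusing lines already held."""
        wanted = [r for r in (y - 1, y, y + 1) if 0 <= r < self.n_lines]
        self.lines = {
            r: self.lines[r] if r in self.lines else self._fetch(r)
            for r in wanted
        }
        self.y = y

    def window(self, x):
        """3 by 3 subimage centred on (x, y); assumes an interior pixel."""
        return [list(self.lines[r][x - 1:x + 2])
                for r in (self.y - 1, self.y, self.y + 1)]

if __name__ == "__main__":
    # Hypothetical 5 by 5 edge map held in memory (the second program version).
    edge_map = [[10 * r + c for c in range(5)] for r in range(5)]
    buf = ThreeLineBuffer(lambda r: edge_map[r], n_lines=5, y=1)
    print(buf.window(1))
    buf.move_to(2)   # one new line is needed
    buf.move_to(1)   # moving back needs line 0 again
    print(buf.reads)
```

Moving back and forth across scanlines fetches the same lines repeatedly, which is why the number of fetches grows with the number of edge intersections per scanline and why serving them from memory instead of disk pays off.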
REFERENCES
Devijver, P.A. 1974. On a new class of bounds on
Bayes risk in multihypothesis pattern
recognition. IEEE Trans. on Computers, vol.70.
Freeman, H. 1961. On the encoding of arbitrary
geometric configurations. IRE Trans., vol. EC-10,
p.266-268, June 1961.
Freeman, H. 1974. Computer processing of line
drawing images. ACM Comput. Surveys, vol.6,
p.57-79.