ISPRS Commission III, Vol. 34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
where n is the number of normals.
4 METHODOLOGY
In order to select features of buildings in aerial images
that are both statistically discriminating and judged
relevant by photo-interpreters, we first build a
ground-truthed database of segmented images. We then obtain
subjective scores by having evaluators vote on the quality
of the segmentations. The evaluators view the segmentation
outcomes of the algorithms side by side with their
originals. Concurrently, we extract objective performance
scores from the segmented images and their ground truths,
based on feature differences as in Section 3. Finally, we
carry out a canonical analysis of the subjective and
objective quality scores in order to obtain the best
possible match between these two tables and thus determine
the relevant "psychovisual features". The details of the
proposed methodology are described in the following
paragraphs.
4.1 Construction of the Segmented Image Database
We have selected four characteristic segmentation
algorithms, which belong to the segmentation paradigms based
on image discontinuity, image similarity and feature-space
clustering. Several variants of each algorithm were obtained
by adjusting its parameters. We used 1) a split-and-merge
algorithm by Suk (Suk and Chung, 1983) (three parameters to
be set); 2) the Canny-Deriche edge detector followed by
hysteresis thresholding and edge closing (four parameters
involved); 3) a feature-space algorithm that uses watersheds
of the image histogram, smoothed by a multi-fractal measure
(Kam, 2000) (four thresholds); 4) an image similarity
algorithm, the seeded region-growing algorithm (Gagalowicz
and Monga, 1985) (one parameter involved).
As image material, we have chosen nine sub-images from an
IKONOS image (1 meter resolution) of the area of Algiers.
The scenes are rich in polygonal buildings. For each image
we have established the ground truth by manual tracing with
a photo-interpreter tool. Seven of the nine images were each
segmented with 20 variations of the segmentation algorithms,
obtained with different parameter settings, and the
remaining two with only 10 variations. This gave us a total
of 160 (= 20*7 + 10*2) segmented scenes to be voted on
(Letournel, 2000). In the sequel we will refer to the
segmentation result obtained with a given algorithm and a
given parameter setting simply as a "segmentation".
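
The size of the database can be checked with a few lines of bookkeeping. The following is only a minimal sketch: the counts come from the text, while the image names and their ordering are placeholders.

```python
# Hypothetical bookkeeping of the segmentation database.
# Only the counts (nine images, 20 or 10 variations each) come from the text;
# the image names "image_1" ... "image_9" are placeholders.
variants_per_image = {f"image_{k}": 20 for k in range(1, 8)}   # seven images
variants_per_image.update({f"image_{k}": 10 for k in (8, 9)})  # two images

# One database entry per (image, variation index) pair.
database = [(name, v)
            for name, count in variants_per_image.items()
            for v in range(count)]

total = len(database)
print(total)
```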
4.2 Subjective segmentation measures
A group of subjects evaluated the set of segmented images
and gave their assessment marks. The marks were in the
[-2, 2] range, going from the lowest quality mark of '-2',
meaning "unacceptable", to '+2', meaning "near perfect". The
subjects could view side by side the segmented test image
and its "perfect" ground-truth segmented version. An
instance of the test image is shown in figure 3 (edges are
in white) and its reference image in figure 4. To avoid
any fatigue effects on the voters we decided to partition
the segmentation database of 160 images into 4 groups of
40 images. Each voter was randomly assigned to one of
these 4 groups. We made sure that each group contained a
fair distribution of "good" and "bad" segmentations.
Figure 3: A segmented image to be marked (edges are
presented in white).
Figure 4: Ground-truth segmentation used as reference to
mark the segmentation in figure 3 (edges are in white).
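
The text does not say how the fair distribution of "good" and "bad" segmentations was enforced. One possible way to build such groups is sketched below, assuming that a rough prior quality value is available for every segmentation. The function `partition_balanced` and the synthetic `quality` values are hypothetical illustrations, not the procedure actually used.

```python
import random

def partition_balanced(quality, n_groups=4, seed=0):
    """Split item indices into n_groups groups with a similar quality mix.

    quality: one rough quality value per segmentation (a hypothetical prior).
    Items are sorted by quality and cut into consecutive blocks of n_groups
    items; each block is shuffled and dealt out, one item per group.
    """
    rng = random.Random(seed)
    order = sorted(range(len(quality)), key=lambda i: quality[i])
    groups = [[] for _ in range(n_groups)]
    for start in range(0, len(order), n_groups):
        block = order[start:start + n_groups]
        rng.shuffle(block)
        for g, idx in enumerate(block):
            groups[g].append(idx)
    return groups

# Synthetic prior quality values in the [-2, 2] mark range (illustration only).
data_rng = random.Random(1)
quality = [data_rng.uniform(-2, 2) for _ in range(160)]

groups = partition_balanced(quality)
group_sizes = [len(g) for g in groups]
group_means = [sum(quality[i] for i in g) / len(g) for g in groups]
print(group_sizes, group_means)
```

Because every block of four neighbouring quality values contributes one item to each group, each group receives both weak and strong segmentations, and the group means stay close to each other.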
4.3 Elaboration of the features space
We first applied principal component analysis (PCA) to the
subjective features, with the goal of ascertaining the
coherence among the evaluators. Secondly, we applied PCA to
the objective features to see whether a more appropriate
subspace would describe them. Finally, we studied the two
sets (objective and subjective features) jointly using
Canonical Analysis (CA). Recall that the aim of this
statistical tool (Saporta, 1990) is to reveal any linear
relationship that may exist between two sets of
quantitative measurements on the same events.
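
As an illustration of what canonical analysis computes, the sketch below obtains the canonical correlations between two sets of measurements with NumPy. It uses a standard QR-plus-SVD formulation, which is not necessarily the implementation used in this work. The data are synthetic, and the number of evaluators (`q = 5`) is an assumption made only for the example.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X (n x p) and Y (n x q).

    Assumes both centered matrices have full column rank and n > max(p, q).
    """
    Xc = X - X.mean(axis=0)          # center each variable
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)         # orthonormal basis of the first set
    Qy, _ = np.linalg.qr(Yc)         # orthonormal basis of the second set
    # Singular values of Qx^T Qy are the cosines of the principal angles
    # between the two subspaces, i.e. the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic example: 40 segmentations, 12 objective features,
# 5 evaluators (the evaluator count is a placeholder).
rng = np.random.default_rng(0)
n, p, q = 40, 12, 5
X = rng.normal(size=(n, p))          # objective features
Y = rng.normal(size=(n, q))          # subjective marks
Y[:, 0] = X @ rng.normal(size=p)     # one mark column exactly linear in X

rho = canonical_correlations(X, Y)
print(rho)
```

The first canonical correlation is close to 1 here because one column of `Y` was constructed as an exact linear combination of the columns of `X`. With only 40 observations the remaining values can still be sizeable by chance, so they should be read with care.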
More formally when n events are described by two sets of
variables (respectively, of dimension p and q), one searches
for a linear combination of variables of set 1 (P) and a lin-
ear combination of variables of set 2 (C) that are the most
correlated with each other. In our context these two sets
are obviously the set of objective features measured (P)
and the set of marks given by evaluators (C). The n obser-
vations consist of the 40 segmentations in each group, each
taking place in R™. Notice that the CA is run separately for
each one of the four groups. Let us denote by x^1, ..., x^p
the objective feature measurements, where p = 12 and
component x^i is the ith feature. All this data is organized in