ISPRS Commission II, Vol.34, Part 3A „Photogrammetric Computer Vision“, Graz, 2002
of misclassified pixels may not, after all, be a good
quality measure. We expect the difference score assigned
by the evaluation algorithm to reflect the subjectively
perceived differences well. A simple pixel-to-pixel
comparison of the experimental and ground-truth
segmentation maps may therefore prove inadequate, since it
ignores how an operator perceives the differences between
the ground truth and the actual object.
2.1 The Tverskian approach
A measure which takes subjective assessment of similar-
ity into account is the “feature contrast model” first pro-
posed by Tversky (Santini and Jain, 1999). In the Tver-
skian approach, objects are characterized by a set of bi-
nary attributes, and (dis)similarity is measured in the at-
tribute space, relying on the notion of psycho-visual simi-
larity. Tversky considers separately the effect of matching
features between objects as well as the aspects in which
they differ. They are represented by binary values so that
stimuli of the perceiver are characterized by the presence
or absence of these features.
However, it is cumbersome to represent numerical
(non-categorical) values in this way. Furthermore, in
computer vision one usually cannot obtain binary features
because of noise in the measurements. This has led Santini
and Jain (1999) to introduce fuzzy predicates into the
contrast model. The similarity between two fuzzy sets Φ
and Ψ corresponding to measurements made on two images
(for example, Φ: measurements on the ground truth, and Ψ:
measurements on the segmentation result) is then expressed as:
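$$S(\Phi,\Psi) = \sum_{i=1}^{p} \min\{\mu_i(\Phi), \mu_i(\Psi)\} - \alpha \sum_{i=1}^{p} \max\{\mu_i(\Phi) - \mu_i(\Psi), 0\} - \beta \sum_{i=1}^{p} \max\{\mu_i(\Psi) - \mu_i(\Phi), 0\} \tag{1}$$
where μ_i is the membership function of the i-th predicate,
and p is the number of predicates measured on the images.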
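As an illustration, the fuzzy contrast model of Eq. (1) can be sketched in a few lines of Python. This is only a minimal sketch: the function name, the default weights and the membership values below are hypothetical and chosen for illustration, not taken from Santini and Jain (1999) or from our experiments.

```python
from typing import Sequence


def fuzzy_contrast_similarity(
    mu_phi: Sequence[float],
    mu_psi: Sequence[float],
    alpha: float = 1.0,
    beta: float = 1.0,
) -> float:
    """Fuzzy feature contrast similarity between two sets of memberships.

    mu_phi[i] and mu_psi[i] are the membership values (in [0, 1]) of the
    i-th predicate measured on the two images. alpha and beta weight the
    two distinctive-feature terms; their defaults here are assumptions.
    """
    if len(mu_phi) != len(mu_psi):
        raise ValueError("both inputs must cover the same predicates")
    # Common features: sum of min(mu_i(phi), mu_i(psi)).
    common = sum(min(a, b) for a, b in zip(mu_phi, mu_psi))
    # Features more present in phi than in psi.
    only_phi = sum(max(a - b, 0.0) for a, b in zip(mu_phi, mu_psi))
    # Features more present in psi than in phi.
    only_psi = sum(max(b - a, 0.0) for a, b in zip(mu_phi, mu_psi))
    return common - alpha * only_phi - beta * only_psi


# Hypothetical membership values for three predicates.
mu_ground_truth = [0.9, 0.2, 0.7]
mu_segmentation = [0.8, 0.5, 0.7]
score = fuzzy_contrast_similarity(
    mu_ground_truth, mu_segmentation, alpha=0.5, beta=0.5
)
print(score)
```

Note that the measure is symmetric only when alpha equals beta; unequal weights let differences in one direction count more than in the other.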
2.2 The proposed approach
In an effort to establish segmentation features relevant to
the human judgement we use the method of canonical
analysis of tables between two feature spaces. One feature
space consists of objective features of the segmented
"building" object; the other feature space consists of
subjective features of the same object, expressed as
categorical quality points given by people. We transform one space
toward the other to render them as "parallel" as possible.
The degree of parallelism achieved is a measure of the rele-
vance of feature set combinations to the human judgement.
Presently our approach is not based on fuzzy membership
functions as in (Santini and Jain, 1999) but on the
predictability of one set of variables (subjective votes) by
another set of variables (objective features). Thus the similarity
measure is a linear combination of feature values.
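To make the idea of predicting subjective votes from objective features concrete, the sketch below fits a linear combination of feature values by ordinary least squares. This is a simplified stand-in and not the canonical analysis itself; the function name, the use of NumPy, and the toy feature and vote values are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_score(features, votes):
    """Least-squares fit of votes as a linear combination of features.

    features: (n_objects, n_features) objective feature values.
    votes:    (n_objects,) subjective quality points.
    Returns (weights, bias, r_squared), where r_squared measures how well
    the votes are predicted by the features.
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(votes, dtype=float)
    # Append a constant column so the fit includes a bias term.
    X1 = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    pred = X1 @ coef
    ss_res = float(np.sum((y - pred) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return coef[:-1], float(coef[-1]), 1.0 - ss_res / ss_tot


# Toy data (made up): two objective features for five segmented buildings,
# and votes constructed as 2*f1 - f2 + 0.5 so that the fit is exact.
toy_features = [[1.0, 0.9], [0.8, 0.7], [0.5, 0.6], [0.3, 0.2], [0.1, 0.4]]
toy_votes = [1.6, 1.4, 0.9, 0.9, 0.3]
weights, bias, r2 = fit_linear_score(toy_features, toy_votes)
print(weights, bias, r2)
```

With real votes the fit would not be exact, and the fraction of explained variance gives a rough indication of how relevant a feature set is to the human judgement.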
We deal with IKONOS images (1 meter resolution) of
semi-urban areas, and the task is to delineate polygonal
buildings. Two categories of objective features are
considered: features specific to the geometry of buildings,
called intrinsic features, and features related to the
appearance models, that is, the gray-level contextual
information, called extrinsic features. In the first set, we
focus on form and size (a rectangular and/or big blob is more
significant than a small and/or non-rectangular one),
parallelism of the opposite sides, number of corners, and
regularity of edges (e.g., a closed and smooth contour is
more significant). In the second set, we pay attention to the
shadow effects near the edges, gray-level uniformity inside
the region, contrast with the immediate surroundings, etc.
3 FEATURES OF BUILDINGS IN AERIAL IMAGES
The intrinsic and extrinsic features of segmented buildings
that we consider are listed below. In what follows, Z will
denote a generic segmented building region.
3.0.1 The intrinsic features These features are mea-
sured vis-a-vis the ground-truth data :
1. The elongation index, IA(Z) (Coster and Chermant,
1985), of a segmented region Z is defined as:
$$\mathit{IA}(Z) = \frac{\pi L_g^2(Z)}{4 A(Z)} \tag{2}$$
where L_g(Z) is the geodesic diameter of Z and A(Z)
is its area. Note that for a disk its value is minimum
and equal to 1. The elongation index can be instrumental
in distinguishing, for example, a "U form"
from a rectangle having the same perimeter and area.
A case in point would be a multi-winged building.
2. The compactness C(Z) is defined by:
$$C(Z) = \frac{4\pi A(Z)}{p^2(Z)} \tag{3}$$
where p(Z) is the perimeter of the boundary of Z.
Recall that C(Z) = 1 for a disk, and goes to zero for
very elongated forms or regions with severely jagged
edges.
3. The bounding box BB(Z) (Coster and Chermant,
1985) is constructed along the inertial directions of
the extracted region. Two features have been ex-
tracted indicating the degree of rectangularity of the
region. The first one is the excess difference and the
second one is the excess ratio, respectively, of the
building pixels and of its bounding-box pixels :
$$D(Z) = A(BB(Z)) - A(Z) \tag{4}$$
$$R(Z) = \frac{A(Z)}{A(BB(Z))} \tag{5}$$
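The shape features listed so far can be illustrated with a short Python sketch that works from scalar measurements (area, perimeter, geodesic diameter, bounding-box area). The function names are hypothetical. The sketch takes the excess difference to be A(BB(Z)) - A(Z) and the excess ratio to be A(Z)/A(BB(Z)), which is our reading of the definitions above. It also relies on the fact that, for a convex region, the geodesic diameter equals the largest Euclidean distance between two of its points.

```python
import math


def elongation_index(geodesic_diameter: float, area: float) -> float:
    """pi * L_g^2 / (4 * A); equals 1 for a disk."""
    return math.pi * geodesic_diameter ** 2 / (4.0 * area)


def compactness(area: float, perimeter: float) -> float:
    """4 * pi * A / p^2; equals 1 for a disk, tends to 0 for thin shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


def excess_difference(area: float, bbox_area: float) -> float:
    """Bounding-box area minus region area (0 for a perfect rectangle)."""
    return bbox_area - area


def excess_ratio(area: float, bbox_area: float) -> float:
    """Region area over bounding-box area (1 for a perfect rectangle)."""
    return area / bbox_area


# Disk of radius 3: both indices should be (numerically) 1.
r = 3.0
disk_elongation = elongation_index(2.0 * r, math.pi * r ** 2)
disk_compactness = compactness(math.pi * r ** 2, 2.0 * math.pi * r)

# 4 x 1 rectangle: convex, so its geodesic diameter is the diagonal.
a, b = 4.0, 1.0
rect_elongation = elongation_index(math.hypot(a, b), a * b)
rect_compactness = compactness(a * b, 2.0 * (a + b))

# Hypothetical region of area 6 whose bounding box has area 8.
region_excess_difference = excess_difference(6.0, 8.0)
region_excess_ratio = excess_ratio(6.0, 8.0)

print(disk_elongation, disk_compactness)
print(rect_elongation, rect_compactness)
print(region_excess_difference, region_excess_ratio)
```

On real segmentation masks, the area, perimeter, geodesic diameter and inertia-aligned bounding box would first have to be measured from the pixels; that step is left out of this sketch.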