ISPRS Commission III, Vol.34, Part 3A "Photogrammetric Computer Vision", Graz, 2002
FEATURE EXTRACTION FOR QUALITY ASSESSMENT OF AERIAL IMAGE SEGMENTATION
V. Letournel a,b,*, B. Sankur c, F. Pradeilles b and H. Maître a
a Dept. TSI, ENST, 75634 Paris Cedex 13, France - letour@tsi.enst.fr
b DGA/Centre Technique d'Arcueil, 16 bis av. Prieur de la Côte d'Or, 94114 Arcueil Cedex, France - valerie.letournel@etca.fr
c Bogazici University, 80815 Bebek Istanbul, Turkey - sankur@boun.edu.tr
KEY WORDS: Protocol, Performance, Segmentation, Interpretation, Features, IKONOS, Statistics
ABSTRACT
We present a new evaluation methodology and a feature extraction scheme for segmentation algorithms in the context
of photo-interpretation. The novelty of the proposed methodology is that subjective evaluation marks are involved in
the determination of the feature subspace. In fact, our aim is to determine features in alignment with the perception of
photo-interpreters, alternatively called psychovisual features. The proposed methodology was applied to the detection
of building targets in aerial images. More specifically, we considered the delineation of polygonal buildings in
semi-urban areas on IKONOS images (1 meter resolution). From the images we computed various objective performance
measures and, concurrently, collected the votes of a jury of evaluators. The concordance between objective features
and subjective marks was established by the canonical analysis of tables.
1 INTRODUCTION
A plethora of image segmentation algorithms have been
advanced in the last decades, and their variety is still on the
increase. There is therefore an urgent need for techniques to
assess objectively the merits and performance advantages
of these algorithms in the context of various vision tasks.
A seminal work in this direction is the method developed
by Zhang (Zhang, 1996), which is based on the accuracy
of feature measurements of the segmented objects.
In the taxonomy of methods for the evaluation of segmen-
tation algorithms several approaches can be distinguished.
One class, the a priori methods (Ji and Haralick, 1999), tries
to predict the algorithmic performance vis-à-vis generic
inputs, before any implementation. Another class, the a
posteriori methods, needs the actual output of the algorithm
and uses, in the absence of a ground-truth reference, 'goodness
of segmentation' measures (Huang and Dom, 1995).
These measures are based on the characterization of the
outcome, such as the consistency of features within the
segments, smoothness along the contours or high contrast
across the boundaries. However, the most common evaluation
method in the literature relies on the discrepancy
measures as in (Kanungo and Haralick, 1995), that is, the
differences between an ideal segmentation map, called the
"ground-truth" and the actual segmentation outcome. The
typical difference criteria are missed object pixels, false
alarm pixels, localization errors, mismatch of edges, shape
discrepancy, etc.
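As an illustration of the discrepancy measures mentioned above, the following minimal sketch counts missed object pixels and false alarm pixels between a ground-truth map and a segmentation outcome. It assumes both maps are binary masks of equal size (1 = object pixel); the function name, the returned keys and the toy masks are hypothetical and are not taken from the paper or from the cited works.

```python
def discrepancy_counts(ground_truth, segmentation):
    """Count missed object pixels and false alarm pixels.

    Both arguments are equally sized 2-D lists of 0/1 values
    (1 = object pixel). Names and return keys are illustrative
    assumptions, not the paper's notation.
    """
    if len(ground_truth) != len(segmentation):
        raise ValueError("masks must have the same number of rows")
    missed = false_alarm = object_pixels = 0
    for gt_row, seg_row in zip(ground_truth, segmentation):
        if len(gt_row) != len(seg_row):
            raise ValueError("mask rows must have the same length")
        for gt, seg in zip(gt_row, seg_row):
            if gt:
                object_pixels += 1
            if gt and not seg:
                missed += 1        # object pixel absent from the outcome
            elif seg and not gt:
                false_alarm += 1   # outcome pixel outside the true object
    return {
        "missed": missed,
        "false_alarm": false_alarm,
        "object_pixels": object_pixels,
    }


# Toy 4x4 example: a 2x2 building and an outcome shifted to the right.
ground_truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
segmentation = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
counts = discrepancy_counts(ground_truth, segmentation)
print(counts)
```

Such raw pixel counts treat every erroneous pixel alike, regardless of where it lies on the object, so they say nothing about the shape of the error.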
It is more relevant to evaluate the usefulness of an image
segmentation algorithm in the context of a specific task
rather than try to address the general segmentation perfor-
mance issue. A case in point is the photo-interpretation
of aerial images where we want to assess how much spe-
cific algorithms and/or features aid in the completion of
vision tasks. In vision tasks such as target detection, battle
damage assessment, and delineation of buildings and man-made
objects, the segmentation map represents intermediate-level
intelligence for the human operators. It is therefore
necessary that the delineation of the objects and the features
emphasized be in concordance with the expectations
of the photo-interpreters, and hence satisfy human vision
requirements.
In this work we limit our task to the extraction of buildings
in aerial images. We first explore the relevant features that
characterize buildings in aerial images, with the ultimate
goal of identifying the "psycho-visual" features, which
are largely correlated with the photo-interpreters' attention
mechanisms. In other words we introduce a perceptual
dimension when we evaluate the performance of segmen-
tation algorithms in terms of "what the photo-interpreters
prefer and judge as relevant".
The organization of the paper is as follows. In Section 2 we
explain the framework of application and the motivation
for a segmentation evaluation methodology, where humans
are in the loop. The proposed methodology is detailed in
Section 3. Results of the selected features and their valida-
tion against the feedback received from photo-interpreters
are detailed in Section 5. Finally, Section 6 draws the
conclusions.
2 PROBLEM STATEMENT
Interpretation and annotation of aerial images is an im-
portant task in various military and non-military contexts.
We intend to establish what qualifies a segmentation
algorithm as an effective tool in the judgment of the
photo-interpreters. Since the algorithms are qualified according
to their goodness-of-segmentation features, we have collected
feedback from photo-interpreters in the form of their
subjective quality judgements.
We aim to assess segmentation algorithms with criteria that,
based on an understanding of the photo-experts' reasoning,
mimic the human judgment on the quality of an extracted
object. In other words, the similarity of the two objects, the
ground-truth object in the scene and the actual extracted
object, will be based on human similarity assessment. For
instance, human judgement is more sensitive to a false
sharp protuberance from an object, albeit small in pixel
count, than to that object's dilation, so that the simple total