Both the measurement of urban areas using modified image
classifications, and the modelling of urban areas using fractal-
based analysis, will be applied to Norwich, a medium-sized
settlement in the United Kingdom, approximately at the time of
the 1991 UK Population Census.
MEASUREMENT OF URBAN REMOTELY-SENSED DATA
In this paper, measurement will relate to image classification, in
particular the conventional maximum-likelihood (ML)
algorithm, its Bayesian modification, and links to GIS data.
Conventional Maximum-Likelihood Classification
As a parametric classifier, the ML algorithm relies on each
training sample being represented by a Gaussian probability
density function, completely described by the mean vector and
variance-covariance matrix using all available spectral bands.
Given these parameters, it is possible to compute the statistical
probability of a pixel vector being a member of each spectral
class (Thomas et al, 1987). The goal is to assign the most
likely class $w_i$, from a set of $N$ classes $w_1, \ldots, w_N$, to any
feature vector $X$ in the image. A feature vector $X$ is the vector
$(x_1, x_2, \ldots, x_M)$, composed of pixel values in $M$ features (in
most cases, spectral bands). The most likely class $w_i$ for a
given feature vector $X$ is the one with the highest posterior
probability $\Pr(w_i|X)$. Therefore, all $\Pr(w_i|X)$, $i \in [1 \ldots N]$, are
calculated, and the $w_i$ with the highest value is selected. The
calculation of $\Pr(w_i|X)$ is based on Bayes' formula,
$$\Pr(w_i|X) = \frac{\Pr(X|w_i) \times \Pr(w_i)}{\Pr(X)} \qquad (1)$$
On the left-hand side is the a posteriori probability that a pixel
with feature vector $X$ should be classified as belonging to class
$w_i$. On the right-hand side,
$\Pr(X|w_i)$ is the conditional probability that some feature vector
$X$ occurs in a given class, in other words, the probability
density of $w_i$ as a function of $X$. Supervised classifications,
such as the ML, derive this information from training samples.
Often, this is done parametrically by assuming normal class
probability densities and estimating the mean vector and
covariance matrix. Also in the numerator, coupled with the
conditional probability, is what is known in Bayes' formula as
the prior probability of $w_i$, shown as $\Pr(w_i)$. This is the a priori
probability of the occurrence of $w_i$ irrespective of its feature
vector, and as such is open to estimation by prior knowledge
external to the remotely-sensed image. External prior
knowledge will typically include information on the
distribution and relative areas covered by each class in the study
scene and is most readily generated from GIS data. It follows
that the accuracy of class priors is at best equal to the quality of
GIS prior knowledge. In image classification terms, prior
probabilities can be visualized as a means of shifting decision
boundaries to produce larger volumes in M-dimensional feature
space for classes that are expected to be large and smaller
volumes for classes that are expected to be small (Mather,
1985). The denominator in (1), $\Pr(X)$, is the unconditional
probability density, which is used to normalise the numerator
such that
$$\Pr(X) = \sum_{i=1}^{N} \Pr(X|w_i) \times \Pr(w_i) \qquad (2)$$
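Equations (1) and (2) can be illustrated with a minimal sketch, assuming Gaussian class densities as described above. The class means, covariance matrices and the pixel vector are invented for illustration and are not taken from the Norwich data; the function names are likewise hypothetical.

```python
import numpy as np

def gaussian_log_density(x, mean, cov):
    """Log of the multivariate normal density Pr(X|w_i) for one feature vector."""
    m = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)          # log of the covariance determinant
    maha = diff @ np.linalg.solve(cov, diff)    # squared Mahalanobis distance
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + maha)

def posteriors(x, means, covs, priors):
    """Posterior Pr(w_i|X) for every class, following equations (1) and (2)."""
    log_num = np.array([
        gaussian_log_density(x, mu, cov) + np.log(p)
        for mu, cov, p in zip(means, covs, priors)
    ])
    log_num -= log_num.max()     # rescale for numerical stability; ratios are unchanged
    num = np.exp(log_num)
    return num / num.sum()       # dividing by the sum plays the role of Pr(X) in (2)

# Hypothetical two-band statistics for three spectral classes (illustrative only)
means = [np.array([30.0, 40.0]), np.array([60.0, 55.0]), np.array([90.0, 20.0])]
covs = [np.eye(2) * 25.0, np.eye(2) * 36.0, np.eye(2) * 16.0]
priors = np.array([1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0])

x = np.array([58.0, 52.0])           # one pixel's feature vector
post = posteriors(x, means, covs, priors)
label = int(np.argmax(post))         # index of the most likely class
```

Working in logarithms is a design choice of this sketch rather than part of the formulation: it avoids underflow when a pixel lies far from a class mean, and the final normalisation returns the same posteriors as a direct evaluation of (1).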
Normally, ML classifiers assume prior probabilities to be equal
and assign each $\Pr(w_i)$ a value of 1.0. However, it would seem
intuitively more sensible to suggest that some classes are more
likely to occur than others. By taking account of extraneous
information on the areal properties of each spectral class it will
be possible to generate thematic per-pixel classifications that
are more accurate than those produced from conventional ML
techniques (Barnsley et al, 1989; Maselli et al, 1992). The
paper will now examine how prior probabilities may be
modified to incorporate external GIS information on class area
estimates.
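The effect of unequal priors on the decision boundary can be seen in a small one-dimensional sketch. It assumes two Gaussian classes with equal spread; all values, and the function names, are hypothetical and chosen only to make the shift visible.

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * np.log(2.0 * np.pi * std ** 2) - 0.5 * ((x - mean) / std) ** 2

def classify(x, means, stds, priors):
    """Return the index of the class with the largest Pr(X|w_i) x Pr(w_i)."""
    scores = [
        log_normal_pdf(x, m, s) + np.log(p)
        for m, s, p in zip(means, stds, priors)
    ]
    return int(np.argmax(scores))

# Two hypothetical classes in a single band (illustrative values only)
means = [40.0, 60.0]
stds = [10.0, 10.0]
x = 51.0   # pixel value slightly closer to the mean of class 1

equal = classify(x, means, stds, [0.5, 0.5])      # equal priors
constant = classify(x, means, stds, [1.0, 1.0])   # every prior set to 1.0
weighted = classify(x, means, stds, [0.8, 0.2])   # class 0 expected to be larger
```

With equal priors, or with every prior set to the same constant, the pixel is assigned to class 1 because only the class densities differ. With priors of 0.8 and 0.2 the boundary moves from 50 to roughly 56.9 in this example, so the same pixel is assigned to class 0, the class expected to cover the larger area.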
Modification of Prior Probabilities
Before we examine precisely how prior probabilities may be
modified, it is important to stress from the outset that our
modifications can only be conducted within the more general
framework of GIS/RS integration. This requires a systematic
strategy which can co-ordinate the flow and coupling of GIS
data within image classification procedures. In the worked
example, prior probabilities will be modified using a
hierarchical stratification strategy based upon data from the
United Kingdom Population Census. The stratification will
essentially allow census data to assist in the selection and
hierarchical partition of spatial features from a satellite image.
This hierarchical partition is critical to the statistical
assumptions of ML prior probabilities, the most
important of which is that all multi-dimensional feature space is
subdivided between weighted classes. In other words, for prior
probabilities to function most efficiently they need to be
applied to inclusive feature space but mutually-exclusive
classes. This essentially means that for the classification of
mutually-exclusive residential dwelling classes, an image must
only be composed of residential feature space. Census tract
data have already been shown to be amenable to the generation
of pseudo-surfaces of urban representations, especially
residential surfaces (Martin and Bracken, 1991) from which
such stratification is possible. These surfaces have been used
by Mesev (1995) to enhance per-pixel classifications through
training sample selection and post-classification sorting. The
result is that satellite images have been routinely segmented
into “urban” and “non-urban”, as well as “residential urban”
and “non-residential urban” (Mesev et al, 1995). Using the
“residential urban” category it will now be shown how prior
probabilities of the surrogate residential density categories,
“detached”, “semi-detached”, “terraced”, and “apartment”
blocks, may be generated by census data and then inserted into
the ML classifier.
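One simple way of turning census information into class priors is sketched below. It assumes that each prior is proportional to the census count of dwellings of that type within the residential stratum; this proportionality is an assumption of the sketch, not necessarily the procedure used in the worked example, and the counts are invented rather than taken from the 1991 Census.

```python
def priors_from_counts(counts):
    """Normalise per-class census counts so that they sum to one."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("census counts must sum to a positive value")
    return {name: value / total for name, value in counts.items()}

# Hypothetical dwelling counts for the residential stratum (not real census data)
census_counts = {
    "detached": 1200,
    "semi-detached": 2400,
    "terraced": 3000,
    "apartments": 1400,
}

priors = priors_from_counts(census_counts)
# priors["terraced"] is 3000 / 8000 = 0.375
```

Because the stratum contains only residential feature space, the four priors sum to one across mutually exclusive classes, which is the condition the stratification is intended to satisfy.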
Consider $z_k$ as the census variable, “residential building type”
(where k: 1 = “detached”, 2 = “semi-detached”, 3 = “terraced”,
and 4 = “apartments”). When stratified into exclusively
residential feature space, the four classes will have A pixels
with feature values $X_k$, where $X_1, \ldots, X_4$ are not necessarily
mutually exclusive. The objective is to find the probability that
a random pixel (within the “residential” stratum of the image)
will be a member of a spectral class $w_i$ (where i: 1 = detached, 2
= semi-detached, 3 = terraced, 4 = apartments), given its density
vector of observed measurements $X$ in $M$-dimensional feature
space and that it belongs to ancillary class $z_k$, described as
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B4. Vienna 1996