3.3.1 Multi-scale Image Segmentation: To perform multi-scale
image segmentation, an image pyramid must be built first.
The original image lies at the bottom of the pyramid and
corresponds to the highest (finest) scale. The image at the next
lower resolution is generated by filtering out the high-frequency
information of the input image. The process repeats until
the lowest (coarsest) scale image is generated. In this way, a
series of images at different scales is obtained to represent the
original image. The wavelet transform, a widely used method for
generating image pyramids, is adopted in this paper. After the
image pyramid is built, segmentation is performed to obtain a
partition at each scale. Among the many segmentation algorithms
that have been proposed, we choose the watershed transform
for its simplicity and efficiency.
Since the watershed transform tends to produce over-segmentation
(Roerdink and Meijster, 2000), geodesic
reconstruction is applied to alleviate it. The watershed
transform partitions an image into disjoint
regions. At a given scale, a region adjacency graph (RAG)
is constructed to express the spatial relationship of the segmented
regions. Each node in the RAG represents a segmented region. If
two regions are adjacent to each other, an edge is added to the
RAG to link them.
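The pyramid and the RAG described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the low-pass step is plain 2x2 block averaging (which equals the Haar approximation band up to a constant factor), adjacency is taken as 4-connectivity, the watershed step itself is omitted, and the function names `lowpass_level`, `build_pyramid` and `build_rag` are hypothetical.

```python
def lowpass_level(img):
    """Halve the resolution by averaging each 2x2 block.

    Assumption: a Haar-like low-pass filter; any trailing odd row or
    column is dropped.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1]
              + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(cols)]
            for i in range(rows)]


def build_pyramid(img, levels):
    """Return [finest, ..., coarsest]; index 0 is the original image."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(lowpass_level(pyramid[-1]))
    return pyramid


def build_rag(labels):
    """Region adjacency graph of a label image (4-connectivity).

    Returns a dict mapping each region label to the set of labels of
    the regions that share a pixel edge with it.
    """
    rag = {}
    rows, cols = len(labels), len(labels[0])
    for i in range(rows):
        for j in range(cols):
            a = labels[i][j]
            rag.setdefault(a, set())
            # Looking right and down visits every pixel edge exactly once.
            for di, dj in ((0, 1), (1, 0)):
                ni, nj = i + di, j + dj
                if ni < rows and nj < cols and labels[ni][nj] != a:
                    b = labels[ni][nj]
                    rag[a].add(b)
                    rag.setdefault(b, set()).add(a)
    return rag
```

In the full method a label image would be produced by the watershed transform at every pyramid level, and `build_rag` would be called once per level.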
3.3.2 Region Feature Representation and Descriptor:
The initial segmentation results contain abundant information, such as
region shape, size and context. Hence, it is
important to form a discriminative feature vector to represent
the extracted region information.
Although the mean and variance of the pixels in a
region are the most commonly used features in the literature on
region-based MRF models (Katartzis et al., 2005), they do not
capture region context or shape. Because
region context and shape complement the spectral
information, they are used to form the region site feature vector,
which is defined as follows:
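$$ y_s = \left[\, \rho_{R_s} - \rho_{R_s}\ln(\rho_{R_s}),\;\; \frac{1}{|N_s|}\sum_{r \in N_s}\left[\rho_r - \rho_r\ln(\rho_r)\right] \right] \qquad (8) $$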
Here $R_s$ is the region that encloses a given site $s$, $\rho_{R_s}$ is the
shape index that describes the shape of this region, $N_s$ is the set of
regions neighbouring $R_s$, and $\rho_r$ is the shape index of a region $r$
adjacent to $R_s$. The shape index $\rho_{R_s}$ is defined as follows:
$$ \rho_{R_s} = \frac{S_{R_s}}{S_{\mathrm{MER}}} \qquad (9) $$
where $S_{R_s}$ is the area of region $R_s$ and $S_{\mathrm{MER}}$ is the area of the
minimum enclosing rectangle of $R_s$. The closer the shape of a region
is to a rectangle, the closer the value of $\rho_{R_s}$ is to
one.
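As an illustration of the shape index, the sketch below computes the ratio of region area to enclosing-rectangle area on a label image. It makes a simplifying assumption that is not stated in the text: the minimum enclosing rectangle is approximated by the axis-aligned bounding box, so rotated rectangular regions would score lower than with a true minimum-area rectangle. The function name `shape_index` is hypothetical.

```python
def shape_index(labels, region):
    """Region area divided by the area of its axis-aligned bounding box.

    Assumption: the bounding box stands in for the minimum enclosing
    rectangle. Areas are counted in pixels.
    """
    cells = [(i, j)
             for i, row in enumerate(labels)
             for j, value in enumerate(row)
             if value == region]
    if not cells:
        raise ValueError("region not present in label image")
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(cells) / box_area


labels = [[1, 1, 2],
          [1, 1, 2],
          [3, 2, 2]]
square = shape_index(labels, 1)    # 4 pixels in a 2x2 box
l_shaped = shape_index(labels, 2)  # 4 pixels in a 3x2 box
```

The square region fills its box completely, while the L-shaped region covers only part of it, so its index is lower.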
Regions that represent a given object class tend to have a
characteristic shape. For instance, most regions located on
houses in the input image are close to rectangular, while
regions extracted over forest or grassland are
irregular. The shape index can therefore be used as complementary
information to discriminate between land cover classes. The
average shape index of the neighbouring regions
reflects the region context. The second term in equation (7) takes
into account the shape indices of the neighbouring
regions and thus carries the contextual information of
those regions.
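A minimal sketch of how a two-component region site feature could be assembled from the shape indices and the RAG. It assumes that each shape index $\rho$ is mapped through $\rho - \rho\ln\rho$ and that the second component is the average of that quantity over the neighbouring regions. The names `transformed` and `region_feature`, and the fallback used for a region without neighbours, are hypothetical.

```python
import math


def transformed(rho):
    """Assumed mapping rho - rho * ln(rho); equals 1 when rho == 1."""
    return rho - rho * math.log(rho)


def region_feature(region, rho, rag):
    """Return (own term, neighbour average) for one region.

    rho maps region label -> shape index in (0, 1];
    rag maps region label -> set of neighbouring labels.
    """
    own = transformed(rho[region])
    neighbours = rag[region]
    if not neighbours:
        # Hypothetical fallback: an isolated region uses its own term as context.
        return (own, own)
    context = sum(transformed(rho[r]) for r in neighbours) / len(neighbours)
    return (own, context)


rho = {1: 1.0, 2: 0.5, 3: 1.0}
rag = {1: {2, 3}, 2: {1}, 3: {1}}
feature = region_feature(1, rho, rag)
```

Here region 1 is a perfect rectangle, so its own term is 1, while its context term is pulled below 1 by the irregular neighbour 2.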
3.3.3 Algorithm of the UMSRF model: In practice, the
proposed model first applies the wavelet
transform to build the image pyramid and then segments the image
at each scale with
the watershed transform. The low-frequency band
coefficients of the wavelet decomposition serve as the pixel-level
observations from which the pixel likelihood is computed. The region-level
observations are obtained by computing the shape index of each
region, which yields the region-level likelihood. Finally, the
classification result is obtained with the UMSRF model. The
algorithm is as follows:
1. Perform the wavelet transform to generate the image pyramid
2. Execute the watershed transform to obtain the segmentation
results at each scale
3. Build the RAG and extract the region shape features
using equation (8)
4. Compute the pixel likelihood at each scale, bottom-up,
using equation (5)
5. Estimate the classification result at each scale, top-down,
using equation (7)
The parameters of the pixel and region observation fields
are estimated by the maximum likelihood (ML) algorithm from training data. The transition
probabilities between scales are estimated by the EM algorithm, as
in (Bouman and Shapiro, 1994).
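To make the ML step concrete, the sketch below estimates per-class parameters from labelled training values. It assumes a one-dimensional Gaussian observation model for each class, which the text does not specify; the function name `ml_gaussian_params` and the sample values are hypothetical and are not taken from the experiments.

```python
def ml_gaussian_params(samples_by_class):
    """Per-class maximum likelihood estimates (mean, variance).

    Assumption: each class follows a 1-D Gaussian. The ML variance
    divides by n rather than n - 1.
    """
    params = {}
    for cls, values in samples_by_class.items():
        n = len(values)
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        params[cls] = (mean, var)
    return params


# Made-up training values, for illustration only.
training = {"building": [2.0, 4.0, 6.0], "road": [1.0, 1.0]}
params = ml_gaussian_params(training)
```

The same routine could be applied to pixel-level observations (low-frequency wavelet coefficients) and to region-level observations (shape features), one call per observation field.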
4. EXPERIMENTAL RESULTS
This section describes the experimental setup and results. The
proposed method was tested on aerial images with a spatial
resolution of 0.4 m and three bands (red, green and blue).
The images show an urban environment containing many
man-made objects, such as buildings and roads. The aim is to verify
whether the extracted region shape information improves
classification performance.
The aerial image of the study area (Taizhou, China) is shown in
Fig.1. We assume that the image contains three classes:
building, vegetation and road. The initial multi-scale
segmentation result is shown in Fig.2. The
segmentation result shows that most of the regions corresponding
to buildings have a regular shape, which is used to improve
classification performance. Fig.3 and Fig.4 give the
classification results of MSRF and UMSRF respectively.
Vegetation is shown in red, building in
green and road in blue. Comparing the
classification results of MSRF and UMSRF qualitatively, we
find that UMSRF performs better than MSRF to some extent. For
example, some building pixels in the middle of the image are
misclassified as road by MSRF, since the spectral signal of these pixels is
more similar to that of road than to that of building. Because MSRF considers only
the spectral signal of each pixel, they are assigned to road instead of
building. This error is corrected in the result of UMSRF, as
shown in Fig.4. By considering region shape information,
we can alleviate the difficulty that results from the internal spectral
variability within a class in high spatial resolution
images.
We also evaluate the results of MSRF and UMSRF
quantitatively. The confusion matrices of MSRF and UMSRF are
given in Table I and Table II respectively. Compared with MSRF,
UMSRF increases the classification accuracy for building by 10 percent.
However, the classification
accuracy for vegetation and road decreases by 6 percent and 1
percent. The decrease in accuracy for road and vegetation shows
that UMSRF is not superior to MSRF when the shape of
a region is irregular. This limitation of UMSRF reminds us that