XXII ISPRS Congress 2012: Technical Commission III

3.3.1 Multi-scale Image Segmentation: To perform multi-scale
image segmentation, we first need to build an image pyramid.
The original image is at the bottom of the pyramid and
corresponds to the highest (finest) scale. The image at the next
lower resolution is generated by filtering out the high-frequency
information of the input image. The process repeats until
the lowest (coarsest) scale image is generated. In this way, a
series of images at different scales is obtained to represent the
original image. The wavelet transform, a widely used method for
generating image pyramids, is adopted in this paper. After the
image pyramid is built, segmentation is performed to obtain a
partition at each scale. Many segmentation algorithms
have been proposed in the past; we choose the watershed transform
for its simplicity and efficiency.
Since the watershed transform tends to produce over-segmentation
(Roerdink and Meijster, 2000), geodesic
reconstruction is applied to alleviate it. As a
result of the watershed transform, the image is partitioned into
disjoint regions. At a given scale, a region adjacency graph (RAG)
is constructed to express the spatial relationships of the segmented
regions. Each node in the RAG represents a segmented region. If
two regions are adjacent to each other, an edge is added to the RAG
to link them.
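The RAG construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the segmentation is available as a 2-D grid of integer region labels and that two regions count as adjacent when they share a 4-connected pixel border. The function name `build_rag` and the toy label grid are hypothetical.

```python
def build_rag(labels):
    """Return {region_id: set of adjacent region ids} for a 2-D label grid.

    Assumption: regions are adjacent if they share a 4-connected pixel border.
    """
    rows, cols = len(labels), len(labels[0])
    rag = {}
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            rag.setdefault(a, set())
            # Look right and down only, so each neighbouring pixel pair
            # is examined exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if b != a:
                        rag[a].add(b)
                        rag.setdefault(b, set()).add(a)
    return rag


# Toy segmentation with four regions.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
]
rag = build_rag(labels)
print(rag)
```

In this toy grid, regions 1 and 4 touch only at a corner, so under the 4-connectivity assumption no edge links them.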
3.3.2 Region Feature Representation and Descriptor: 
The initial segmentation results contain abundant information, such as
region shape, size and context. Hence, it is
important to form a discriminative feature vector to represent
the extracted region information.
Although the mean and variance of the pixels in a
region are the most commonly used features in the literature on
region-based MRF models (Katartzis et al., 2005), they do not
capture region context or shape information. Because
region context and shape can complement the spectral
information, they are used to form the region site feature vector,
which is defined as follows:
$$ y_s = \left[ \rho_{R_s},\ \frac{1}{|N_s|} \sum_{r \in N_s} \rho_{R_r} \right] \qquad (8) $$
where $R_s$ is the region that encloses a given site $s$, $\rho_{R_s}$ is the shape
index of that region, $N_s$ is the set of regions neighbouring $R_s$, and
$\rho_{R_r}$ is the shape index of a region adjacent to $R_s$. The shape
index $\rho_{R_s}$ is defined as follows:
$$ \rho_{R_s} = \frac{S_{R_s}}{S_{MER}} \qquad (9) $$
where $S_{R_s}$ is the area of region $R_s$ and $S_{MER}$ is the area of the minimum
enclosing rectangle of region $R_s$. The closer the shape of a region is
to a rectangle, the closer the value of $\rho_{R_s}$ is to one.
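As an illustration of the shape index, the sketch below computes the ratio of region area to enclosing-rectangle area on a label grid. It makes a simplifying assumption that is not in the paper: the minimum enclosing rectangle is approximated by the axis-aligned bounding box, whereas a true minimum enclosing rectangle may be rotated. The function name `shape_index` and the toy grid are hypothetical.

```python
def shape_index(labels, region_id):
    """Area of a region divided by the area of its bounding box.

    Assumption: the axis-aligned bounding box stands in for the minimum
    enclosing rectangle, which in general may be rotated.
    """
    pixels = [
        (r, c)
        for r, row in enumerate(labels)
        for c, value in enumerate(row)
        if value == region_id
    ]
    if not pixels:
        raise ValueError("region %r not found" % (region_id,))
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(pixels) / box_area


labels = [
    [1, 1, 2],
    [1, 1, 2],
    [3, 2, 2],
]
square = shape_index(labels, 1)    # 2x2 block fills its bounding box
l_shaped = shape_index(labels, 2)  # 4 pixels inside a 3x2 bounding box
print(square, l_shaped)
```

The square region gives an index of one, while the L-shaped region gives a lower value, which matches the intended behaviour of the index.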
Regions that represent a given object class tend to have a characteristic
shape. For instance, most regions located at house sites in
the input image are close to rectangular, while the
regions extracted over forest or grassland are
irregular in shape. The shape index can therefore be used as complementary
information to discriminate between land cover classes. The
average shape index of the neighbouring regions
reflects the region context. The second term in equation (8) takes
into account the shape index of the neighbouring
regions and thus captures their contextual
information.
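The idea of pairing a region's own shape index with the average shape index of its neighbours can be sketched as follows. This is only one reading of the feature vector: it takes the contextual term to be the plain mean over adjacent regions, as the paragraph above describes, and the exact form of equation (8) may differ. The function name `region_feature`, the fallback for isolated regions and the example values are assumptions for illustration.

```python
def region_feature(region_id, shape_idx, rag):
    """Return (own shape index, mean shape index of adjacent regions).

    shape_idx: {region_id: shape index}
    rag:       {region_id: set of adjacent region ids}
    Assumption: a region without neighbours reuses its own index as context.
    """
    own = shape_idx[region_id]
    neighbours = rag.get(region_id, set())
    if not neighbours:
        return (own, own)
    context = sum(shape_idx[n] for n in neighbours) / len(neighbours)
    return (own, context)


# Hypothetical shape indices and adjacency for three regions.
shape_idx = {1: 1.0, 2: 0.5, 3: 0.75}
rag = {1: {2, 3}, 2: {1}, 3: {1}}

feature_1 = region_feature(1, shape_idx, rag)
feature_2 = region_feature(2, shape_idx, rag)
print(feature_1, feature_2)
```

Here region 1 is rectangular but surrounded by less regular regions, so its contextual term is lower than its own index.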
3.3.3 Algorithm of the UMSRF model: In practice, the
proposed model first applies the wavelet
transform to build the image pyramid and then segments the
image at each scale using
the watershed transform. The low-frequency band coefficients
of the wavelet decomposition are used as pixel-level
observations to compute the pixel likelihood. The region-level
observation is obtained by computing the shape index of each
region, from which the region-level likelihood is derived. Finally, the
classification result is obtained by the UMSRF model. The
algorithm is as follows:
1. Perform the wavelet transform to generate the image pyramid
2. Execute the watershed transform to get segmentation
results at each scale
3. Build the RAG and extract the region shape features to
represent region shape information using equation (8)
4. Compute the pixel likelihood at each scale, bottom-up,
using equation (5)
5. Estimate the classification result at each scale, top-down,
using equation (7)
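Step 1 of the list above can be illustrated with a very small pyramid builder. The paper does not say which wavelet is used; this sketch assumes a Haar-style low-pass step, in which each coarser pixel is the mean of a 2x2 block (proportional to the Haar low-frequency band). The function names `downsample_haar` and `build_pyramid` and the toy image are hypothetical.

```python
def downsample_haar(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks.

    Assumption: a Haar-style low-pass filter; the 2x2 mean is proportional
    to the Haar low-frequency (approximation) coefficients.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [finest, ..., coarsest]; the original image is level 0."""
    pyramid = [image]
    for _ in range(levels - 1):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break  # cannot halve a single row or column any further
        pyramid.append(downsample_haar(current))
    return pyramid


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
pyramid = build_pyramid(image, levels=3)
for level in pyramid:
    print(level)
```

Each level of such a pyramid would then be segmented and classified as in steps 2 to 5.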
The parameters of the pixel and region observation fields
are estimated by the maximum likelihood (ML) algorithm from training data. The transition
probabilities between scales are estimated by the EM algorithm, as
in (Bouman and Shapiro, 1994).
4. EXPERIMENTAL RESULTS 
This section describes the experimental setup of this study. The
proposed method was tested on aerial images with a spatial
resolution of 0.4 m and 3 bands (red, green and blue).
The images show an urban environment containing many
man-made objects, such as buildings and roads. We want to verify
whether the extracted region shape information can improve
classification performance.
The aerial image of the study area (Taizhou, China) is shown in
Fig. 1. We assume that there are 3 classes in the image:
building, vegetation and road. The initial multi-scale
segmentation result is shown in Fig. 2. The
segmentation result shows that most of the regions corresponding
to buildings have a regular shape, which will be used to improve
classification performance. Fig. 3 and Fig. 4 give the
classification results of MSRF and UMSRF respectively.
Vegetation is shown in red, building in
green and road in blue. Comparing the
classification results of MSRF and UMSRF qualitatively, we
find that UMSRF performs better than MSRF to some extent. For
example, in the MSRF result some building pixels in the middle of the image are
misclassified as road, because the spectral signal of these pixels is
more similar to road than to building and MSRF considers only
the spectral signal of each pixel.
This error is corrected in the UMSRF result
shown in Fig. 4. By considering region shape information,
we can alleviate the difficulty that results from the internal spectral
variability within a class in high spatial resolution
images.
We also evaluate the results of MSRF and UMSRF
quantitatively. The confusion matrices of MSRF and UMSRF are
given in Table I and Table II respectively. Compared with MSRF,
UMSRF increases the classification accuracy for building by 10 percent.
The classification
accuracy for vegetation and road, however, decreases by 6 percent and 1
percent respectively. This decrease shows
that UMSRF is not superior to MSRF when the shape of the
regions is irregular. This limitation of UMSRF reminds us that