Peter Doucette 
This paper presents a semi-automated clustering-based approach to road centerline vectorization from high resolution 
imagery that represents a variation of Kohonen's self-organizing map (SOM) algorithm (Kohonen, 1997). The 
neurobiologically inspired SOM algorithm belongs to the unsupervised learning (clustering) variety of artificial neural 
networks (ANN). While cluster analysis is often used as a form of unsupervised spectral classification, the interest here 
is to use cluster analysis to find and map spatial structure within spectrally classified road pixels. The spectral 
classification step is what renders the method semi-automated. Beyond this, we assume that no ancillary cartographic 
information is available, e.g., a DEM or coarse-resolution vectors. From a binary image of spectrally classified road
pixels, localized road centerline nodes are mapped via an iterative clustering method. The nodes self-organize according 
to the local density fluctuations of the input, and global structure is provided by linking cluster centers with a minimum 
spanning tree (MST) algorithm. The MST cost function incorporates global cluster statistics to deal with image noise. 
The subsequent tree structure is densified and pruned locally as needed, and candidate loops are closed where 
appropriate. Finally, nodes are ordered for topological consistency, and a vectorized road network is available for direct 
GIS database entry. Since clustering exemplifies a local center-of-gravity approach, it is not sensitive to the edge definition of road features, and it allows for more meaningful exploitation of multispectral imagery than edge-based methods.
2 METHODS FOR SELF-ORGANIZATION 
Cluster analysis is a basic method of self-organization. It describes a regression solution that optimally partitions an m-dimensional input space containing a set of N vectors, $P = \{X_1, X_2, \ldots, X_N\}$, where $X_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,m}]^T \in \Re^m$, into a representative set of K cluster centers, or "codebook" vectors, $C = \{W_1, W_2, \ldots, W_K\}$, where $W_j \in \Re^m$. Each $W_j$ represents a single "codeword" in the codebook, and in general $K \ll N$, where K must be prespecified. Cluster analysis
is sometimes referred to as vector quantization (VQ). Usage of the latter term is generally associated with signal 
compression, and the former with pattern recognition. In the present context, we are interested in aspects of both. In 
(Linde et al., 1980) an important distinction is drawn between aspects of cluster analysis and VQ concerning the 
intended application. For example, one application may require the design of a vector codebook from a set of training 
vectors, which are subsequently fixed for use in quantizing (classifying) new vectors outside of the training set, e.g., test 
vectors. In contrast, another application may use the codebook only for purposes of pattern detection within the training 
set; i.e., the training set is the entire set. Our road extraction strategy is an example of the latter, where the 
dimensionality of the input vectors is m = 2 for (x, y) pixel coordinates. 
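To make these data structures concrete, a minimal sketch follows (numpy assumed; the binary image and variable names are hypothetical illustrations, not taken from the paper):

```python
import numpy as np

# Hypothetical binary image of spectrally classified road pixels (1 = road).
binary_image = np.zeros((100, 100), dtype=np.uint8)
binary_image[48:52, :] = 1  # a simple horizontal road

# Input set P: N vectors of dimensionality m = 2, the (x, y) pixel coordinates.
P = np.argwhere(binary_image).astype(float)  # shape (N, 2)

# Codebook C: K << N codeword vectors, initialized here by sampling from P.
K = 10
rng = np.random.default_rng(0)
C = P[rng.choice(len(P), size=K, replace=False)]  # shape (K, 2)
```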
2.1 Iterative Optimization 
Iterative optimization is the most widely used approach to cluster analysis. The basic idea is to iteratively adjust a set of 
initialized K codebook vectors in such a way as to globally improve a criterion function, e.g., sum-squared-error (SSE). 
The limitations associated with this "hill-climbing" technique include a guarantee of only local optimization (not 
global), and sensitivity to the initial quantity and location of the codebook vectors. Despite these shortcomings, the 
simplicity of implementation and computational tractability make iterative optimization a viable approach for many
applications, including the current one. We consider two familiar techniques: K-means and Kohonen learning. 
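For reference, the SSE criterion over the codebook and its induced partition $\{S_1, \ldots, S_K\}$ can be written as (a standard formulation, stated here for completeness):

$$E_{\mathrm{SSE}} = \sum_{j=1}^{K} \sum_{X_i \in S_j} \lVert X_i - W_j \rVert^2 .$$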
2.1.1 Batch Codebook Updating. The most recognizable clustering method by name is perhaps the K-means
algorithm (MacQueen, 1967). It is also identified as the generalized Lloyd algorithm (GLA) based on an earlier work by 
Lloyd, which was formally published in (Lloyd, 1982). For completeness, we list the steps of the K-means algorithm: 
1. Initialize K codebook vectors in the input space.
2. Partition the samples according to which codeword each sample is closest; ties are resolved arbitrarily.
3. For each partition, compute the mean of the samples assigned to it in step 2, and replace its codeword with that mean.
4. Iterate from step 2 until a stopping criterion is satisfied.
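The following is a minimal sketch of these steps (numpy assumed; the convergence tolerance and the handling of empty partitions are our own illustrative choices, not specified in the text):

```python
import numpy as np

def kmeans(P, C, max_iter=50, tol=1e-4):
    """Batch K-means: P is an (N, m) input set, C is a (K, m) initial codebook."""
    for _ in range(max_iter):
        # Step 2: assign each sample to its nearest codeword (L2 norm).
        d = np.linalg.norm(P[:, None, :] - C[None, :, :], axis=2)  # (N, K) distances
        labels = d.argmin(axis=1)
        # Step 3: recompute each codeword as the mean of its partition;
        # an empty partition keeps its previous codeword.
        C_new = np.array([P[labels == j].mean(axis=0) if np.any(labels == j)
                          else C[j] for j in range(len(C))])
        # Step 4: stop once the codebook no longer moves appreciably.
        if np.linalg.norm(C_new - C) < tol:
            return C_new, labels
        C = C_new
    return C, labels
```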
Though many different measures are cited in the literature, the $L_2$ (Euclidean) norm is a standard choice for the similarity measure of step 2. The $L_1$ norm is sometimes used to improve computational performance. K-means is
considered a batch method because the entire codebook is updated at each iteration, which differs from the sequential 
method of Kohonen discussed next. 
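To make the two similarity measures concrete, a small numpy sketch (the numbers are illustrative only):

```python
import numpy as np

X = np.array([3.0, 4.0])   # a sample vector
W = np.array([0.0, 0.0])   # a codeword
l2 = np.linalg.norm(X - W)   # Euclidean (L2) distance: 5.0
l1 = np.abs(X - W).sum()     # L1 (city-block) distance: 7.0; avoids squaring and sqrt
```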
2.1.2 The Self-Organizing Map (SOM) is the artificial neural network based on competitive learning developed in (Kohonen, 1982). It is distinct from K-means in three respects: 1) codewords are updated sequentially (as opposed to in batch), 2) codewords are spatially ordered in the input space according to a predefined topology in an independent network space, and 3) a smoothing kernel and a learning-rate schedule are used during codeword updating. A primary goal of the SOM is to define a mapping from an input space $\Re^m$ onto a network space $\Re^d$, where $m \geq d$, and typically $d \leq 2$.
Fig. 1 illustrates the topological ordering of a simple one-dimensional SOM network with four neurons, and a two-dimensional input space.
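As a minimal sketch of sequential SOM updating for a one-dimensional chain of neurons (numpy assumed; the linear decay schedules and Gaussian neighborhood kernel are common choices rather than the paper's exact settings):

```python
import numpy as np

def som_1d(P, K=20, n_epochs=10, lr0=0.5, sigma0=3.0, seed=0):
    """Sequential 1-D SOM: maps m-dimensional inputs onto a chain of K neurons."""
    rng = np.random.default_rng(seed)
    W = P[rng.choice(len(P), size=K, replace=False)].astype(float)  # codebook
    idx = np.arange(K)              # fixed neuron positions in network space
    n_steps = n_epochs * len(P)
    t = 0
    for _ in range(n_epochs):
        for X in P[rng.permutation(len(P))]:       # present samples sequentially
            lr = lr0 * (1.0 - t / n_steps)         # learning-rate schedule
            sigma = max(sigma0 * (1.0 - t / n_steps), 0.5)  # shrinking neighborhood
            winner = np.linalg.norm(W - X, axis=1).argmin() # competitive step
            # Gaussian smoothing kernel over network-space distance to the winner:
            h = np.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))
            W += lr * h[:, None] * (X - W)  # pull winner and its neighbors toward X
            t += 1
    return W
```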
  