coordinates and intensity. This assumption seems to apply to all kinds of circumstances. In this paper, we attempt to use the self-organizing map (SOM) (Kohonen, 2001), an unsupervised clustering technique, to classify the points within a point cloud. We adopt the SOM training algorithm to group all points into different categories according to their xyz coordinates and intensity. Through the trained SOM, a two-dimensional grid of neurons, the similarity of points can be interactively explored and visualized. Thus we are able to distinguish points belonging to different objects.
As a well-developed technique, SOM has found many 
applications in various fields such as data classification, pattern 
recognition, image analysis, and exploratory data analysis (for 
an overview, see Oja and Kaski 1999). In the domain of GIS, 
Openshaw and his colleagues have used the approach in spatial 
data analysis to carry out the classification of census data (Openshaw 1994, Openshaw et al. 1995). It has been applied to cartographic generalization for building typification (e.g. Hojholt 1995), street selection (Jiang and Harrie 2004), and line simplification (Jiang and Nakos 2003). All these studies rely on the SOM's ability in data clustering and pattern recognition. This paper looks at how it can be used for filtering laser-scanning data in deriving spatial object models from laser-scanning datasets. The remainder of this paper is structured as follows. Section 2 introduces the basic principle and algorithm of SOM. Section 3 presents a SOM-based approach for deriving different clusters within a laser-scanned point cloud. Section 4 presents a case study for validation of the approach. Finally, section 5 concludes the paper and points out future work.

2. SELF-ORGANIZING MAP

SOM is a well-developed neural network technique for data clustering and visualization. It can be used to project a large data set of high dimension into a low dimension (usually one or two dimensions) while retaining the initial pattern of the dataset. That is, data samples that are close to each other in the input space remain close to each other in the low-dimensional space. In this sense, SOM resembles a geographic map concerning the distribution of phenomena, in particular with reference to the first law of geography: everything is related to everything else, but near things are more related than distant things (Tobler 1970). Herewith we provide a brief introduction to the SOM; readers are encouraged to refer to more complete descriptions in the literature (e.g. Kohonen 2001).
2.1 Basic principle 
Let us represent a d-dimensional dataset as a set of input vectors of d dimensions, i.e. X = {x_1, x_2, ..., x_n}, where n is the size of the dataset or, equally, the number of input vectors. The SOM training algorithm involves essentially two processes, namely vector quantization and vector projection (Vesanto 1999). Vector quantization creates a representative set of vectors, the so-called output vectors, from the input vectors. Let us denote the output vectors as M = {m_1, m_2, ..., m_k}, with the same dimension as the input vectors. In general, vector quantization reduces the number of vectors, and this can be considered a clustering process. The other process, vector projection, aims at projecting the k output vectors (in d-dimensional space) onto a regular tessellation (i.e. a SOM) of lower dimension, where the regular tessellation consists of k neurons. In the vector projection, each output vector is projected onto a neuron, and the projection is performed such that output vectors that are "close" in d-dimensional space are projected onto neighbouring neurons in the SOM. This ensures that the initial pattern of the input data is preserved in the neurons.
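Concretely, the two processes operate on three arrays: the n input vectors, the k output vectors, and the fixed lattice coordinates of the k neurons. The following minimal sketch shows these structures in NumPy; the 10 x 10 grid, the sizes, and the variable names are our illustrative assumptions, not specifications from the paper.

import numpy as np

# n input vectors and k output vectors, both d-dimensional, plus a
# fixed 2-D lattice coordinate for each of the k neurons.
n, k, d = 10000, 100, 4             # d = 4, e.g. x, y, z, intensity
rng = np.random.default_rng(0)

X = rng.random((n, d))              # input vectors
M = rng.random((k, d))              # output vectors ("codebook"), to be trained

# Vector projection: neuron i of the 10 x 10 SOM sits at grid[i]; training
# arranges M so that rows of M that are close in d-space end up at
# neighbouring grid coordinates.
grid = np.indices((10, 10)).reshape(2, -1).T.astype(float)   # shape (k, 2)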
The two tasks are illustrated in Figure 1, where both input and output vectors are represented in a table format, with columns as dimensions and rows as vector IDs. Usually the number of input vectors is greater than the number of output vectors, i.e. n > k, and the size of the SOM is, without exception, the same as the number of output vectors. In the figure, the SOM is represented by a transitional colour scale, which implies that similar neurons lie together. It should be emphasized that we separate the algorithm into two tasks only for the sake of an intuitive explanation; in the SOM they are actually carried out together rather than one after the other.
[Figure: Input vectors (d-dimension) | Output vectors (d-dimension) | SOM (2-dimension)]
Figure 1: Illustration of SOM principle
2.2 The algorithm 
The above two steps, vector quantization and vector projection, 
constitute the basis of the SOM algorithm. Vector quantization 
is performed as follows. First, the output vectors are initialized, randomly or linearly, with some values for their variables. Then, in each training step, one sample vector x is randomly chosen from the input vectors, and the distance between it and all the output vectors is calculated. The output vector closest to the input vector x is called the Best-Matching Unit (BMU), denoted by m_c:

||x - m_c|| = min_i ||x - m_i||    [1]

where ||·|| is the distance measure.
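As a sketch in NumPy (the paper gives no code, and the function name is ours), the BMU search of equation [1] is an argmin over the distances to all output vectors:

import numpy as np

def best_matching_unit(x: np.ndarray, M: np.ndarray) -> int:
    """Equation [1]: index c of the output vector m_c closest to sample x
    (Euclidean distance assumed as the distance measure)."""
    distances = np.linalg.norm(M - x, axis=1)   # ||x - m_i|| for every i
    return int(np.argmin(distances))            # index c of the BMU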
Second, the BMU, or winning neuron, and the other output vectors in its neighbourhood are updated to move closer to x in the input vector space. The update rule for output vector m_i is:

m_i(t+1) = m_i(t) + α(t) h_ci(t) [x(t) - m_i(t)]    for i ∈ N_c(t)
m_i(t+1) = m_i(t)                                   for i ∉ N_c(t)    [2]

where x(t) is a sample vector randomly taken from the input vectors, m_i(t) is the output vector of any neuron i within the neighbourhood N_c(t), and α(t) and h_ci(t) are the learning rate function and the neighbourhood kernel function respectively.
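One application of update rule [2] might look like the sketch below. The Gaussian form of h_ci(t) is assumed here (the truncated part of this section mentions a Gaussian kernel, but its exact parameters are lost); with such a kernel, neurons far from the BMU on the lattice receive a negligible update, which plays the role of the neighbourhood N_c(t).

import numpy as np

def update_step(x, M, grid, c, alpha, sigma):
    """One step of update rule [2], as a sketch (not the authors' code).
    alpha is the learning rate a(t), sigma the assumed Gaussian width."""
    grid_dist2 = np.sum((grid - grid[c]) ** 2, axis=1)   # squared lattice distance to BMU
    h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))         # Gaussian kernel h_ci(t)
    M += alpha * h[:, None] * (x - M)                    # m_i(t+1) = m_i(t) + a h (x - m_i)
    return M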
The algorithm can be described step by step as follows.
Step 1: Define the input vectors, in particular the multiple variables that determine an attribute space.
The input vectors are likely to be in a table format, as shown in Figure 1, where d variables determine a d-dimensional attribute space.
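For the laser scanning application of this paper, Step 1 amounts to building an n x 4 table of x, y, z and intensity values, one row per point. In the sketch below, the function name and the min-max normalization are our assumptions; the normalization is added so that no single variable dominates the distance measure.

import numpy as np

def build_input_vectors(xyz: np.ndarray, intensity: np.ndarray) -> np.ndarray:
    """Assemble the (n, 4) input vectors from a point cloud's xyz
    coordinates and intensity, scaled to [0, 1] per column (assumed)."""
    X = np.column_stack([xyz, intensity]).astype(float)   # columns: x, y, z, I
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)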
   
[The remainder of this page is truncated in the source scan; only the left edge of the column survives. The legible fragments indicate the following structure: Step 2 defines the size of the SOM, a 1- or 2-dimensional lattice of neurons; Step 3 initializes the output vectors, randomly or linearly, which is important for the training process; Step 4 carries out the training, involving the learning rate function and the neighbourhood kernel function h_ci, for which a Gaussian kernel with a width parameter and the lattice distance between neurons is mentioned, together with the training length. The fragments then open section 3 and subsection 3.1 on the principle of the SOM-based approach to laser scanning data.]
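Because the step-by-step details of Steps 2-4 are lost in the scan, the following self-contained sketch puts equations [1] and [2] together into a complete training loop. The map size, random initialization, training length and linear decay schedules are standard defaults that we assume; they are not necessarily the paper's settings.

import numpy as np

def train_som(X, grid_shape=(10, 10), T=10000, alpha0=0.5, sigma0=3.0, seed=0):
    """Minimal end-to-end SOM training (Steps 2-4, under assumed defaults)."""
    rng = np.random.default_rng(seed)
    grid = np.indices(grid_shape).reshape(2, -1).T.astype(float)  # Step 2: neuron lattice
    M = rng.random((grid.shape[0], X.shape[1]))      # Step 3: random initialization
    for t in range(T):                               # Step 4: iterative training
        x = X[rng.integers(len(X))]                  # random sample vector x(t)
        c = np.argmin(np.linalg.norm(M - x, axis=1))           # BMU, equation [1]
        decay = 1.0 - t / T
        alpha, sigma = alpha0 * decay, sigma0 * decay + 1e-3   # assumed linear decay
        h = np.exp(-np.sum((grid - grid[c]) ** 2, axis=1) / (2 * sigma ** 2))
        M += alpha * h[:, None] * (x - M)            # update rule, equation [2]
    return M, grid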
	        