Full text: Resource and environmental monitoring (A)

   
n of mixed 
n output. 
. 491-499. 
and cover 
mputers & 
"oundation. 
p. 138-199. 
of remotely 
vork. ISRS 
bad, India: 
for spectral 
ral network. 
e sensing, 
ormance of 
] unmixing 
ote Sensing 
ficial neural 
ion satellite 
Y 
Nhite, K.H., 
ata for crop 
sing, 13(3), 
86.Learning 
ielhart, D.E. 
Processing: 
Cambridge, 
odelling and 
J. Remote 
Least-square 
from remote 
and Remote 
note sensing 
. 194-201. 
the potential 
ificial neural 
sing, 63(11), 
IAPRS & SIS, Vol.34, Part 7, "Resource and Environmental Monitoring", Hyderabad, India,2002 
GENERALIZED 2D CLUSTERING TECHNIQUE FOR SPATIAL DATA ANALYSIS 
K. Venugopala Rao ", M.V. Ravi Kumar? A. Ravi Kumar? T.CH. Malleswara Rao > 
C.B.S. Dutt®,B.S Prakasa Rao * 
* Scientist, NRSA, Dept. of Space, Govt. of India, Hyderabad 
(venu, koppaka, ravikumar. mv, addepalli)  nrsa.gov.in 
? Group Head, AF, NRSA, Dept. of Space, Govt. of India, Hyderabad 
tchmr@nrsa.gov.in 
  
* Former GH, FEG, NRSA, Dy. Program Director, IGBP, ISRO, Hq, Bangalore 
dutt cbs€ yahoo.com 
^ Professor, Dept. of Geo Engineering, Andhra University, Visakhapatnam 
bosukonda@rediffmail.com 
KEYWORDS: Cluster, Centroid, Euclidean distance, triangulation, convex hull 
ABSTRACT: 
Clustering is a process used to solve classification problems. Its objective is to group data points into clusters so that the degree of 
association is strong between members of the same cluster and weak between members of different clusters. Many of the presently 
used clustering techniques are based on statistics and work with an assumption that the underlying data distribution is spherical. 
These techniques have significant limitations on shape of distribution. All the distances are calculated relative to the centroids. These 
centroids are representative of all points allocated to a particular cluster. These techniques fail to evaluate non-spherical distribution 
data models. This paper presents a more generalized and shape independent 2D clustering technique that makes full and efficient use 
of geometric relationship within the spatial data. In the proposed technique, first all the points of the spatial data set are triangulated, 
then the edges of the triangles are used to generate edge graph. The neighbors of any node in edge graph associate are the nodes of 
its adjacent edges. A global optimum edge length value is dynamically computed to partition the edge graph into inter-and intra 
clusters. The significant advantage of this graph-based partition is that the number of clusters need not be known in advance. 
1. INTRODUCTION 
Cluster analysis divides data into groups or clusters in order to 
improve understanding or finding structure in data. Clustering 
algorithms divide a data set into clusters, where similar data 
objects are assigned to the same cluster and dissimilar data 
objects belong to different clusters. Clustering techniques fall 
into a group of undirected data mining tools. The goal of 
undirected data mining is to discover the underlying data 
structure as a whole. Many clustering techniques are based on 
statistics and work with an assumption that the underlying data 
distribution is spherical (Jain et al., 1988; Euripides et al., 
1997; Athman 1996). K-means clustering algorithm is the 
best choice to illustrate this. 
‘1.1 K-means Algorithm 
K-means algorithm is a simple, iterative procedure, in which a 
crucial concept is the centroid. Centroid is an artificial point, 
which represents an average location of the particular cluster 
(Cormen et al, 1985). The coordinates of this point are 
averages of attribute values of all points that belong to the 
single cluster. The steps of the K-means algorithm are: 
1. Select randomly k points to be initial.centriods of k 
clusters. 
2. Assign each point to the closest centroid. 
3. Calculate new centroids of the clusters. The mean 
Euclidean distance of the points belonging to the 
same cluster could be used to calculate the new 
centroids. 
4. Check if the cluster centroids have changed their 
coordinates. If yes, start again from the step.2. If not, 
cluster detection is finished. 
Figure 1 shows how the algorithm works in the case where k=2. 
Randomly select the values for m; and m,, as seed centroids 
and the algorithm converges when the means no longer change. 
   
  
final centroid 
Start my 
Figure 1. Illustration of k- means Algorithm. 
Any algorithms like k-means based on distances that are 
calculated relative to the centriods assume that the data 
distribution is spherical. These algorithms .completely fail on 
non-spherical distribution data sets (Duda et al 1998). In many 
real world situations shape independent clustering techniques 
are needed. This paper presents generalized 2D-clustering 
technique that is unaffected by the shape of the clusters. In the 
proposed technique all the data points are first triangulated and 
the edges are used to generate a graph. Then the resulting edge 
  
    
  
   
    
  
  
   
   
   
   
  
   
   
  
  
  
  
  
  
  
  
  
  
   
  
   
  
    
  
   
  
  
  
  
  
  
  
   
    
  
   
   
  
   
  
    
  
  
   
   
  
  
  
   
  
  
   
   
	        
Waiting...

Note to user

Dear user,

In response to current developments in the web technology used by the Goobi viewer, the software no longer supports your browser.

Please use one of the following browsers to display this page correctly.

Thank you.