Sharing and cooperation in geo-information technology

Aziz, T. Lukman
75 
2. CLUSTER ANALYSIS (CRIPPA, MUSSIO, 1995) 
Cluster analysis collects different statistical procedures and 
algorithms, able to classify data, to gather them in sets and to 
give a preliminary explanation about their behavior. 
The aim of cluster analysis is to minimize the internal 
dispersion of a group of data (cluster), respect their center 
(cluster point), and to maximize, at the same time, the distance 
among the above defined clusters. 
The procedures of cluster analysis could be subdivided, taking 
into account the starting point, the collection strategy and the 
end point respectively in: 
• agglomerative or divisive techniques 
• hierarchical or not- techniques 
• clustering of clumping 
Qualitative and quantitative cluster analysis could be performed 
different kinds of algorithms; MEDOIDS algorithm is 
recommendable in the first case, whilst the second task could be 
achieved by ISODATA algorithm. 
The main steps of ISODATA algorithm are the following: 
• selection of preliminary cluster points and thresholds for 
the iterations 
• clustering in a cluster the elements neighboring a cluster 
point. 
• division of clusters, whose internal dispersion is greater 
than the chosen threshold 
• new clustering 
• fusion of cluster, whose distance is less than the chosen 
threshold 
• starting a new iteration from the 2 nd step, or stopping the 
algorithm if a reproducing point is achieved. 
The step of clustering could be done in a different way. When 
the objects are well configured, the Euclidean norm or the 
Manhattan distance help to cluster the surrounded elements. 
On the contrary, when the objects are ill configured, these 
norms show evident difficulties to cluster the elements. 
Hence specific techniques of parsing, like line following, region 
growing, to be repeated by means of split and merge iterations, 
provide the expected results. The parsing is a methodology of 
Linguistics which has already been applied to artificial 
intelligence, image processing and robotics. 
Thus the clustering of ill-configured object could be obtained by 
using these techniques whose main steps are the following: • 
• selection in a cluster of the element which is the closest to 
its cluster point 
• selection in the same cluster of (if any) elements slightly 
more distant from the cluster point, thus taking into 
account possible bifurcations 
• attribution of the role of cluster point to the just selected 
elements 
• reiteration of the 1 st step 
• reiteration of the 2 nd step 
• reiteration of the 3 rd , 4 th and 5 th steps as long as selection of 
elements is possible 
• selection of a possible not yet selected element as a new 
cluster point; otherwise stop the algorithm 
• reiteration, for the new cluster point only, of the 1 st 5 th 
steps as long as a selection of elements is possible 
when preliminary cluster points are not available, the algorithm 
begins flatly at the 7 th step and it proceeds to form isolated 
clusters. Notice that the construction of cluster the one after the 
other, instead of all together, is less robust respect to wrong 
selection of elements. Finally the same strategy, used for the 
clustering, could be applied for the cluster fusion by means of a 
cluster of clusters. 
3. GRAPH THEORY (BELLONE, ET AL., 1996B) 
Graph theory permits to study topological relations underlying 
the geometry of different kinds of objects. A graph is a set of 
nodes connected by arcs. In a connected graph a path always 
exists between two nodes; the shortest path is called distance 
and the largest distance called diameter. Each node of a 
connected graph is reached by at least one arc; the number of 
arcs of a node is called degree. 
A connection matrix could be associated to each graph, being its 
main diagonal elements coincident with the nodes and its off 
diagonal elements coincident with the arcs. A corresponding 
design matrix supplies in form of chart the connections between 
arcs and nodes. A second and more refined design matrix 
supplies in forma of chart the independent circuits and 
constrained polylines among the arcs. 
Since connection and design matrices are generally very sparse 
ones, they should be stored avoiding their zero elements. Design 
matrices tabulated by rows or columns and banded, profile or 
sparse connection matrices are very suitable way to store them. 
Connection matrices may have non-zero elements near their 
main diagonal ones or anyway located. The arrangement of 
non-zero elements could be improved by changing the order of 
elements. The arrangement of the elements inside design 
matrices could be also improved according to the order of the 
elements achieved for the corresponding connection matrices. 
The most important algorithms for numeration, ordering and 
dissection of matrix elements use graph theory. 
A level structure is a partition of a graph in different levels 
whose nodes are connected to the ones of the preceding and/or 
following level only. A level structure whose first level has one 
node only, said root, is a tree. The number of nodes belonging 
to a level is called width; the width of a level structure is the 
maximum of the level width, whilst the number of levels is 
called depth. The nodes belonging at each last local level are 
called leaves. The distance of a generic node from the root 
represents its (local) depth and the distance of same generic 
node from the leaves represents its (local) height. 
Relationship among matrices, graphs and level structures show 
that the bandwidth of connection matrices depends upon the 
numeration of the nodes of corresponding graphs’, an adequate 
numeration requires a small width of the level structures and 
this last property is easily achieved as their depth is large. 
Furthermore since the minimization of bandwidth and profile 
could not always be performed at the same time, the preliminary 
dissection of a graph and the subsequent ordering of its 
dissected parts supply the expected results. 
I' 1 algorithm - Search of a diameter of a graph 
The reiteration of the construction of level structures allows to 
find the diameter of a graph. An arbitrary starting root initiates 
the first level structure, whilst the nodes in the last level initiate 
new level structure, to be substituted to the first one, if their 
depth is greater. The same procedure has to be repeated until 
two level structures of equal depths are found.
1
2
...
96
97
98
99
100
...
158
159
Full text: Sharing and cooperation in geo-information technology

Access restriction

Copyright

Note to user