In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009
techniques in clustering shows that the swarm-based clustering
algorithm performs better than the A-means algorithm.
Clustering of massive LIDAR data and the unique potential of
artificial bee colony algorithm in solving complex optimization
problems are the core of this paper. The research work
presented in this paper clearly show that the artificial swarm bee
colony algorithm has clearly outperform A-means method in
clustering of LIDAR data.
3.1 Artificial Bee Colony Algorithm
A colony of honey bees can extend itself over long distances in
order to exploit a large number of food sources (Camazine et
al., 2003; Pham et al., 2006) . The foraging process begins in a
colony by scout bees being sent to search for promising flower
patches. Flower patches with large amounts of nectar or pollen
that can be collected with less effort tend to be visited by more
bees, whereas patches with less nectar or pollen receive fewer
bees (Camazine et al., 2003).
In the artificial bee algorithms, a food source position
represents a possible solution to the problem to be optimized.
Therefore, at the initialization step, a set of food source
positions are randomly produced and also the values of control
parameters of the algorithm are assigned. The nectar mount of a
food source corresponds to the quality of the solution
represented by that source. So the nectar amounts of the food
sources existing at the initial positions are determined. In other
words, the quality values of the initial solutions are calculated.
Each employed bee is moved onto her food source area for
determining a new food source within the neighbourhood of the
present one, and then its nectar amount is evaluated. If the
nectar amount of the new one is higher, then the bee forgets the
previous one and memorizes the new one. After the employed
bees complete their search, they come back into the hive and
share their information about the nectar amounts of their
sources with the onlookers waiting on the dance area. All
onlookers successively determine a food source area with a
probability based on their nectar amounts. If the nectar amount
of a food source is much higher when compared with other food
sources, it means that this source will be chosen by most of the
onlookers. This process is similar to the natural selection
process in evolutionary algorithms. Each onlooker determines a
neighbour food source within the neighbourhood of the one to
which she has been assigned and then its nectar amount is
evaluated.
3.2 Artificial Swarm Bee Colony Clustering Method
The artificial swarm bee colony clustering method exploits the
search capability of the Bees Algorithm to overcome the local
optimum problem of the A-means algorithm. More specifically,
the task is to search for appropriate cluster centres (c /t c 2 ,...,cij
such that the clustering metric ¿/(equation 1) is minimised. The
basic steps of this clustering operation are:
1. Initialise the solution population.
2. Evaluate the fitness of the population.
3. While (stopping criterion is not met)
a. Form new population.
b. Select sites for neighbourhood search by means of
information in the neighbourhood of the present one.
c. Recruit bees for selected sites (more bees for the best
e sites) and evaluate fitness values.
d. Select the fittest bee from each site.
e. Assign remaining bees to search randomly and
evaluate their fitness values.
End While.
Each bee represents a potential clustering solution as set of A
cluster centres and each site represent the patterns or data
objects. The algorithm requires some parameters to be set,
namely: number of scout bees (n), number of sites selected for
neighbourhood searching (m), number of top-rated (elite) sites
among m selected sites (e), number of bees recruited for the
best e sites (nep), number of bees recruited for the other (me)
selected sites (nsp), and the stopping criterion for the loop.
At the initialization stage, a set of scout bee population (n) are
randomly selected to define the A clusters. The Euclidean
distances between each data pattern and all centres are
calculated to determine the cluster to which the data pattern
belongs. In this way, initial clusters can be constructed. After
the clusters have been formed, the original cluster centres are
replaced by the actual centroids of the clusters to define a
particular clustering solution (i.e. a bee). This initialization
process is applied each time new bees are to be created.
In step 2, the fitness computation process is carried out for
each site visited by a bee by calculating the clustering metric d
(equation 1) which is inversely related to fitness. Step 3, is the
main step of bee colony optimization, which start by forming
new population (step 3-a). In step 3-b, the m sites with the
highest fitness are designated as “selected sites” and chosen
for neighbourhood search. In steps 3-c and 3-d, the algorithm
conducts searches around the selected sites, assigning more
bees to search in the vicinity of the best e sites. Selection of
the best sites can be made directly according to the fitness
associated with them. Alternatively, the fitness values are used
to determine the probability of the sites being selected.
Searches in the neighbourhood of the best e sites - those which
represent the most promising solutions - are made more
detailed. As already mentioned, this is done by recruiting
more bees for the best e sites than for the other selected sites.
Together with scouting, this differentia] recruitment is a key
operation of the bee algorithm. In step 3-d, only the bee that
has found the site with the highest fitness (the “fittest” bee)
will be selected to form part of the next bee population. In
nature, there is no such a restriction. This restriction is
introduced here to reduce the number of points to be explored.
In step 3-e, the remaining bees in the population are assigned
randomly around the search space to scout for new potential
solutions. At the end of each loop, the colony will have two
stages to its new population: representatives from the selected
sites, and scout bees assigned to conduct random searches.
These steps are repeated until a stopping criterion is met.
4. EXPERIMENTAL INVESTIGATIONS
The airborne LIDAR data used in the experimental
investigations have been recorded with TopScan's Airborne
Laser Terrain Mapper system ALTM 1225 (TopScan, 2004).
The data are recorded in area of Rheine in Germany. Two
different patches with residential and industrial pattern were
selected to test the developed algorithm. The selected areas
were suitable for the evaluation of the proposed classification
strategy because the required complexities (e.g. proximities of
different objects e.g. building and tree) were available in the
image (figure 1-a, b). The pixel size of the range images is one
meter. This reflects the average density of the irregularly
recorded 3D points which is fairly close to one point per nr.
Intensity images for the first and last echo data have been also
recorded and the intention was to use them in the experimental
investigations, Figure 1 shows the details of the test data. The
impact of trees in the first and last echo images can be easily
recognized by comparing the two images of this figure.