For other cases a grid search algorithm with multi-fold cross-
validation was used.
GA techniques
The GA method is based on the process of natural selection,
which has its roots in biological evolution. At the beginning,
a set of candidate feature subsets is generated randomly as the
initial population. In subsequent stages, individuals are
selected at random from the current population as 'parents' and
produce 'children' for the next generation. The GA gradually
moves the population toward an optimal solution, guided by the
fitness function and by operations such as selection, crossover
and mutation. Applying a GA involves designing the chromosomes,
the fitness function and the architecture of the system. The
chromosome is usually an array of bits that represents an
individual in the population. The objective (fitness) function
plays an important role in the GA method: it is used to evaluate
the quality of candidate feature subsets.
In this paper, we propose a fitness function that makes use of
the classification accuracy, the number of selected features and
the average correlation within the selected features:
$$\mathrm{Fitness} = W_{OA} \times \frac{100}{OA} + W_f \times \frac{N_s}{N} + Cor \qquad (2)$$

where $OA$ = overall classification accuracy (%)
$W_{OA}$ = weight for the classification accuracy
$W_f$ = weight for the number of selected features
$N_s$ = number of selected features
$N$ = total number of input features
$Cor$ = average correlation coefficient of the selected bands
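The fitness evaluation can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text: the terms of Equation (2) are combined additively, a lower fitness value is better, and Cor is the mean absolute pairwise Pearson correlation of the selected bands. The function names and the default weights (0.7 and 0.3) are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_correlation(bands):
    """Mean absolute pairwise correlation of the selected bands.

    Assumption: absolute values are averaged; the paper does not specify.
    """
    pairs = list(combinations(bands, 2))
    if not pairs:
        return 0.0
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


def fitness(oa, n_selected, n_total, cor, w_oa=0.7, w_f=0.3):
    """Assumed additive reading of Equation (2); lower is treated as better."""
    return w_oa * 100.0 / oa + w_f * n_selected / n_total + cor
```

Under these assumptions, a subset scores better when it reaches a higher accuracy, uses fewer of the N input features, or contains less mutually correlated bands.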
The values of $W_{OA}$ and $W_f$ were set within 0.65-0.8 and
0.2-0.35, respectively. The other parameters for the GA were:
Population size = 20-40; Number of generations = 200;
Crossover rate = 0.8; Elite count = 3-6; Mutation rate = 0.05.
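A GA loop using these parameter values might look like the sketch below. It is not the implementation used in this study: tournament selection, single-point crossover, per-bit mutation and minimisation of the fitness value are all assumptions, and `run_ga` and `fitness_fn` are hypothetical names. The defaults (population 30, elite count 4) are simply picked from within the quoted ranges.

```python
import random


def run_ga(n_features, fitness_fn, pop_size=30, generations=200,
           crossover_rate=0.8, mutation_rate=0.05, elite_count=4, seed=0):
    """Minimise fitness_fn over bit-string chromosomes (1 = feature kept)."""
    rng = random.Random(seed)
    # Random initial population of bit arrays, one bit per input feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness_fn)
        # Elitism: the best individuals are copied unchanged.
        next_gen = [chrom[:] for chrom in ranked[:elite_count]]
        while len(next_gen) < pop_size:
            # Binary tournament selection of two parents (assumed scheme).
            p1 = min(rng.sample(ranked, 2), key=fitness_fn)
            p2 = min(rng.sample(ranked, 2), key=fitness_fn)
            child = p1[:]
            if rng.random() < crossover_rate and n_features > 1:
                cut = rng.randrange(1, n_features)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness_fn)
```

In practice `fitness_fn` would train and validate a classifier on the features flagged by the chromosome, so caching its results per chromosome would be advisable.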
First, the GA was run for each combined dataset using the SVM,
ANN and SOM classifiers. This step gives, for each classifier,
the classification result together with the corresponding
optimal feature subset and parameters. The classification
results were then combined using Dempster-Shafer theory. The
commonly used Majority Voting (MV) algorithm was also
implemented for comparison.
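The two combination rules can be sketched in Python as follows. The sketch assumes that each classifier assigns mass only to single classes (no compound hypotheses), which reduces Dempster's rule to a normalised product; how the masses were actually derived from the classifier outputs is not described here, so the values used below are hypothetical.

```python
from collections import Counter
from functools import reduce


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def dempster_combine(m1, m2):
    """Dempster's rule for two mass functions over singleton classes only."""
    classes = set(m1) | set(m2)
    joint = {c: m1.get(c, 0.0) * m2.get(c, 0.0) for c in classes}
    agreement = sum(joint.values())  # equals 1 - K, with K the conflict mass
    if agreement == 0.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {c: v / agreement for c, v in joint.items()}


# Hypothetical per-pixel masses from the three classifiers.
svm = {"NF": 0.6, "NP": 0.3, "WS": 0.1}
ann = {"NF": 0.5, "NP": 0.4, "WS": 0.1}
som = {"NF": 0.2, "NP": 0.7, "WS": 0.1}

combined = reduce(dempster_combine, [svm, ann, som])
ds_label = max(combined, key=combined.get)
mv_label = majority_vote(["NF", "NF", "NP"])
```

Unlike majority voting, the Dempster-Shafer combination weighs how strongly each classifier supports a class, so the two rules can return different labels for the same pixel.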
Six land cover classes, namely Native Forest (NF), Natural
Pastures (NP), Sown Pastures (SP), Urban Areas (UB), Rural
Residential (RU) and Water Surfaces (WS) were identified for
classification.
The data used for training and validation were derived from
visual interpretation and an old land use map, with the help of
Google Earth images. The training and test data were selected
randomly and independently.
4. RESULTS AND DISCUSSIONS
The overall classification accuracies of the SVM, ANN and SOM
classifiers for the different datasets, with and without
feature selection, are summarised in Table 3.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B7, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
Overall classification accuracy (%)
Datasets |        Non-FS         |         FS-GA
         | SVM   | ANN   | SOM   | SVM   | ANN   | SOM
1        | 59.26 | 59.39 | 56.06 | 59.94 | 61.15 | 57.41
2        | 79.01 | 75.49 | 79.97 | 81.06 | 80.19 | 80.37
3        | 81.47 | 80.99 | 80.03 | 82.37 | 81.28 | 80.74
4        | 82.78 | 81.70 | 78.84 | 85.22 | 82.77 | 81.54
Table 3. Comparison of classification performance
between the FS-GA and non-FS approaches.
The classification results illustrate the benefit of the
synergistic use of multi-date optical and SAR images. Both the
non-FS and the FS with GA (FS-GA) methods gave a significant
increase in classification accuracy when the combined datasets
(3rd and 4th) were used.
For the non-FS approach, the combined multi-date Landsat 5 TM+
and SAR data increased the overall accuracy by 2.46% and 22.21%
for SVM, 5.5% and 21.6% for ANN, and 0.06% and 23.97% for SOM,
compared with the cases in which only multi-date Landsat 5 TM+
or only SAR images were used, respectively. These improvements
were even more significant when the FS method was applied.
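The non-FS gains can be recomputed directly from the Table 3 entries, as in the short Python sketch below. It assumes that dataset 1 is the SAR-only case, dataset 2 the Landsat-only case and dataset 3 their combination; this mapping is inferred from the size of the differences rather than stated in this section.

```python
# Non-FS overall accuracies (%) from Table 3, keyed by dataset number.
NON_FS = {
    "SVM": {1: 59.26, 2: 79.01, 3: 81.47, 4: 82.78},
    "ANN": {1: 59.39, 2: 75.49, 3: 80.99, 4: 81.70},
    "SOM": {1: 56.06, 2: 79.97, 3: 80.03, 4: 78.84},
}


def gain(table, classifier, new, base):
    """Accuracy difference (percentage points) between two datasets."""
    return round(table[classifier][new] - table[classifier][base], 2)


for clf in NON_FS:
    # Gain of the combined dataset over dataset 2, then over dataset 1.
    print(clf, gain(NON_FS, clf, 3, 2), gain(NON_FS, clf, 3, 1))
```

The same helper applies to any pair of datasets, for example `gain(NON_FS, "SOM", 4, 3)` for the change between the 3rd and 4th datasets.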
Textural information and NDVI are valuable data for land cover
classification. In most cases, integrating these data improves
the classification results noticeably. For instance, with the
FS-GA approach, classifying the combination of the original
optical and SAR images with their textural and NDVI data (4th
dataset) increased the overall accuracy by 2.85%, 1.49% and
0.80% for the SVM, ANN and SOM classifiers, respectively.
It is clear that the FS-GA approach performed better than the
traditional non-FS approach. For all datasets and classifiers
evaluated, the FS-GA approach improved the classification
accuracy. The increases in overall classification accuracy
ranged from 0.29% (ANN classifier with the 3rd dataset) to
2.70% (SOM classifier with the 4th dataset). The highest
accuracy of 85.22% was achieved by combining the FS method with
the SVM classifier on the 4th dataset. It is worth mentioning
that the FS-GA approach used far fewer input features than the
traditional method. For instance, for the SVM classifier and the
4th dataset, only 68 out of 173 features were selected. As
mentioned earlier in this paper, an increase in data volume does
not necessarily increase the classification accuracy. With the
non-FS method, the accuracy of the SOM classifier for the 4th
dataset was actually 1.19% lower than for the 3rd dataset.
However, this problem did not occur when the FS-GA technique
was applied: in this case, the accuracy of the SOM classifier
increased slightly, by 0.80%.
Figure 4 shows the classification results obtained using the
FS-GA technique with the SVM classifier, which gave the best
accuracy among the single classifiers for the 4th dataset.