International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B4. Istanbul 2004
give accurate classification results for training data but not for
unknown test data. Network structure, therefore, has a direct
impact on the generalisation capabilities of networks, that is,
their ability to recognise patterns that are not present within
the training set (Kavzoglu and Mather, 1999).
4. RESULTS AND DISCUSSION
As a result of a field study and visual interpretation of aerial
photography (1:5000 scale) of the study area, eight major land
cover classes were identified, and fields were selected to collect
representative pixels for the classes to be used in the classification
process. Thus, a ground truth image containing a total of
10477 pixels was created. Table 3 shows the total number of
pixels selected for each land cover type. As can be seen from
the table, it was difficult to collect sample pixels for some
classes, such as grassland, inland water and bare soil. It
should also be noted that the selection of pixels for the road class
was tedious and problematic, as the width of the roads in the
study area is mostly smaller than the pixel size of 30 metres,
suggesting that these pixels are mostly mixed in nature.
Class                Number of Pixels
Coniferous Forest    1667
Deciduous Forest     3633
Urban                1144
Inland Water         383
Grassland            389
Bare Soil            590
Road                 1029
Sea                  1642
Table 3. Number of pixels used for training and testing
In the formation of training, validation and test pattern files,
in-house software developed by the second author of this paper
was employed. The program randomly selects pixels from the
images by taking the ground truth image into account. It also
allows the user to set the minimum and maximum number of
pixels for each pattern file. The minimum was set to 380 pixels
and the maximum to 1000 pixels. For all band
combinations considered in this study, training files included
200 pixels for each class (1600 pixels in total), validation files
contained 40 pixels for each class, and testing files comprised
3550 pixels.
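The per-class random selection described above might be sketched as follows. This is only an illustration, not the authors' in-house program: the function name `split_pixels`, the fixed seed and the use of integer class codes are assumptions, and the rule used to assemble the 3550 test pixels is not given in the text, so the sketch simply returns the leftover pixels.

```python
import random

def split_pixels(labels, n_train=200, n_val=40, seed=0):
    """Randomly draw per-class training and validation pixel indices.

    labels: sequence with the class code of every ground-truth pixel.
    Returns three index lists: training, validation and remaining pixels.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, cls in enumerate(labels):
        by_class.setdefault(cls, []).append(index)

    train, val, rest = [], [], []
    for cls in sorted(by_class):
        indices = by_class[cls]
        if len(indices) < n_train + n_val:
            raise ValueError(f"class {cls} has only {len(indices)} pixels")
        rng.shuffle(indices)
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        rest += indices[n_train + n_val:]  # candidates for the test file
    return train, val, rest

# Class sizes from Table 3, in table order (hypothetical class codes 0..7).
counts = [1667, 3633, 1144, 383, 389, 590, 1029, 1642]
labels = [cls for cls, n in enumerate(counts) for _ in range(n)]
train_idx, val_idx, rest_idx = split_pixels(labels)
```

With the Table 3 counts, this yields 1600 training and 320 validation indices, with no pixel shared between the files.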
In order to test the effectiveness of adding multi-temporal
and principal component bands, several combinations of the
image data were produced by stacking image layers. In addition
to the single date Landsat ETM+ and Terra ASTER images,
combinations of both images together with the principal
components were prepared and employed in the classification
stage. These combinations of the images are described in Table
4. In the table, the abbreviations PCA1 and PCA2 represent the
images having the three principal components estimated for
Landsat ETM+ and Terra ASTER images, respectively.
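As a rough illustration of the layer stacking, the sketch below derives three principal component bands from an image and appends them to the original bands. It assumes NumPy, co-registered images on a common grid, and a covariance-based transform; the paper does not state which PCA variant was used. The helper name `first_components` and the synthetic arrays are hypothetical stand-ins for the real data.

```python
import numpy as np

def first_components(image, n_components=3):
    """Return the leading principal-component bands of an (H, W, B) image."""
    height, width, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    centred = pixels - pixels.mean(axis=0)
    cov = np.cov(centred, rowvar=False)            # B x B band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]
    scores = centred @ eigvecs[:, order]           # project onto top axes
    return scores.reshape(height, width, n_components)

rng = np.random.default_rng(0)
etm = rng.random((50, 50, 6))     # synthetic stand-in for 6 ETM+ bands
aster = rng.random((50, 50, 9))   # synthetic stand-in for 9 ASTER bands

# Landsat ETM+ (6) + its three principal components: 9 layers.
etm_plus_pca = np.concatenate([etm, first_components(etm)], axis=2)
# Both images plus both sets of principal components: 21 layers.
full_stack = np.concatenate(
    [etm, aster, first_components(etm), first_components(aster)], axis=2
)
```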
Training, validation and test pattern files were produced for the
combinations given in Table 4. In order to estimate the number
of training samples, set the optimum rates for the learning
parameters and define the network structure (i.e. number of
hidden layer neurons), the guidelines suggested by Kavzoglu
and Mather (2003) were used.
Combination   Image (bands)
C1            Landsat ETM+ (6)
C2            Terra ASTER (9)
C3            Landsat ETM+ (6) + PCA1 (3)
C4            Terra ASTER (9) + PCA2 (3)
C5            Landsat ETM+ (6) + Terra ASTER (9)
C6            Landsat ETM+ (6) + Terra ASTER (9) + PCA1 (3) + PCA2 (3)
Table 4. Image band combinations used in classification
According to these guidelines, weights in the network were
randomly initialised in the range [-0.25, 0.25], the learning rate
and momentum term were set to 0.2 and 0.5 respectively, and
the number of hidden layer nodes was estimated using the
following expression of Garson (1998):

$N_p / [r (N_i + N_o)]$ (1)

where the numbers of input and output layer nodes are represented
by $N_i$ and $N_o$ respectively, and the number of training
samples (or patterns) is represented by $N_p$. The symbol $r$ is
a constant set by the noise level of the data. Typically, $r$ is in
the range from 5 to 10. It should be noted that the Stuttgart
Neural Network Simulator (SNNS), developed at the Institute
for Parallel and Distributed High Performance Systems at the
University of Stuttgart, was chosen to implement the neural
network models created for each image combination. Training
processes for all network structures were controlled by monitoring
the error level for the validation data, an approach
known as cross-validation, which is a robust stopping criterion for
the training process. In other words, the learning process is stopped
when the error on the validation set starts to rise. The
generalisation capabilities of the trained networks were tested
using the test pattern file. The results for all combinations,
including individual class accuracies, are shown in Table 5.
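The stopping rule described above can be illustrated with a small, simulator-independent sketch. It is not SNNS code: the function name, the `patience` parameter and the error values are assumptions made for illustration only.

```python
def early_stopping_epoch(val_errors, patience=1):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation error has failed to improve
    for `patience` consecutive epochs (a hypothetical parameter).
    """
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error, waited = epoch, error, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation error has started to rise
    return best_epoch

# Made-up validation errors: falling at first, then rising.
errors = [0.90, 0.61, 0.45, 0.38, 0.36, 0.37, 0.41]
stop = early_stopping_epoch(errors)
```

Here training would halt after the sixth epoch, and the weights from epoch index 4 (the lowest validation error) would be retained.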
The table also includes the results of the Maximum Likelihood (ML)
classification, which was performed with exactly the same training
and test pixels. The classification accuracies were estimated in
terms of the Kappa coefficient, which is a more realistic statistical
measure of accuracy than overall accuracy since it incorporates
the off-diagonal elements through the row and column totals (i.e.
omission and commission errors) in addition to the diagonal
elements of the error matrix. The Network column in the table
shows the network structure established for the corresponding
combination. For instance, 6-16-8 indicates 6 input nodes, 16
hidden nodes and 8 output nodes.
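As a worked illustration of the hidden-node heuristic attributed to Garson (1998), the sketch below evaluates $N_p / [r (N_i + N_o)]$ for a network with 6 inputs and 8 outputs trained on 1600 patterns. Reading the expression as a sum of input and output nodes is an assumption based on the commonly cited form of the heuristic, and the value of $r$ actually used is not reported, so only the bounds for $r$ between 5 and 10 are shown.

```python
def hidden_nodes(n_patterns, n_inputs, n_outputs, r):
    """Heuristic hidden-layer size: N_p / [r * (N_i + N_o)]."""
    if r <= 0:
        raise ValueError("r must be positive")
    return n_patterns / (r * (n_inputs + n_outputs))

# 1600 training patterns, 6 input bands, 8 land cover classes.
lower = hidden_nodes(1600, 6, 8, r=10)   # noisier data, fewer nodes
upper = hidden_nodes(1600, 6, 8, r=5)    # cleaner data, more nodes
print(f"{lower:.1f} to {upper:.1f} hidden nodes")
```

A larger $r$ yields a smaller hidden layer, which matches its role as a noise-dependent constant.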
Combination   Network    ANN       ML
C1            6-16-8     0.88958   0.84851
C2            9-20-8     0.92553   0.89012
C3            9-20-8     0.88837   0.84209
C4            12-16-8    0.91715   0.87410
C5            15-26-8    0.94943   0.90516
C6            21-25-8    0.95876   0.92051
Table 5. ANN and ML results for image band combinations
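The Kappa coefficient reported in Table 5 can be computed from an error matrix as sketched below. The 2 x 2 matrix is made up purely to show the arithmetic and is not taken from the study; the function name `kappa` is likewise only illustrative.

```python
def kappa(matrix):
    """Kappa coefficient of a square error (confusion) matrix."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement term built from the row and column totals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals))
    return (total * diagonal - chance) / (total * total - chance)

# Hypothetical two-class error matrix (rows: classified, columns: reference).
example = [[45, 5],
           [10, 40]]
overall_accuracy = (45 + 40) / 100
k = kappa(example)
```

For this matrix the overall accuracy is 0.85 while Kappa is about 0.70, showing how the off-diagonal elements lower the estimate.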