DA is a technique that discriminates among k classes
(objects) based on a set of independent, or predictor, variables.
The objectives of DA are to (1) find linear composites of the n
independent variables that maximize the ratio of among-groups
to within-groups variability; (2) test whether the group centroids
of the k dependent classes differ; (3) determine which of the n
independent variables contribute significantly to class
discrimination; and (4) assign unclassified or “new”
observations to one of the k classes (Lowell, 1991). The variate
for a discriminant analysis, also known as the discriminant
function, takes the following form:
$$Y_{jk} = a + \beta_1 X_{1k} + \beta_2 X_{2k} + \dots + \beta_n X_{nk} \qquad (4)$$
where
$Y_{jk}$ = discriminant Y score of discriminant function j for object (class) k
$a$ = intercept
$\beta_i$ = discriminant weight for independent variable i
$X_{ik}$ = independent variable i for object (class) k
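Because Equation (4) is a weighted sum, scoring a single observation is straightforward once the intercept and weights are known. The following is only a minimal sketch: the function name `discriminant_score`, its parameter names, and the numbers in the example are assumptions made for illustration and are not fitted coefficients from this study.

```python
from typing import Sequence


def discriminant_score(intercept: float,
                       weights: Sequence[float],
                       values: Sequence[float]) -> float:
    """Return intercept + sum(weight_i * value_i) for one observation."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, values))


# Illustrative numbers only (hypothetical, not taken from the paper):
# three predictor variables measured on one object.
score = discriminant_score(-1.0, [0.5, 2.0, -0.25], [2.0, 1.0, 4.0])
print(score)
```

In practice the intercept and weights would come from a fitted discriminant model, and an object would be assigned to a class by comparing its score with the group centroids.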
3.5 Model validation
Model validation (evaluation) can be done by split-sample
validation, as mentioned previously. For each model, the
response of the remaining data is predicted, and the error is
calculated from the predicted and observed values (De'ath and
Fabricius, 2000). We also used overall accuracy and the kappa
coefficient to assess the models. Overall accuracy includes only
the data along the major diagonal of the error matrix and
excludes the errors of omission and commission, whereas kappa
also incorporates the non-diagonal elements of the error matrix
through the product of the row and column marginals
(Lillesand et al., 2008).
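The two measures can be computed directly from an error (confusion) matrix. The sketch below uses the standard definition of the kappa coefficient, with chance agreement estimated from the products of the row and column totals. The function names and the example matrix are hypothetical and are not taken from this study.

```python
from typing import List, Sequence


def overall_accuracy(matrix: Sequence[Sequence[float]]) -> float:
    """Share of samples on the major diagonal of a square error matrix."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


def kappa(matrix: Sequence[Sequence[float]]) -> float:
    """Kappa coefficient: agreement corrected for chance agreement."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals: List[float] = [sum(row) for row in matrix]
    col_totals: List[float] = [sum(matrix[i][j] for i in range(k))
                               for j in range(k)]
    # Chance agreement from the products of the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class matrix (presence/absence), for illustration only.
example = [[40, 10],
           [5, 45]]
print(round(overall_accuracy(example), 2), round(kappa(example), 2))
```

For this example the overall accuracy is 0.85 while kappa is lower, because half of the agreement would be expected by chance given the marginal totals.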
4. Results and Discussion
For the base models shown in Table 1, MAXENT was the most
accurate in SS-1 (kappa value 0.84), followed by GLM (0.7)
and GARP (0.6); DA (0.55) was the least accurate.
The kappa values of the non-parametric algorithms, MAXENT
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B7, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
(0.46) and GARP (0.12), dropped sharply in SS-2, whereas
those of the parametric GLM (0.7) and DA (0.55) dropped only
slightly when tested with independent samples from the
Kuandaushan trail, 076 km away from the two aforementioned
training sites in Huisun. For the first data-merged models in
SS-3, the kappa values of the four models recovered from their
SS-2 levels to values close to, or even better than, those in
SS-1, and the four models kept the same order of accuracy as in
SS-1. When the first data-merged models built in SS-3 were
applied to a larger area in SS-4 that included Tong-Mao
Mountain, 10 km away from the three sites at Huisun, the kappa
values of MAXENT and DA declined to near zero, and GARP and
GLM could not be run, possibly because of a limit on the size of
the data layers, a large difference in the value domains of the
predictor variables between Huisun and Tong-Mao, or other
unknown factors that remain to be investigated. In contrast,
the kappa value of MAXENT rebounded strikingly in SS-5, where
the second data-merged models were applied to the same area as
in SS-4 (Table 2), while that of DA recovered only slightly.
Consequently, predictive models based merely on topographic
(indirectly operating) variables were unlikely to extend the
spatial patterns of CFs accurately from the Huisun area to the
Tong-Mao Mountain area 10 km away, or to the entire study
area encompassing Huisun.
The models, either the base models in SS-1 or the first
data-merged models in SS-3, accurately predicted the potential
habitats of CFs in Huisun and substantially reduced the area of
field survey to less than 10% of the entire study area, and to
less than 2.5% with MAXENT (Tables 3 and 4 and Figure 2). In
the Huisun study area, all the predicted potential CF habitats
occurred in the Kuan-Dau watershed and none occurred in the
Tong-Feng watershed, because of remarkable differences in
humidity and solar illumination between the two. This outcome
was confirmed by field surveys, in which almost no
cycad-ferns were found in the Tong-Feng watershed. In
contrast, neither the first data-merged models in SS-3 nor the
second data-merged models in SS-5 could accurately
extrapolate CF spatial patterns when they were applied to the
larger area encompassing Tong-Mao Mountain. Consequently,
they could not reduce the area of field survey to less than 10%
of the entire study area; with DA the area was even greater than
25% (Tables 5 and 6 and Figure 3).
                        MAXENT            GARP              GLM               DA
Class                   SS-1  SS-2  SS-3  SS-1  SS-2  SS-3  SS-1  SS-2  SS-3  SS-1  SS-2  SS-3
Training  Overall (%)   97    97    96    88    88    95    95    95    95    86    86    87
          Kappa         .89   .89   .38   .62   .62   .27   .83   .83   .85   .63   .63   .68
Test      Overall (%)   95    90    95    88    78    91    96    92    92    85    84    85
          Kappa         .84   .46   2%    .60   .12   .77   .70   .70   .77   .00   .55   .62
Table 1 SS-1, SS-2, SS-3: the accuracies of the models with elevation, slope, and terrain position variables for predicting
the potential habitat of CFs.
                        MAXENT        GARP          GLM           DA
Class                   SS-4   SS-5   SS-4   SS-5   SS-4   SS-5   SS-4   SS-5
Training  Overall (%)   86     91     -      -      -      -      70     69
          Kappa         .64    .15    -      -      -      -      .34    .33
Test      Overall (%)   82     91     -      -      -      -      64     66
          Kappa         .06    .73    -      -      -      -      .05    .25
Table 2 SS-4 and SS-5: the accuracies of the models with elevation, slope, and aspect variables for predicting
the potential habitat of CFs.