Petrie, Gregg
the results, the hyperspectral bands corresponding to the Landsat 7 band centers were selected, using 172 pixels
randomly selected across the hyperspectral image.
Joining tree clustering of the Landsat subset of the Whitewater region resulted in 43 classes at 10% and 22 classes at
15% of the maximum linkage distance. With 22 classes, over 40% of the classes had fewer than five
observations; thus, 22 was the maximum number of classes evaluated with k-means clustering. The maximum
F-statistic was achieved at 3 classes; however, 12 classes met all the criteria (Figure 3). The diagnostic plot shows
that any number of classes greater than 12 gave a weighted F-statistic more than 20%
smaller than that achieved with three classes, and that any number of classes greater than 19
yielded too many classes with fewer than five observations. Thus, 12 classes were chosen as the optimum number of
classes for identification. The same analysis was applied to the hyperspectral imagery, resulting in 17 classes being
chosen as the optimum number of classes for identification. The higher spatial resolution thus yielded a greater
optimum number of classes. This result is likely due to the smaller pixel size of the hyperspectral data, which captures
greater variability in spectral response (less smoothing) than the larger-pixel imagery.
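The class counts reported at 10% and 15% of the maximum linkage distance come from cutting the joining tree at those heights. The sketch below illustrates only that counting step. It assumes a monotone linkage rule (merge distances never decrease as the tree is built), which the text does not specify; the function name and the merge heights are hypothetical and are not data from the study.

```python
def classes_at_cut(merge_distances, fraction):
    """Count the clusters left when a joining tree is cut at a given
    fraction of its maximum linkage distance.

    merge_distances: the n-1 merge heights of a tree built on n observations.
    Assumes a monotone linkage rule, so every merge at or below the cut
    height has already happened when the tree is cut.
    """
    n_obs = len(merge_distances) + 1
    threshold = fraction * max(merge_distances)
    merged = sum(1 for d in merge_distances if d <= threshold)
    return n_obs - merged


# Hypothetical merge heights for a 10-observation tree (illustration only).
heights = [0.5, 0.8, 0.9, 1.2, 1.4, 2.0, 3.5, 6.0, 10.0]
for fraction in (0.10, 0.15):
    print(fraction, classes_at_cut(heights, fraction))
```

A lower cut always leaves at least as many classes as a higher one, which matches the pattern reported above (more classes at 10% than at 15%).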
[Figure 3 plot. Title: Landsat Whitewater Data. Horizontal axis: Number of Classes (0 to 25). Plotted series: Percent Difference from Maximum F; Percent out of 22 Classes with < 5 Obs; Maximum Distance; 75th Percentile of Distance for 2 Classes.]
Figure 3. Diagnostic plot to determine the optimum number of classes using the seven bands from 200 pixels randomly
sampled from the Landsat footprint of the Whitewater Region. For this example, a k value of 12 (12 classes) met all of
the criteria.
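One way to read the diagnostic plot is as a filter over candidate values of k. The sketch below is a simplified illustration, not the authors' full procedure. It uses only two of the plotted quantities: the 20% limit on the drop from the maximum F-statistic is taken from the text, while the 40% ceiling on small classes, the rule of picking the largest passing k, the function name, and all input values are assumptions made for illustration. The two distance curves in the plot are not modelled.

```python
def choose_k(f_stat, frac_small, max_f_drop=0.20, max_frac_small=0.40):
    """Return the largest k that passes both screening criteria, or None.

    f_stat:      {k: F-statistic obtained with k classes}
    frac_small:  {k: fraction of classes with fewer than five observations}
    max_f_drop:  largest allowed relative drop from the maximum F-statistic
    max_frac_small: assumed ceiling on the share of small classes
    """
    f_max = max(f_stat.values())
    passing = [
        k for k in f_stat
        if (f_max - f_stat[k]) / f_max <= max_f_drop
        and frac_small[k] < max_frac_small
    ]
    return max(passing) if passing else None


# Hypothetical diagnostics for k = 2..6 (illustration only, not study data).
f_stat = {2: 75.0, 3: 100.0, 4: 90.0, 5: 85.0, 6: 70.0}
frac_small = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.2, 6: 0.5}
print(choose_k(f_stat, frac_small))
```

Taking the largest passing k favors the finest classification that still separates well, which is consistent with 12 being chosen over the 3 classes that maximized the F-statistic.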
The protocol developed to determine the number of classes for identification is not limited to image data, and thus can
provide the basis for the optimal physical stratification of the landscape with Geographic Information System layers.
This is an important aspect of our strategy for effectively exploiting Landsat and other imagery. Further, the protocol
also aids the collection of field data by supporting the optimal classification of a landscape into homogeneous blocks for
stratified random sampling. This protocol may have applications for supervised classification as well, by guiding the
user to select the optimum number of training sets.
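As an illustration of the field-sampling use mentioned above, the following minimal sketch draws a stratified random sample once each pixel has been assigned to a class. The function name, the fixed seed, and the toy label map are hypothetical; an actual field campaign would likely add constraints such as access or minimum spacing, which are not covered here.

```python
import random


def stratified_sample(class_of_pixel, n_per_class, seed=0):
    """Draw up to n_per_class pixels at random from every class.

    class_of_pixel: {(row, col): class label} from a prior classification.
    Returns {class label: list of sampled (row, col) pixels}.
    """
    rng = random.Random(seed)
    by_class = {}
    for pixel, label in class_of_pixel.items():
        by_class.setdefault(label, []).append(pixel)
    return {
        label: rng.sample(sorted(pixels), min(n_per_class, len(pixels)))
        for label, pixels in by_class.items()
    }


# Toy 4 x 3 label map with two classes (illustration only).
labels = {(r, c): r // 2 for r in range(4) for c in range(3)}
picks = stratified_sample(labels, n_per_class=2)
for label, pixels in sorted(picks.items()):
    print(label, pixels)
```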
International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B7. Amsterdam 2000. 1147