the construction of confidence limits. Subsequent work has
attempted to revise this technique.
Later methodological articles criticize the use of the
normal approximation suggested by Hord and Brooner. Beyond
the call for appropriate discrete distributions, such as the
binomial, to construct confidence intervals, the recurrent
issue is sampling design. Van Genderen and Lock (1978)
suggest a procedure for picking sample sizes based on the
number of errorless results needed to support assertions of
confidence. Ginevan (1979) points out that errorless
results are not the only ones that might support a given
level of accuracy. The goal is to minimize sample size
while keeping both Type I and Type II errors acceptably low.
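The sample-size reasoning in this paragraph can be illustrated with a short calculation. The sketch below is one reading of the errorless-results argument, not the published procedure of either author. It assumes independent sample points and a binomial model, and the function names and the 85 and 95 percent accuracy levels are hypothetical choices made for illustration.

```python
from math import ceil, comb, log


def prob_at_most_errors(c, n, accuracy):
    """Binomial probability of observing c or fewer errors in n sample points."""
    q = 1.0 - accuracy  # per-point error rate
    return sum(comb(n, k) * q**k * accuracy**(n - k) for k in range(c + 1))


def min_errorless_sample(accuracy_floor, alpha):
    """Smallest n for which n errorless results have probability <= alpha
    when the true accuracy is only accuracy_floor."""
    return ceil(log(alpha) / log(accuracy_floor))


def risks(n, c, accuracy_low, accuracy_high):
    """Decision rule: accept the map when at most c errors occur in n points.

    Returns (chance of accepting a map whose accuracy is accuracy_low,
             chance of rejecting a map whose accuracy is accuracy_high).
    """
    accept_poor = prob_at_most_errors(c, n, accuracy_low)
    reject_good = 1.0 - prob_at_most_errors(c, n, accuracy_high)
    return accept_poor, reject_good


# Assumed levels: 85 percent is the accuracy floor, 5 percent the tolerated
# chance of accepting a map below that floor, 95 percent a "good" map.
n = min_errorless_sample(0.85, 0.05)
accept_poor, reject_good = risks(n, 0, 0.85, 0.95)
print(n, round(accept_poor, 3), round(reject_good, 3))
```

Under these assumed levels, demanding zero errors holds the chance of accepting a poor map below 5 percent, but it rejects a map that is 95 percent accurate more often than not. Allowing a small number of errors in a larger sample (a larger `c` and `n` in `risks`) is the kind of trade-off that Ginevan's point opens up.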
This line of research has refined statistical hypotheses
about map accuracy, but these concerns do not coincide with
most application needs. Hay (1979) provides a thorough
criticism of the normal approximation for confidence
intervals of proportions, but regards the revised method
as only a starting point. The total proportion correct
addresses only the first of his five questions:
I.   What proportion of decisions are correct?
II.  What proportion of allocations to a category are correct?
III. What proportion of a true category is correctly allocated?
IV.  Is a category overestimated or underestimated?
V.   Are errors randomly distributed?
(Hay, 1979, p. 529)
Hay's questions lead towards the methods developed below.
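Hay's first four questions can each be read off a cross-tabulation of allocated class against true class. The sketch below is a minimal illustration under stated assumptions: the 3-class matrix is invented for the example, rows are taken to be the class allocated on the map and columns the true class, and the variable names are hypothetical. Question V, on whether errors are randomly distributed, needs a statistical test and is not attempted here.

```python
# Hypothetical 3-class error matrix (counts of sample points).
# Rows: class allocated on the map; columns: true class.
matrix = [
    [45, 4, 1],
    [5, 30, 5],
    [0, 2, 8],
]

k = len(matrix)
total = sum(sum(row) for row in matrix)
allocated = [sum(row) for row in matrix]                         # row totals
true = [sum(matrix[i][j] for i in range(k)) for j in range(k)]   # column totals

# I. Proportion of all decisions that are correct.
overall = sum(matrix[i][i] for i in range(k)) / total

# II. Proportion of allocations to each category that are correct.
by_allocation = [matrix[i][i] / allocated[i] for i in range(k)]

# III. Proportion of each true category that is correctly allocated.
by_true_class = [matrix[i][i] / true[i] for i in range(k)]

# IV. Positive values mean the category is overestimated on the map,
# negative values mean it is underestimated.
bias = [allocated[i] - true[i] for i in range(k)]

print(overall, by_allocation, by_true_class, bias)
```

In this invented matrix the second category is allocated correctly less often than the first, and the third category is underestimated even though most allocations to it are correct. A single overall proportion hides both facts.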
Turk (1979) has also thoroughly criticized the use of
percentage correct. He demonstrates that figures of
percentage correct are inflated: even a random process would
be expected to achieve a positive value. Turk's alternative
involves much more sophisticated statistical estimation,
discussed below. Alternatives that address bias
correction problems have been developed in the crop
estimation literature (such as Bauer and others, 1978), but
they deserve broader application.
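The inflation of percentage correct can be made concrete with a small calculation. The sketch below is not Turk's estimator. It assumes an invented 3-class error matrix and computes the proportion correct expected if allocations were made at random with the same category totals, then applies a simple chance correction in the style of Cohen's kappa, which is swapped in here only as an illustration.

```python
# Hypothetical 3-class error matrix (counts of sample points).
# Rows: class allocated on the map; columns: true class.
matrix = [
    [45, 4, 1],
    [5, 30, 5],
    [0, 2, 8],
]

k = len(matrix)
total = sum(sum(row) for row in matrix)
allocated = [sum(row) for row in matrix]                         # row totals
true = [sum(matrix[i][j] for i in range(k)) for j in range(k)]   # column totals

# Observed proportion correct (the usual "percentage correct").
observed = sum(matrix[i][i] for i in range(k)) / total

# Proportion correct expected from random allocation that preserves
# the row and column totals.
expected = sum(allocated[i] * true[i] for i in range(k)) / total**2

# Kappa-style correction: agreement beyond chance as a share of the
# maximum possible agreement beyond chance. An assumption for
# illustration, not the method of any cited author.
chance_corrected = (observed - expected) / (1.0 - expected)

print(observed, expected, chance_corrected)
```

With these invented counts, random allocation alone would be expected to score well above zero, so the raw proportion correct overstates what the classification contributes.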
CRITIQUE OF A SAMPLE APPLICATION
In order to demonstrate the problems of the standard
percentage correct approach, a sample application is needed.
Attention will focus on an article by Todd, Gehring and
Haman (1980) [hereafter TGH] in which they assess the
accuracy of Landsat for mapping a wild area for the National
Park Service. The choice of this study is fortuitous and
does not imply that it is worse or better than many others.
The TGH project began by developing a ten-class map of the
Shivwits Plateau by means of accepted remote sensing tools.
The resulting ten-class map was labelled with particular
names combining physiographic features with vegetation type
and degree of cover, but these names are tags assigned by
human operators, not the basis for the classification. No
matter how pure or impure the multispectral nature of these
classes, the use of the classification hinges on the