3. MISCLASSIFICATION IN REMOTE SENSING
A portion of deviation will be caused by
errors in measuring the status (e.g., proportion
forested) of sample unit i.
3.1 Misclassification error model
Let a measurement or calibration model for the
unknown true status Xi of sample unit i be
$X_i = H_t Y_i + H_o (1 - Y_i)$, (2)
where Yi is the imperfect remotely sensed
estimate of proportion forest in sample unit i,
and Ht is the known conditional probability that
a point is truly forest given that it is
classified as forest by the remote sensing
process. Similarly, (1-Yi) is the remotely
sensed proportion measurement of other cover, and
Ho is the conditional probability that a point is
truly forest given it is classified as other
cover by the remote sensing process. When
classification accuracy of the remote sensing
process is high, Ht will nearly equal 1, and Ho
will nearly equal 0.
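As an illustration of the calibration model in (2), the following minimal Python sketch computes the true status from a remotely sensed proportion. The function name `calibrate` and the numerical values are hypothetical and chosen only for this example; they do not come from the study.

```python
def calibrate(y, h_t, h_o):
    """Equation (2): true proportion forest X_i implied by the remotely
    sensed proportion y, given conditional probabilities h_t and h_o."""
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must be a proportion in [0, 1]")
    return h_t * y + h_o * (1.0 - y)


# Hypothetical sample unit: 60% mapped as forest, H_t = 0.90, H_o = 0.05.
x = calibrate(0.60, 0.90, 0.05)
print(round(x, 4))  # 0.56
```

With perfect classification (H_t = 1, H_o = 0) the function returns y unchanged.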
3.2 Misclassification bias
The remotely sensed estimate Yi is a biased
estimate of true status Xi of sample unit i if
classification errors occur. Solving (2) for Yi,
$Y_i = (X_i - H_o) / (H_t - H_o)$. (3)
The remotely sensed estimate Yi in equation (3)
will not equal the true status Xi unless Ht
equals 1 and Ho equals 0, i.e., perfect
classification accuracy.
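The size of this bias can be sketched numerically. The short Python example below evaluates (3) for one hypothetical sample unit; the function name `remote_estimate` and all values are assumptions made for illustration.

```python
def remote_estimate(x, h_t, h_o):
    """Equation (3): remotely sensed proportion Y_i implied by true status x."""
    if h_t == h_o:
        raise ValueError("H_t must differ from H_o for (3) to be defined")
    return (x - h_o) / (h_t - h_o)


# Hypothetical unit that is truly 56% forest, with H_t = 0.90 and H_o = 0.05.
x_true = 0.56
y = remote_estimate(x_true, 0.90, 0.05)
bias = y - x_true  # difference between remotely sensed and true status
print(round(y, 4), round(bias, 4))
```

Under these assumed values the remotely sensed proportion is about 0.60, so it overstates the true status by about 0.04.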
3.3 Estimation of calibration model
In practice, the values of Ht and Ho are not
perfectly known. Rather, Ht and Ho are assumed
the same for all sample units in the stratum, and
their values are estimated using a finite sample
of reference points for which the remotely sensed
and true status are known. For example,
reference points might be available for M
systematically located 0.4 ha forest inventory
plots, which are measured in the field by USDA
Forest Service crews, where the field
classification is considered to be without error.
Under certain conditions, the location of these
field plots can be accurately registered to
remotely sensed images, so that both remotely
sensed and true classifications are available for
a small sample of point plots. This would
provide the necessary sample of reference points
to make estimates ($\hat{H}_t$ and $\hat{H}_o$) of the true
conditional probabilities (Ht and Ho).
Consider the statistical sampling model:
$\hat{H}_t = H_t + J_t$, $\hat{H}_o = H_o + J_o$, (4)
where Jt and Jo are random variables that equal
the differences between the true and estimated
conditional probabilities. Ht might be estimated
from the Mt 0.4 ha Forest Service plots
classified as forest using remote sensing,
$\hat{H}_t = [(H_t)_1 + (H_t)_2 + \dots + (H_t)_{M_t}] / M_t$, (5)
where (Ht)i = 1 if 0.4 ha Forest Service plot i
is truly forest given it is classified as forest
using the remote sensing procedure, and (Ht)i = 0
otherwise. Similarly, Ho might be estimated from
the Mo 0.4 ha Forest Service plots that are
classified as other cover using the remote
sensing procedure,
$\hat{H}_o = [(H_o)_1 + (H_o)_2 + \dots + (H_o)_{M_o}] / M_o$, (6)
where (Ho)i = 1 if 0.4 ha plot i is truly forest
given it is classified as other cover using
remote sensing, and (Ho)i = 0 otherwise.
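The two sample proportions in (5) and (6) can be sketched in Python as follows. The input layout (a list of boolean pairs), the function name, and the reference data are hypothetical and serve only to illustrate the calculation.

```python
def estimate_conditional_probabilities(plots):
    """Estimate (H_t, H_o) from reference plots, as in equations (5) and (6).

    plots: sequence of (classified_forest, truly_forest) boolean pairs,
    a hypothetical input layout chosen for this sketch.
    Returns (h_t_hat, h_o_hat, m_t, m_o).
    """
    plots = list(plots)
    # Field truth for plots the remote sensing process called forest.
    truth_if_forest = [truly for classified, truly in plots if classified]
    # Field truth for plots the remote sensing process called other cover.
    truth_if_other = [truly for classified, truly in plots if not classified]
    if not truth_if_forest or not truth_if_other:
        raise ValueError("need at least one reference plot in each class")
    m_t, m_o = len(truth_if_forest), len(truth_if_other)
    h_t_hat = sum(truth_if_forest) / m_t  # share truly forest among forest calls
    h_o_hat = sum(truth_if_other) / m_o   # share truly forest among other-cover calls
    return h_t_hat, h_o_hat, m_t, m_o


# Made-up reference sample of 15 plots.
reference = (
    [(True, True)] * 9 + [(True, False)] * 1
    + [(False, True)] * 1 + [(False, False)] * 4
)
print(estimate_conditional_probabilities(reference))  # (0.9, 0.2, 10, 5)
```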
The following is an estimate ($\hat{X}_i$) of the status
of sample unit i, using the estimated conditional
probabilities ($\hat{H}_t$ and $\hat{H}_o$) of correct and
incorrect remotely sensed classifications from
(5) and (6), and the known remotely sensed status
Yi (Tenenbein 1972):
$\hat{X}_i = \hat{H}_t Y_i + \hat{H}_o (1 - Y_i)$. (7)
3.4 Variance of calibrated estimate
From equations (2), (4), and (7), the unbiased
estimate of the true status Xi of sample unit
i, given the imperfect remotely sensed
measurement Yi of the same sample unit, is
$\hat{X}_i = (H_t + J_t) Y_i + (H_o + J_o)(1 - Y_i)$,
$= [H_t Y_i + H_o (1 - Y_i)] + [J_t Y_i + J_o (1 - Y_i)]$,
$= X_i + [J_t Y_i + J_o (1 - Y_i)]$. (8)
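The estimate $\hat{X}_i$ of the status of sample unit i in
(8) contains uncertainty propagated from the
imperfect model for classification error. Since
the estimated conditional probabilities ($\hat{H}_t$, $\hat{H}_o$)
are assumed unbiased, E[Jt] = E[Jo] = 0. The
measurement Yi is a known nonrandom constant. If
Jt and Jo are also uncorrelated, as when Ht and
Ho are estimated from separate sets of reference
plots, the cross-product term has zero
expectation, and the variance of the estimate in
(8) is
$\mathrm{var}(\hat{X}_i) = E\{[J_t Y_i + J_o (1 - Y_i)]^2\}$,
$= E[J_t^2] Y_i^2 + E[J_o^2] (1 - Y_i)^2$,
$= \mathrm{var}(\hat{H}_t) Y_i^2 + \mathrm{var}(\hat{H}_o) (1 - Y_i)^2$. (9)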
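A minimal Python sketch of the variance in (9) follows. It assumes, as the two-term form of (9) requires, that the errors Jt and Jo are uncorrelated; the function name and the input variances are hypothetical.

```python
def calibrated_variance(y, var_h_t, var_h_o):
    """Equation (9): variance of the calibrated estimate for a unit with
    remotely sensed proportion y. Assumes J_t and J_o are uncorrelated,
    so no cross-product term appears."""
    return var_h_t * y**2 + var_h_o * (1.0 - y) ** 2


# Hypothetical sampling variances of the two estimated probabilities.
v = calibrated_variance(0.60, 0.0018, 0.003)
print(round(v, 6))  # 0.001128
```

When y equals 1 the variance reduces to that of the forest-class estimate alone, and when y equals 0 it reduces to that of the other-cover estimate.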
If it is assumed that there are no registration
errors between field points and the remote
sensing imagery, then the random errors Jt and Jo
are caused solely by sampling error. The
sampling variances var($\hat{H}_t$) and var($\hat{H}_o$) can be
estimated from the simple randomized sample of
M = Mt + Mo plots using the binomial distribution:
$\widehat{\mathrm{var}}(\hat{H}_t) = \hat{H}_t (1 - \hat{H}_t) / M_t$, (10)
$\widehat{\mathrm{var}}(\hat{H}_o) = \hat{H}_o (1 - \hat{H}_o) / M_o$. (11)
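The binomial variances in (10) and (11) can be combined with (9) to give a standard error for the calibrated estimate of one sample unit. The Python sketch below is one way to do this; the function names and numerical inputs are hypothetical, and the two estimated probabilities are assumed to be uncorrelated.

```python
import math


def binomial_variance(h_hat, m):
    """Equations (10) and (11): estimated sampling variance of a proportion
    h_hat computed from m reference plots."""
    if m <= 0:
        raise ValueError("m must be a positive number of plots")
    return h_hat * (1.0 - h_hat) / m


def calibrated_standard_error(y, h_t_hat, m_t, h_o_hat, m_o):
    """Square root of equation (9), with the variances from (10) and (11)
    plugged in. Assumes the two estimates are uncorrelated."""
    var_x = (binomial_variance(h_t_hat, m_t) * y**2
             + binomial_variance(h_o_hat, m_o) * (1.0 - y) ** 2)
    return math.sqrt(var_x)


# Hypothetical inputs: 50 plots classified as forest, 30 as other cover.
se = calibrated_standard_error(0.60, 0.90, 50, 0.10, 30)
print(round(se, 4))
```

Because both variances shrink in proportion to the number of reference plots, quadrupling Mt and Mo would halve this standard error.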