George Vosselman
3.2 Preserving important terrain features
In most cases it will be difficult to specify a filter function in terms of parameters (as above). An alternative is to derive
the shape characteristics of the terrain from a training sample. This part of the data should contain the important terrain
features that should be preserved by the filter and should only consist of ground points. If one can specify such an area,
the points in this area can be used to empirically derive the maximum height differences as a function of the distance
between two points.
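As a minimal sketch of this empirical derivation, the Python fragment below groups all point pairs of a training sample into distance bins and records the largest height difference found in each bin. The function name `max_height_differences`, the bin width `bin_width`, the `(x, y, z)` tuple layout and the sample coordinates are assumptions made for illustration; none of them comes from the paper.

```python
import math
from itertools import combinations


def max_height_differences(points, bin_width=1.0):
    """Largest height difference per distance bin.

    points: iterable of (x, y, z) tuples, assumed to be ground points only.
    Returns a dict mapping bin index k, covering horizontal distances in
    [k * bin_width, (k + 1) * bin_width), to the largest |z_i - z_j| in that bin.
    """
    maxima = {}
    # Every unordered pair is visited once; taking the absolute value gives the
    # same maximum as taking z_i - z_j over all ordered pairs.
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        d = math.hypot(x2 - x1, y2 - y1)  # horizontal distance between the points
        k = int(d // bin_width)           # index of the distance bin
        dh = abs(z2 - z1)
        if dh > maxima.get(k, -1.0):
            maxima[k] = dh
    return maxima


# Hypothetical training sample: four ground points along a line (metres).
training = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.4), (2.0, 0.0, 11.5), (3.0, 0.0, 11.6)]
table = max_height_differences(training, bin_width=1.0)
for k in sorted(table):
    print(f"d in [{k}, {k + 1}) m: max height difference {table[k]:.2f} m")
```

The loop visits every pair and is therefore quadratic in the number of points, which is acceptable only for a small training area; a real dataset would probably need a spatial index and a maximum distance.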
Clearly, the determined maximum height differences are stochastic. Before using these values to filter the points in
other areas, one should again add a confidence interval for the maxima. Suppose the training sample contains N point
pairs at a distance d. The maximum height difference at this distance can be considered to be the maximum of N
drawings out of the height differences of the complete dataset. Unfortunately, the probability distribution of height
differences in the complete dataset is not known. To get some idea about the standard deviation of the maximum height
difference, we therefore make use of the known distribution of height differences in the training sample.
Let $F(\Delta h)$ be the cumulative probability distribution for the height differences between points at distance $d$ in the
training data. Then, $F_{\max}(\Delta h) = F(\Delta h)^N$ is the cumulative probability distribution for the maximum of $N$ independent
drawings of a height difference. The probability density function of the maximum becomes
$$f_{\max}(\Delta h) = \frac{\partial F_{\max}(\Delta h)}{\partial \Delta h} = N\,F(\Delta h)^{N-1}\,\frac{\partial F(\Delta h)}{\partial \Delta h} = N\,F(\Delta h)^{N-1}\,f(\Delta h) \qquad (10)$$
The variance of the maximum can easily be obtained by taking integrals over this function. For each distance d the
variance needs to be computed independently. From the variance a confidence interval can be derived that should be
added to the maximum height difference that was encountered for distance d.
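One way to carry out this computation on a finite training sample is sketched below. It replaces the integrals by sums over the empirical distribution: the empirical cumulative distribution is raised to the power N and differenced to obtain the probability of each observed value being the maximum. The function name `moments_of_maximum`, the sample values and the factor 2.0 used for the confidence margin are assumptions for illustration; the paper does not specify them.

```python
import math


def moments_of_maximum(sample, n_draws):
    """Mean and variance of the maximum of n_draws independent draws
    from the empirical distribution of the height differences in sample."""
    if not sample or n_draws < 1:
        raise ValueError("need a non-empty sample and n_draws >= 1")
    xs = sorted(sample)
    n = len(xs)
    # Empirical CDF: F(xs[k-1]) = k / n, so F_max(xs[k-1]) = (k / n) ** n_draws
    # and P(max = xs[k-1]) is the difference of two consecutive F_max values.
    probs = [(k / n) ** n_draws - ((k - 1) / n) ** n_draws for k in range(1, n + 1)]
    mean = sum(p * x for p, x in zip(probs, xs))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, xs))
    return mean, var


# Made-up height differences (metres) for one distance d; not data from the paper.
dh_sample = [0.05, 0.10, 0.12, 0.20, 0.31, 0.35, 0.50, 0.80]
mean_max, var_max = moments_of_maximum(dh_sample, n_draws=len(dh_sample))
sigma_max = math.sqrt(var_max)
# The factor 2.0 is an arbitrary illustrative choice of confidence margin.
threshold = max(dh_sample) + 2.0 * sigma_max
print(f"std of maximum: {sigma_max:.3f} m, padded threshold: {threshold:.3f} m")
```

Because the empirical distribution has no mass beyond the largest observed value, this sketch probably underestimates the spread of the true maximum; it only gives a rough idea of the standard deviation, in line with the caveat in the text.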
3.3 Minimising classification errors
Whereas the above filter function makes sure that important terrain features are maintained in the digital elevation
model, it may be quite liberal in accepting points that are not on the ground. I.e., the number of type I errors will be low,
but the number of type II errors may become large.
Suppose a dataset contains three points on a line, as in figure 1, and that one would use linear interpolation between
points to determine the heights at other points. If point 2 would be rejected, but would in fact be a ground point (type I
error), the height error in the resulting DEM at the position of point 2 is $h_2 - h_2'$, with $h_2$ the measured height of
point 2 and $h_2'$ the height interpolated between points 1 and 3. If, on the other hand, point 2 would be accepted as a
ground point, but would in fact be a point on a building (type II error), the height error in the DEM at the position of
point 2 is $h_2' - h_2$.
Hence, the absolute size of a height error caused by a wrong classification is the same for type I and type II errors.
Since the effect of these errors is the same, a point $p_i$ can best be classified as a ground point if
$P(p_i \in \mathrm{DEM}) > P(p_i \notin \mathrm{DEM})$. The break-even point lies at $P(p_i \in \mathrm{DEM}) = 0.5$. Knowing the height of a ground
point $p_j$ at distance $d$ from point $p_i$, one would like to determine the height difference $\Delta h$ between the points, such that
$P(p_i \in \mathrm{DEM} \mid \Delta h, d, p_j \in \mathrm{DEM}) = 0.5$. Using probabilities derived from frequency counts of height differences between
point pairs in a training set of ground points and between point pairs with one point from the training set of ground
points and the other point from the training set of unfiltered data of the same area, one can calculate
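$$P(p_i \in \mathrm{DEM} \mid \Delta h, d, p_j \in \mathrm{DEM}) = \frac{P(p_i \in \mathrm{DEM}, \Delta h, d, p_j \in \mathrm{DEM})}{P(\Delta h, d, p_j \in \mathrm{DEM})} = \frac{P(\Delta h \mid d, p_i \in \mathrm{DEM}, p_j \in \mathrm{DEM})\; P(p_i \in \mathrm{DEM} \mid d, p_j \in \mathrm{DEM})}{P(\Delta h \mid d, p_j \in \mathrm{DEM})} \qquad (11)$$
for each height difference $\Delta h$ and distance $d$. Both $\Delta h$ and $d$ need to be discrete for this purpose. For each $d$, one can
now determine the $\Delta h$ for which $P(p_i \in \mathrm{DEM} \mid \Delta h, d, p_j \in \mathrm{DEM}) = 0.5$. These values $\Delta h$ can be taken as the maximum
height differences that are allowed in the filtered data in order to minimise the number of classification errors.
Figure 1. DEM error.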
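The frequency-count evaluation of equation (11) for a single distance d could look roughly like the sketch below. It assumes two histograms over the same height-difference bins: `ground_counts` for pairs in which both points are training ground points, and `mixed_counts` for pairs of one ground point and one point of the unfiltered data. It further assumes that the ground points are a subset of the unfiltered data, so that the ratio of the two totals estimates P(p_i ∈ DEM | d, p_j ∈ DEM). The function names, the counts and the rule of taking the last bin before the probability first drops below 0.5 are illustrative assumptions, not the author's implementation.

```python
def ground_probability(ground_counts, mixed_counts):
    """Evaluate equation (11) per height-difference bin for one distance d.

    Returns a list with P(p_i in DEM | dh, d, p_j in DEM) for each bin,
    or None for bins without any observation.
    """
    total_ground = sum(ground_counts)
    total_mixed = sum(mixed_counts)
    if total_ground == 0 or total_mixed == 0:
        raise ValueError("both histograms must contain observations")
    prior = total_ground / total_mixed      # P(p_i in DEM | d, p_j in DEM)
    posterior = []
    for g, m in zip(ground_counts, mixed_counts):
        if m == 0:
            posterior.append(None)          # bin never observed
            continue
        likelihood = g / total_ground       # P(dh | d, p_i in DEM, p_j in DEM)
        evidence = m / total_mixed          # P(dh | d, p_j in DEM)
        posterior.append(likelihood * prior / evidence)
    return posterior


def break_even_bin(posterior, level=0.5):
    """Index of the last bin, scanning upwards from the smallest height
    difference, before the probability first drops below level."""
    last_ok = None
    for k, p in enumerate(posterior):
        if p is None:
            continue                        # skip bins without observations
        if p < level:
            break
        last_ok = k
    return last_ok


# Made-up counts for one distance d, bins of 0.5 m starting at dh = 0.
ground_counts = [400, 250, 90, 20, 5, 0]
mixed_counts = [420, 300, 150, 80, 60, 90]
posterior = ground_probability(ground_counts, mixed_counts)
k = break_even_bin(posterior)
print(f"maximum allowed height difference: about {(k + 1) * 0.5} m")
```

Under the subset assumption the three factors of equation (11) reduce to the plain ratio of the two counts in each bin; they are kept separate here only to mirror the equation. Repeating the computation for every distance bin would give the maximum allowed height difference as a function of d.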
938 International Archives of Photogrammetry and Remote Sensing. Vol. XXXIII, Part B3. Amsterdam 2000.