The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B1. Beijing 2008
assumption that ignores the heterogeneity of real trees. It is not
clear how such an assumption would affect derived results,
especially when parameters are derived using the same
assumptions used to create the forests (Widlowski et al, 2005).
Figure 2 shows a simulated waveform over a single thirty year
old Sitka spruce tree model. The heterogeneity and subterranean
echoes caused by multiple scattering are apparent. The ground
is at a range of 1,200m.
Figure 2. Simulated waveform from a single Sitka spruce tree.
Simulations were run with a resolution of 25cm (which can be
coarsened for analysis), a range of wavelengths (including
532nm, 850nm, 1064nm, 1650nm and 2060nm), a 30m ground
footprint (the optimum for forestry, Zwally et al, 2002) and with
and without a temporal laser pulse (100ns is proposed for A-
scope). Gaussian noise can be added before analysis. The width
of the Gaussian is defined as a percentage of the maximum
signal return. The methods will be developed for an
infinitesimally short pulse before the extra complication of
deconvolution is added. Systems such as GLAS have short
pulses (around 2ns) which should not need any deconvolution
unless the range sampling is significantly finer.
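The comparison between pulse length and range sampling can be made concrete with a short calculation. The sketch below assumes the usual two-way travel relation (range extent = c × duration / 2) and uses a hypothetical helper name, pulse_range_extent. How the quoted pulse durations are defined (for example as a full width at half maximum) is not stated here, so the figures are indicative only.

```python
# Illustrative only: converts a pulse duration into the range interval it spans.
C = 299_792_458.0  # speed of light in m/s

def pulse_range_extent(duration_ns):
    """Range extent (m) of a pulse, halved to account for the two-way path."""
    return C * duration_ns * 1e-9 / 2.0

for name, ns in [("100ns pulse", 100.0), ("2ns pulse", 2.0)]:
    extent = pulse_range_extent(ns)
    # 0.25 is the 25cm simulation resolution quoted in the text
    print(f"{name}: {extent:.2f} m, {extent / 0.25:.1f} bins of 25cm")
```

Under these assumptions a 100ns pulse spans roughly 15m, or about sixty 25cm bins, whereas a 2ns pulse spans about 0.3m, close to a single bin.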
Derivation of parameters
Estimation of forest parameters from lidar relies on the ground
returns being distinguishable from the canopy returns (Hofton et
al, 2002). This can either be achieved with multiple first return
scans (Koetz et al, 2007) or a single full waveform measurement
(Zwally et al, 2002) to get a distribution of returns from
throughout the canopy. Due to the speed of spaceborne
platforms and the resulting sparsity of sampling, only full
waveform lidar is suitable for measuring vegetation from space.
The standard method is to decompose the waveform into a set
of Gaussians by non-linear regression (Hofton et al, 2000). The
distribution of Gaussians can be used to classify cover type
(Wagner et al, 2008, Reitberger et al, 2008) and (taking into
account relative cross sections) can be used to derive vegetation
height (Blair et al, 1999), estimate canopy cover (Lefsky et al,
2005) and through metrics derive other biophysical parameters
such as leaf biomass and leaf area index (Lefsky et al, 1999).
Often there is no clear separation of ground and canopy returns,
either due to dense understory, small separation of canopy and
ground or topography. Attempts have been made to improve
height estimates in these situations by using another data source
to estimate the ground position (Rosette et al, 2008). Care must
be taken that these ground elevation datasets give the true
height (for example SRTM saturates over forests). Accurate
datasets are not available globally.
Method
This investigation explores methods that use only the waveform
to estimate tree height (which can be linked to biomass through
allometric relationships and stand counts). Other characteristics
would need to be inferred with additional information and will
not be investigated in this paper. Fusing lidar with hyperspectral
and multi-angular data would greatly help in the
derivation of these biophysical parameters; however, the lidar
waveform alone should provide the best height profile.
Before Gaussians are fitted to the simulated signal, it is
pre-processed in the following order:
• 5% Gaussian noise was added, as described above.
• The signal was pre-smoothed by convolution with a
3m Gaussian.
• Noise statistics were calculated from a known empty
portion of signal (above canopy to avoid echoes).
• The signal was de-noised by subtracting a threshold of
the mean noise plus (an arbitrary) three standard
deviations.
• The signal was post-smoothed with a 1m Gaussian.
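The pre-processing steps listed above can be sketched as follows. This is a minimal illustration rather than the code used in the study. It assumes that the quoted 3m and 1m values are Gaussian standard deviations, that the first empty_bins samples are known to be empty, and that negative values left after subtracting the threshold are clipped to zero. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_kernel(width_m, res_m):
    """Unit-area Gaussian kernel; width_m is treated as a standard deviation."""
    half = int(np.ceil(4 * width_m / res_m))
    x = np.arange(-half, half + 1) * res_m
    k = np.exp(-0.5 * (x / width_m) ** 2)
    return k / k.sum()

def smooth(signal, width_m, res_m):
    # mode="same" keeps the signal length; the ends are implicitly zero-padded
    return np.convolve(signal, gaussian_kernel(width_m, res_m), mode="same")

def preprocess(signal, res_m=0.25, noise_frac=0.05, empty_bins=100,
               n_sd=3.0, rng=None):
    """Apply the five steps in order; returns (signal, mean noise, threshold)."""
    signal = np.asarray(signal, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    # 1. Gaussian noise whose width is a fraction of the maximum return
    noisy = signal + rng.normal(0.0, noise_frac * signal.max(), signal.size)
    # 2. Pre-smooth with a 3m Gaussian
    pre = smooth(noisy, 3.0, res_m)
    # 3. Noise statistics from a portion assumed to be empty (above canopy)
    mean_noise = pre[:empty_bins].mean()
    sd_noise = pre[:empty_bins].std()
    # 4. Subtract mean plus n_sd standard deviations (clipping is an assumption)
    threshold = mean_noise + n_sd * sd_noise
    denoised = np.clip(pre - threshold, 0.0, None)
    # 5. Post-smooth with a 1m Gaussian
    return smooth(denoised, 1.0, res_m), mean_noise, threshold
```

Passing a seeded generator, for example rng=np.random.default_rng(0), makes a run repeatable, which is convenient when many noise realisations are compared.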
The empty tails are cropped from the signal to constrain the
Gaussian decomposition. The positions and amplitudes of all
turning points are recorded along with the width of peaks. If
more features than the number of Gaussians to be fitted are
found (due to heterogeneous or noisy signals) the Gaussians
with the largest cross sections are used first. If too few are
found (skewed Gaussians for example) the extra Gaussians are
evenly spaced in the gaps. An implementation of the
Levenberg-Marquardt method was used to minimise the root
mean square difference between the fitted Gaussians and
original signal (Press et al, 1994). It has been found that the
best fits are achieved when the x and y axes are rescaled to
between 0 and 100.
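As an illustration of the fitting step, the sketch below rescales an axis to the range 0 to 100 and fits a sum of Gaussians with a basic Levenberg-Marquardt loop. It is not the implementation used here (which follows Press et al, 1994). The names rescale, gaussian_sum and fit_gaussians, the forward-difference Jacobian and the damping schedule are all assumptions made for the example.

```python
import numpy as np

def rescale(v, top=100.0):
    """Linearly map an array onto the interval [0, top]."""
    v = np.asarray(v, dtype=float)
    return top * (v - v.min()) / (v.max() - v.min())

def gaussian_sum(x, params):
    """Sum of Gaussians; params is flat: [amp, centre, width, amp, ...]."""
    y = np.zeros_like(x, dtype=float)
    for amp, centre, width in np.reshape(params, (-1, 3)):
        y += amp * np.exp(-0.5 * ((x - centre) / width) ** 2)
    return y

def fit_gaussians(x, y, p0, n_iter=200, lam=1e-3):
    """Minimise the squared residuals (equivalently the RMS difference)."""
    p = np.asarray(p0, dtype=float)
    r = gaussian_sum(x, p) - y
    for _ in range(n_iter):
        # Forward-difference Jacobian of the residuals
        J = np.empty((x.size, p.size))
        for j in range(p.size):
            step = np.zeros_like(p)
            step[j] = 1e-6 * max(1.0, abs(p[j]))
            J[:, j] = (gaussian_sum(x, p + step) - y - r) / step[j]
        A = J.T @ J
        # Marquardt damping; the tiny identity term guards against singularity
        damped = A + lam * np.diag(np.diag(A)) + 1e-12 * np.eye(p.size)
        delta = np.linalg.solve(damped, -J.T @ r)
        r_new = gaussian_sum(x, p + delta) - y
        if r_new @ r_new < r @ r:
            p, r, lam = p + delta, r_new, lam * 0.1  # accept, relax damping
        else:
            lam *= 10.0                              # reject, damp harder
    return p
```

In use, the range axis and the amplitudes would both be passed through rescale before fitting, with p0 taken from the recorded turning points, and the fitted parameters mapped back to physical units afterwards.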
The fitted Gaussians and the pre-processed signal were analysed
to derive biophysical parameters. The centre of the last
Gaussian is taken as the ground position if the energy contained
within is more than an arbitrary percentage (1%) of the total
energy. This should avoid any Gaussians fitted to noise or
subterranean echoes caused by multiple scattering. If the
Levenberg-Marquardt method fails to find a solution or the
derived parameters are unrealistic, the fitting is repeated with
one fewer Gaussian.
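The ground selection rule can be sketched as below. The sketch assumes that each fitted Gaussian is given as (amplitude, centre, width), that its energy is its analytic area, that the total energy is the sum of those areas, and that a rejected last Gaussian is skipped in favour of the next one back in range. The last of these is an assumption of the example, not something stated above, and ground_position is a hypothetical name.

```python
import numpy as np

def ground_position(gaussians, min_frac=0.01):
    """Centre of the farthest Gaussian holding more than min_frac of the energy.

    gaussians: sequence of (amplitude, centre, width) tuples, with centre
    given as range from the sensor. Returns None if no Gaussian qualifies.
    """
    g = np.asarray(gaussians, dtype=float)
    # Area under a Gaussian: amplitude * width * sqrt(2 * pi)
    energy = np.abs(g[:, 0] * g[:, 2]) * np.sqrt(2.0 * np.pi)
    total = energy.sum()
    # Work back from the farthest range, skipping weak (noise or echo) fits
    for i in np.argsort(g[:, 1])[::-1]:
        if energy[i] > min_frac * total:
            return float(g[i, 1])
    return None
```

For example, with a ground return centred at 1,200m and a weak subterranean echo fitted at 1,210m, the echo falls below the 1% limit and 1,200m is returned.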
The tree top is calculated from the pre-processed signal. Taking
it as the point at which the signal rises above the noise threshold
will always lead to an underestimate of height. Data
assimilation schemes such as the Kalman filter rely on unbiased
observations (Williams et al, 2005). For this reason it may be
better to try to estimate a point that could be an over- or
underestimate. The first point at which the signal drops to the mean
noise level before it rises above the noise threshold would seem
to be a sensible, unbiased estimate of tree top position. Figure 3
shows a histogram of the signal start position error with and
without tracking back from the noise threshold to the mean
noise value. One hundred simulated waveforms were used with
ten thousand separate sets of random noise added to give one
million estimates. A negative error means a premature signal
start; this was common in both methods.
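The two signal start estimates compared here can be sketched as follows. The sketch assumes the waveform is stored with range increasing along the array, so the canopy top is met before the ground, and that the mean noise and noise threshold have already been calculated. The name signal_start is hypothetical.

```python
import numpy as np

def signal_start(signal, mean_noise, threshold, track_back=True):
    """Index of the estimated signal start, or None if nothing exceeds threshold.

    Without track_back this is the first bin above the noise threshold.
    With track_back the estimate is moved towards the sensor until the
    signal has dropped to the mean noise level.
    """
    signal = np.asarray(signal, dtype=float)
    above = np.flatnonzero(signal > threshold)
    if above.size == 0:
        return None
    i = int(above[0])
    if track_back:
        while i > 0 and signal[i] > mean_noise:
            i -= 1
    return i
```

Multiplying the returned index by the range resolution (25cm in these simulations) and adding the range of the first bin gives the tree top position, which can then be compared with the known position to build an error histogram.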