temporal partitioning of Earth AOD through the competition of gating regression models (Radosavljevic et al., 2008). To address the spatio-temporal dependence, the algorithm takes the location and time of each data point as inputs to a gating function and lets specialized predictors compete for every point in the dataset. It starts by randomly dividing the dataset into two disjoint subsets and training a specialized predictor on each subset. Data points are then iteratively reassigned to the predictors with weights determined by the gating output and by the accuracy of the regression models, and the predictors and the gating network are retrained to reflect the new assignments.
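A minimal sketch of this competition scheme is given below, assuming two multilayer-perceptron experts, a logistic gating model over (latitude, longitude, time) inputs, and synthetic data; the weighting rule and all variable names are illustrative rather than the exact formulation of the cited work.

```python
# Sketch: competition of two specialized regression experts coordinated by a gating
# model over location/time inputs. Experts, gating model, weighting scheme, and the
# synthetic data are illustrative assumptions, not the exact algorithm of the paper.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X_gate = rng.uniform(size=(n, 3))           # (lat, lon, time), seen only by the gate
X_attr = rng.normal(size=(n, 5))            # reflectance-like attributes for the experts
regime = (X_gate[:, 2] > 0.5).astype(int)   # hidden temporal regime
y = np.where(regime == 0, X_attr[:, 0], -X_attr[:, 0]) + 0.1 * rng.normal(size=n)

assign = rng.integers(0, 2, size=n)         # random initial split into two disjoint subsets
experts = [MLPRegressor(hidden_layer_sizes=(16,), max_iter=300, random_state=k)
           for k in range(2)]
gate = LogisticRegression()

for _ in range(5):
    for k in range(2):                      # train each expert on its current subset
        mask = assign == k
        if mask.any():
            experts[k].fit(X_attr[mask], y[mask])
    # weight of each expert for each point: prediction accuracy times gating probability
    sq_err = np.stack([(y - e.predict(X_attr)) ** 2 for e in experts], axis=1)
    accuracy = np.exp(-sq_err / sq_err.mean())
    if len(np.unique(assign)) == 2:         # refit the gate on the current assignments
        gate.fit(X_gate, assign)
    weight = accuracy * gate.predict_proba(X_gate)
    assign = weight.argmax(axis=1)          # reassign each point to its winning expert
```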
2.2 AOD Retrieval across Multiple Accuracy Measures
Well-known accuracy measures such as the Mean Squared Error (MSE) are often not informative enough because (1) retrieval error increases with AOD, (2) the distribution of AOD is skewed towards small values, and (3) there are many outliers. Domain scientists therefore use an array of accuracy measures to gain better insight into retrieval accuracy. For example, the Mean Squared Relative Error (MSRE) makes large absolute errors more tolerable when predicting large AOD than when predicting small AOD. Ideally, one would like a retrieval algorithm that provides good accuracy with respect to these alternative accuracy measures as well.
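For reference, the two measures are commonly defined as below, where y_i denotes the target AOD, ŷ_i the retrieval, and N the number of collocated points; the exact normalization used in the cited work may differ.

```latex
\mathrm{MSE}  = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 ,
\qquad
\mathrm{MSRE} = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i - \hat{y}_i}{y_i}\right)^2 .
```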
To address this issue we considered training neural networks that minimize MSRE instead of MSE. In order to construct a predictor that is also accurate with respect to MSE and several other accuracy measures, we proposed an approach that builds an ensemble of neural networks, each trained with a slightly different MSRE measure (Radosavljevic et al., 2010a). The outputs of the ensemble are then used as inputs to a meta-level neural network that produces the actual AOD predictions.
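A minimal sketch of this stacking scheme is shown below; the eps offsets used to vary the relative-error loss, the layer sizes, and the training loop are illustrative assumptions rather than the configuration of the cited work.

```python
# Sketch: an ensemble of networks, each trained with a slightly different relative-
# error loss, whose outputs feed a meta-level network (stacked generalization).
import torch
import torch.nn as nn

def make_net(n_in):
    return nn.Sequential(nn.Linear(n_in, 16), nn.Tanh(), nn.Linear(16, 1))

def msre_loss(pred, target, eps):
    # squared relative error; eps keeps the denominator away from zero and shifts
    # the trade-off between errors at small and large AOD
    return (((target - pred) / (target + eps)) ** 2).mean()

X = torch.rand(1000, 5)                                             # toy attributes
y = (0.8 * X[:, :1] + 0.05 * torch.rand(1000, 1)).clamp(min=0.01)   # AOD-like targets

base_nets, eps_grid = [], [0.01, 0.05, 0.1]
for eps in eps_grid:
    net = make_net(X.shape[1])
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(300):
        opt.zero_grad()
        msre_loss(net(X), y, eps).backward()
        opt.step()
    base_nets.append(net)

# meta-level network: takes the base retrievals as inputs, trained here with plain MSE
with torch.no_grad():
    Z = torch.cat([net(X) for net in base_nets], dim=1)
meta = make_net(len(base_nets))
opt = torch.optim.Adam(meta.parameters(), lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    nn.functional.mse_loss(meta(Z), y).backward()
    opt.step()
```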
2.3 Uncertainty Analysis of AOD Retrieval
In this task our objective was to explore whether neural networks can provide estimates of retrieval uncertainty in addition to accurate retrievals. Estimating the confidence of a retrieval requires modeling the whole conditional distribution of the target variable. Standard approaches to neural network uncertainty estimation assume constant noise variance. However, this assumption is not valid for AOD retrieval, where the noise is heteroscedastic (its variance is input-dependent). This is why we explored a Bayesian approach to uncertainty estimation, based on the previous work by Bishop and Qazaz. We also considered alternatives based on the bootstrap technique that are more tractable for large data sets.
A neural network-based regression assumes that the target y is related to the input vector x through deterministic and stochastic components. The stochastic component is a random variation of the target values around the conditional mean, caused by heteroscedastic noise with a zero-mean Gaussian distribution and input-dependent variance. The deterministic component determines the functional relationship between the attributes and the prediction. Our goal was to estimate both components as accurately as possible.
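In symbols, this corresponds to the standard heteroscedastic regression model (the notation below is ours, not the paper's):

```latex
y = f(\mathbf{x}) + \varepsilon(\mathbf{x}), \qquad
\varepsilon(\mathbf{x}) \sim \mathcal{N}\!\left(0, \sigma^{2}(\mathbf{x})\right),
```

where f(x) is the deterministic component and σ²(x) is the input-dependent noise variance.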
In (Ristovski et al., 2009) we evaluated three approaches to estimating the stochastic component. The first was based on training a neural network to predict the squared error from the attributes, using the standard Mean Squared Error (MSE) criterion. The second approach assumed heteroscedastic noise and defined the corresponding conditional target distribution; the uncertainty estimation neural network was obtained by maximizing its log-likelihood. This second method assumes that the conditional mean is exactly estimated by the bootstrap committee. Since this is only an estimate, the third approach also accounted for model uncertainty, so that the error stems from both the uncertainty in the model and the noise in the target.
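The second approach can be sketched as below, with the conditional mean held fixed at a bootstrap-committee estimate m(x) and a separate network predicting the log-variance by minimizing the Gaussian negative log-likelihood; the architecture, synthetic data, and optimizer settings are illustrative assumptions.

```python
# Sketch: variance network for heteroscedastic noise, trained by minimizing the
# Gaussian negative log-likelihood with the conditional mean m(x) held fixed
# (in practice m(x) would be the average prediction of a bootstrap committee).
import torch
import torch.nn as nn

def gaussian_nll(mean, log_var, target):
    # negative log-likelihood of N(target; mean, exp(log_var)), up to a constant
    return 0.5 * (log_var + (target - mean) ** 2 / log_var.exp()).mean()

X = torch.rand(1000, 5)
y = X[:, :1] + (0.05 + 0.2 * X[:, :1]) * torch.randn(1000, 1)   # input-dependent noise

m = X[:, :1]                               # stand-in for the bootstrap-committee mean

var_net = nn.Sequential(nn.Linear(5, 16), nn.Tanh(), nn.Linear(16, 1))  # outputs log-variance
opt = torch.optim.Adam(var_net.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    gaussian_nll(m, var_net(X), y).backward()
    opt.step()

sigma = var_net(X).exp().sqrt()            # estimated input-dependent noise std. deviation
```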
2.4 Selection of Sites for Ground Based Observations
Ground-based AOD stations are often located without a rigorous statistical design. Decisions are typically driven by practical circumstances (leading, e.g., to overrepresentation of urban regions and industrialized nations) and by domain experts' assumptions about the importance of specific sites. Given these circumstances, our aim was to evaluate the performance of the current design of the AERONET sensor network and to apply data mining techniques to assist in future modifications of the network.
In (Radosavljevic et al., 2009) we assumed that there is a pending budget cut for maintenance of the existing AERONET sites. The objective was to remove a fraction of the AERONET sites while keeping the utility of the remaining sites as high as possible. We made the simplifying assumption that operational costs are equal for all AERONET sites around the globe. Most selection techniques originating from spatial statistics tend to overlook the time dimension of the data collected by the sensor network. Therefore, we considered the time series of observations and proposed to optimize AERONET sensor selection based on the concept of retrieval accuracy. Each AERONET site provides a time series that we used to train a regression model to retrieve future AOD. The sites that can be removed are those whose observations are best predicted by a model built on data from the remaining sites, as sketched below.
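A minimal greedy backward-elimination version of this idea is shown below: the algorithm repeatedly removes the site whose observations are best predicted by a model trained on the remaining sites. The site data, regressor, and number of removed sites are illustrative assumptions.

```python
# Sketch: greedy removal of the sensor sites whose observations are best predicted
# from the remaining sites. Data layout and model choice are illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
sites = {}                                  # toy data: per-site attributes X and AOD y
for s in range(20):
    X = rng.normal(size=(200, 5))
    sites[s] = (X, X[:, 0] + 0.1 * rng.normal(size=200))

def removal_score(site, remaining):
    """MSE of predicting `site` from a model trained on the other remaining sites."""
    train = [sites[s] for s in remaining if s != site]
    X_tr = np.vstack([X for X, _ in train])
    y_tr = np.hstack([y for _, y in train])
    X_te, y_te = sites[site]
    model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=200, random_state=0)
    model.fit(X_tr, y_tr)
    return np.mean((model.predict(X_te) - y_te) ** 2)

remaining, removed = set(sites), []
for _ in range(5):                          # e.g. remove five sites under the budget cut
    best = min(remaining, key=lambda s: removal_score(s, remaining))
    remaining.discard(best)
    removed.append(best)
print("removed sites:", removed)
```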
In (Das et al., 2009) our objective was to determine appropriate locations for the next set of ground-based data collection sites so as to maximize the accuracy of AOD prediction. Ideally, a new site should capture the most significant unseen aerosol patterns and should be least correlated with the previously observed patterns. We proposed achieving this aim by selecting the locations at which the existing prediction model is most uncertain. Several criteria were considered for site selection, including uncertainty, spatial diversity, temporal similarity, and their combination.
Spatial diversity selects the sites that are farthest away from the existing sites. The traditional approach in active learning is to label the most uncertain data points; in our application, instead of selecting an individual data point, we select a site. To address this, we defined the uncertainty of a site as the average uncertainty over all of its observations. For this purpose, we trained a number of neural networks on data from the existing AERONET sites using the bootstrap method and used them to predict AOD for all satellite observations over potential AERONET sites. We took the variance among the network predictions as the uncertainty of the prediction at each individual data point, and selected the sites with the highest measured uncertainty, as sketched below.
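A minimal sketch of this bootstrap-based ranking follows; the data layout, committee size, and regressor are illustrative assumptions.

```python
# Sketch: ranking candidate sites by the average bootstrap-ensemble variance of
# their AOD predictions. Data, ensemble size, and model are illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 5))        # attributes collocated with existing sites
y_train = X_train[:, 0] ** 2 + 0.1 * rng.normal(size=1000)
candidates = {s: rng.normal(size=(100, 5)) for s in range(10)}  # satellite obs. over candidates

committee = []                              # each network sees a resampled training set
for b in range(10):
    idx = rng.integers(0, len(X_train), size=len(X_train))
    net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=300, random_state=b)
    net.fit(X_train[idx], y_train[idx])
    committee.append(net)

def site_uncertainty(X_site):
    preds = np.stack([net.predict(X_site) for net in committee])  # (n_nets, n_obs)
    return preds.var(axis=0).mean()         # average per-observation ensemble variance

ranking = sorted(candidates, key=lambda s: site_uncertainty(candidates[s]), reverse=True)
print("candidate sites by decreasing uncertainty:", ranking)
```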
One drawback of this site-level uncertainty criterion is that a global measure such as the average uncertainty may fail to capture the similarity in the temporal variation of the uncertainty among sites. Each of the potential sites can be regarded as a