es per set.
e classification
Structural types
al to become a
ndscapes series
ach subdivided
neters) surface
or, fraction of
class, surface
enic heat flux)
ate. Since the
hern American
d scheme was
reference areas
high resolution
egories. Urban
mer city with
spires like bell
nto the classes
- buildings of
and Terraced
n rows. Blocks
orm geometric
ises high rise
15) consists of
| of greening
The industrial
industrial or
id Port (port)
rage facilities
t and gardens
nd agricultural
elevant for the
measurement
mburg during
til the 29" of
hn Hamburg
r temperature
5 buses were
of Driesen &
onsiveness to
to the fast
K311 loggers
ors, RFT325
shields. The
GPS-loggers
by Variotek.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B8, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
They were contained in a waterproof OtterBox 3000 case and
mounted magnetically on the front of the roof, where the temperature
influence of the bus itself was smallest (tested with surface
temperature sensors at three positions on the roof). This
construction allowed continuous measurement for five to six
days (5-second intervals for temperature and 20-meter
intervals for position and velocity).
Since the air temperature measurements can be contaminated by
the roof temperature of the bus at low travelling speeds, data
collected at velocities below 12 km/h were discarded in
post-processing. The temperature data were then linearly
interpolated to a resolution of one second and matched with the
corresponding time stamps of the GPS data. To reduce the massive
dataset, the single measurements were then averaged to
one-minute intervals and aggregated to a network of virtual
stations with approximately 100 m spacing (derived from the
centers of all measurements within a regular 100 m grid).
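The post-processing chain described above can be sketched in a few functions. This is a minimal illustration, not the authors' implementation: the function names, the tuple layouts, and the use of projected metric coordinates are assumptions, and the speed filter is applied after matching with the GPS fixes because the velocity comes from the GPS logger. Only the 12 km/h threshold and the 100 m grid are taken from the text.

```python
from collections import defaultdict
from statistics import mean

MIN_SPEED_KMH = 12.0  # threshold stated in the text
GRID_M = 100.0        # grid size stated in the text


def interpolate_to_seconds(times, temps):
    """Linearly interpolate samples (integer seconds, temperature) to 1 s steps."""
    out = {}
    pairs = list(zip(times, temps))
    for (t0, v0), (t1, v1) in zip(pairs, pairs[1:]):
        for t in range(t0, t1):
            out[t] = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    out[times[-1]] = temps[-1]
    return out


def match_and_filter(gps_fixes, temp_by_second):
    """Attach a temperature to each GPS fix (t, x, y, speed) and drop slow fixes."""
    return [
        (t, x, y, temp_by_second[t])
        for t, x, y, speed in gps_fixes
        if speed >= MIN_SPEED_KMH and t in temp_by_second
    ]


def minute_means(matched):
    """Average position and temperature over one-minute bins."""
    groups = defaultdict(list)
    for t, x, y, temp in matched:
        groups[t // 60].append((x, y, temp))
    return [tuple(mean(col) for col in zip(*rows)) for rows in groups.values()]


def virtual_stations(samples):
    """Group (x, y, temp) samples by 100 m grid cell; station = mean position."""
    cells = defaultdict(list)
    for x, y, temp in samples:
        cells[(int(x // GRID_M), int(y // GRID_M))].append((x, y, temp))
    stations = {}
    for cell, rows in cells.items():
        xs, ys, temps = zip(*rows)
        stations[cell] = {"x": mean(xs), "y": mean(ys), "temps": list(temps)}
    return stations
```

Each virtual station keeps its individual values, so later steps can count the measurements per station and average them.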
Subsequently, the data were transferred to a
PostgreSQL / PostGIS database and validated against data from 25
stationary measurement sites from various sources. The
comparison of the mobile measurements with nearby stationary
measurements (distance < 130 m, ±2.5 minutes, n = 108)
revealed a satisfactory quality of the collected data, with a mean
difference between stationary and mobile measurements of
-0.15 K and a mean absolute error of 0.51 K. The UHI was then
calculated as the difference to stationary measurements from
the Hamburg Weather Mast operated by the Meteorological
Institute of the University of Hamburg. Although these data are
likely to contain some urban effects, the offset to the ‘real’ UHI
was considered negligible for this study.
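The validation statistics can be illustrated with a short sketch that pairs mobile and stationary observations lying close in space and time and then computes the mean difference and the mean absolute error. The tuple layout (time in seconds, projected x and y in meters, temperature) and the function names are assumptions. The sign convention (mobile minus stationary) is also an assumption, since the text does not state it.

```python
from math import hypot
from statistics import mean

MAX_DIST_M = 130.0  # spatial tolerance from the text
MAX_DT_S = 150.0    # 2.5 minutes, temporal tolerance from the text


def paired_differences(mobile, stationary):
    """Return temperature differences for all pairs within both tolerances."""
    diffs = []
    for tm, xm, ym, temp_m in mobile:
        for ts, xs, ys, temp_s in stationary:
            close_in_time = abs(tm - ts) <= MAX_DT_S
            close_in_space = hypot(xm - xs, ym - ys) < MAX_DIST_M
            if close_in_time and close_in_space:
                diffs.append(temp_m - temp_s)  # assumed sign convention
    return diffs


def bias_and_mae(diffs):
    """Mean difference (bias) and mean absolute error of the paired differences."""
    return mean(diffs), mean(abs(d) for d in diffs)
```

The nested loop is quadratic in the number of observations. For a dataset of the size described here, a spatial index such as the one PostGIS provides would be the more practical choice.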
To guarantee a minimum of comparability, ‘stations’ with fewer
than 30 individual measurements were excluded from the
subsequent analysis. For the remaining 1260 virtual stations, the
mean UHI was calculated from all individual measurements.
The UHI data are shown in Figure 1.
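As a small illustration of this step, the sketch below computes UHI values as differences to a reference series and keeps only stations with at least 30 values. The names and the dictionary layout are hypothetical. The sign (mobile minus reference) is an assumption.

```python
from statistics import mean

MIN_MEASUREMENTS = 30  # threshold stated in the text


def uhi_values(temps_mobile, temps_reference):
    """UHI as mobile temperature minus the simultaneous reference temperature."""
    return [tm - tr for tm, tr in zip(temps_mobile, temps_reference)]


def station_mean_uhi(uhi_by_station, min_count=MIN_MEASUREMENTS):
    """Mean UHI per station, dropping stations with too few measurements."""
    return {
        station: mean(values)
        for station, values in uhi_by_station.items()
        if len(values) >= min_count
    }
```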
[Figure 1 graphic; partly legible legend labels: terrace, urbdens, urbcore, water, forest]
Figure 1. Mean UHI data from the mobile measurement
campaign with public transportation buses. Circle size indicates
the number of measurements.
3. METHODS
3.1 Feature selection
For feature selection the Minimum Redundancy Maximal
Relevance (MRMR) approach was chosen, which was
originally developed in bioinformatics for genome classification
(Peng et al., 2005). The algorithm selects features that have
both high relevance for classification of the target classes and
low redundancy with the previously selected features; the distance
between two features is defined by their mutual information.
For this study the Mutual Information Quotient (MIQ) criterion
was used.
$$ I(x, y) = \sum_{i,j} p(x_i, y_j) \log \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)} \qquad (1) $$

where I = mutual information
x, y = features
p(x, y) = joint probability distribution
p(x), p(y) = marginal probabilities
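To make the selection criterion concrete, the following sketch estimates mutual information from discrete samples and applies a greedy MIQ selection. At each step it picks the feature that maximises relevance to the target divided by the mean mutual information with the features already chosen. It is a minimal illustration under assumptions: features are already discretised, the function names are hypothetical, and the small constant `eps` is added here only to avoid division by zero.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Estimate I(x, y) in nats from two equally long discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_miq(features, target, k, eps=1e-12):
    """Greedy MRMR selection with the MIQ criterion.

    features: dict mapping feature name -> list of discrete values
    target:   list of class labels
    k:        number of features to select
    """
    relevance = {
        name: mutual_information(values, target)
        for name, values in features.items()
    }
    # The first feature is simply the most relevant one.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(k, len(features)):

        def score(name):
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] / (redundancy + eps)  # eps is an assumption

        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

For example, if two features are identical copies and a third carries complementary information about the classes, the quotient favours the complementary feature over the copy, even though all three are equally relevant on their own.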
3.2 Classifiers
Six supervised classifiers (implemented in the Waikato
Environment for Knowledge Analysis (WEKA) data mining package;
Bouckaert et al., 2009) were used in this study.
The Naive Bayes (NB) classifier assumes conditional
independence, which reduces the posterior probability of class
membership to the product of the estimated marginal
probabilities of the features. Despite its simplicity it often delivers
good results. The Support Vector Machine (SVM) classifier
transfers the problem to pairwise classification in a
higher-dimensional space (Burges, 1998). The Multilayer Perceptron
classifier is a feedforward artificial neural network (NN)
composed of nodes (neurons) in connected layers. It is trained
with a backpropagation algorithm. The Random Forest (RF)
classifier uses a number of tree-structured classifiers as a
committee that decides by majority vote, and shows excellent
classification performance and computing efficiency. Each
tree is generated from a random subset, so
an ‘out of bag’ error can be estimated without
bias. The number of trees grown and the number of features
used for each tree were varied, and three configurations were
tested: RF1 used 10 trees, RF2 used 30 trees and
20 features, and RF3 used 50 trees and 30 features (see Bechtel and
Daneke, 2012 for more detailed information).
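The committee idea behind the Random Forest can be illustrated with a deliberately simplified sketch. Each member is trained on a bootstrap sample, the ensemble predicts by majority vote, and the out-of-bag error is estimated from the samples each member did not see. This is not the WEKA implementation used in the study: one-feature threshold stumps stand in for full trees, and all names and defaults are assumptions.

```python
import random
from collections import Counter


def fit_stump(X, y, feature):
    """Split one feature at its mean; predict the majority class on each side."""
    thr = sum(row[feature] for row in X) / len(X)
    left = Counter(lab for row, lab in zip(X, y) if row[feature] <= thr)
    right = Counter(lab for row, lab in zip(X, y) if row[feature] > thr)
    default = Counter(y).most_common(1)[0][0]
    left_label = left.most_common(1)[0][0] if left else default
    right_label = right.most_common(1)[0][0] if right else default
    return lambda row: left_label if row[feature] <= thr else right_label


def fit_committee(X, y, n_trees=10, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = rng.randrange(len(X[0]))          # random feature
        model = fit_stump([X[i] for i in idx], [y[i] for i in idx], feature)
        out_of_bag = set(range(n)) - set(idx)       # samples this member never saw
        members.append((model, out_of_bag))
    return members


def predict(members, row):
    """Majority vote of all committee members."""
    votes = Counter(model(row) for model, _ in members)
    return votes.most_common(1)[0][0]


def oob_error(members, X, y):
    """Error rate when each sample is judged only by members that did not train on it."""
    wrong = total = 0
    for i, (row, label) in enumerate(zip(X, y)):
        votes = Counter(model(row) for model, oob in members if i in oob)
        if votes:
            total += 1
            wrong += votes.most_common(1)[0][0] != label
    return wrong / total if total else float("nan")
```

Because every sample is scored only by members that never trained on it, the out-of-bag estimate needs no separate validation set.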
3.3 Empirical models
For the evaluation of the UHI data two different empirical
models were used. For the Linear Regression (LR) model,
attributes were selected by the M5 method and collinear
attributes were eliminated. The Multilayer Perceptron is again a
neural network trained with a backpropagation algorithm, but
this time it predicts a numerical value instead of class
membership probabilities.
4. RESULTS
4.1 Classification of LCZ
Table 2 shows the classification results for classifiers and
feature sets. All numbers refer to the overall accuracy evaluated