In: Wagner W., Szekely, B. (eds.): ISPRS TC VII Symposium - 100 Years ISPRS, Vienna, Austria, July 5-7, 2010, IAPRS, Vol. XXXVIII, Part 7B
composites for visual analysis (flood and standing water
absorbs infrared wavelengths of energy and appears as
blue/black in the RGB composite imagery), water body
identification in AVHRR imagery evolved from qualitative
visual interpretation to automatic quantitative extraction. The
reflectance of AVHRR channel 2 (0.73-1.1 µm, similar to MSS
band 7), the reflectance difference (CH2-CH1) and ratio
(CH2/CH1) between channel 2 and channel 1 (0.58-0.68 µm, similar to
MSS band 5) are used to discriminate water from land: a pixel is
treated as water if these parameters are less than the threshold values.
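
The threshold tests described above can be sketched as follows. This is a minimal illustration only: the function name, the default threshold values, and the choice to report each test separately are assumptions, not values or rules taken from the studies cited.

```python
def water_tests(ch1, ch2, max_ch2=0.10, max_diff=0.0, max_ratio=1.0):
    """Apply three threshold tests to one pixel.

    ch1, ch2: visible and near-infrared reflectance (0 to 1).
    The default thresholds are hypothetical placeholders; in practice
    they would be tuned per scene or region.
    """
    diff = ch2 - ch1
    # Guard against division by zero for very dark visible pixels.
    ratio = ch2 / ch1 if ch1 > 0 else float("inf")
    return {
        "ch2": ch2 < max_ch2,
        "difference": diff < max_diff,
        "ratio": ratio < max_ratio,
    }


# A dark NIR pixel passes all three tests; a vegetated pixel passes none.
print(water_tests(ch1=0.08, ch2=0.03))
print(water_tests(ch1=0.05, ch2=0.40))
```

How the three tests are combined (any one, or all of them) differs between studies and is left to the caller here.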
Domenikiotis et al. (2003) tried to use surface temperature to
discriminate water from land surfaces. However, the
temperature model may not work well for floods caused by
heavy rainfall during summer rainy seasons, when there is
little or no temperature difference between land and
water. Domenikiotis et al. (2003) also used the Normalized
Difference Vegetation Index (NDVI) to separate water from
land, considering that water-covered surfaces usually have very
small or even negative NDVI values. It follows from the
mathematical definition of NDVI that an area containing a
dense vegetation canopy will tend to have positive values (say
0.3 to 0.8), while standing water (e.g., oceans, seas, lakes and
rivers), which has a rather low reflectance in both the visible
(VIS: from 0.4 to 0.7 µm) and near-infrared (NIR: from 0.7 to
1.1 µm) spectral bands, results in very low positive or even
slightly negative NDVI values.
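
For reference, the standard definition is NDVI = (NIR - VIS) / (NIR + VIS). A short sketch of the calculation is given below; the sample reflectance values are made up for illustration and are not measurements from this study.

```python
def ndvi(vis, nir):
    """Normalized Difference Vegetation Index from VIS and NIR reflectance."""
    total = nir + vis
    if total == 0:
        return 0.0  # assumption: treat a zero-signal pixel as NDVI = 0
    return (nir - vis) / total


# Hypothetical reflectances: dense canopy (bright NIR) vs. standing water.
print(round(ndvi(vis=0.05, nir=0.40), 3))  # strongly positive
print(round(ndvi(vis=0.04, nir=0.02), 3))  # slightly negative
```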
Regression trees have been used with remote sensing
observations (DeFries et al., 1997; Michaelson, Schimel, Friedl,
Davis and Dubayah, 1994; Prince and Steininger, 1999; Hansen
et al., 2002; Solomatine and Xue, 2004). They provide a robust
tool for handling nonlinear relationships within large data sets.
As described above, previous studies have used several
parameters, including the reflectance of the near infrared (NIR)
channel, the reflectance ratio and difference between NIR and
visible (VIS) channels, NDVI, brightness temperature at 11 or
12 µm, and surface temperature, to separate water
from land. A linear mixture model has been used by Sheng et al.
(2001) to derive water fraction. However, it has not yet been
shown which parameter or combination of parameters is
the most effective.
This paper explores how to derive water fraction and flood
maps from MODIS data using the regression tree (RT) method.
Section 2 introduces the dataset used. The physics of the
problem and decision algorithms are described in Section 3.
Section 4 presents the results and Section 5 gives a summary
and discussion.
2. DATA USED
• Surface water percentage data derived
from the 1km land/water map supplied by the USGS
Global Land Cover Characterization Project. The
percentage water was created by simply determining
the percentage of 1km pixels designated as water in
each 10' region. These data can be obtained from the
Surface and Atmospheric Radiation Budget (SARB)
working group, part of NASA Langley Research
Center's Clouds and the Earth's Radiant Energy
System (CERES) mission.
• MODIS L3 8-day composite surface reflectance
product (MYD09A1) that is computed from the
MODIS Level 1B land bands 1, 2, 3, 4, 5, 6, 7, which
are centered at 0.648 µm, 0.858 µm, 0.470 µm, 0.555
µm, 1.24 µm, 1.64 µm, and 2.13 µm, respectively.
The product is an estimate of the surface reflectance
for each band as it would have been measured at
ground level after removing the atmospheric
scattering and absorption.
• MODIS L1B calibrated reflectance at the Top of
Atmosphere (TOA) with 1 km resolution
(MOD021KM).
• MODIS geolocation fields (MOD03).
• MODIS cloud mask (MOD35) data.
• TM (Thematic Mapper) data from the Landsat
observations at 30-meter spatial resolution is
used to evaluate water fraction derived from
MODIS.
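
The surface water percentage described in the first item above amounts to counting water pixels within each coarse cell. A minimal sketch follows, assuming the 1 km land/water map is available as a nested list of 0/1 flags (1 = water); the function name and the tiny sample block are hypothetical.

```python
def water_percentage(block):
    """Percentage of pixels flagged as water (1) in one coarse grid cell.

    block: list of rows, each a list of 0/1 land/water flags for the
    1 km pixels that fall inside the cell.
    """
    flags = [flag for row in block for flag in row]
    if not flags:
        raise ValueError("empty block")
    return 100.0 * sum(flags) / len(flags)


# Hypothetical 4 x 4 block of 1 km pixels with a river along one edge.
cell = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(water_percentage(cell))  # 5 of 16 pixels are water
```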
3. METHODOLOGY
An RT algorithm such as M5P is a powerful tool for generating
rule-based models that balance the need for accurate prediction
against the requirements of intelligibility. RT models generally
give better results than those produced by simple techniques
such as multivariate linear regression, while also being easier to
understand than neural networks. Unlike neural networks, the
RT program generates a model with rules that describe the
relationships between the independent and dependent
parameters in the data set. Instead of simple regression analysis
techniques, RT uses a piecewise regression technique. The
piecewise regression analysis (classifying the data into different
subsets) will yield different regression fits for different
meteorological conditions, unlike a simple regression analysis.
The RT program constructs an unconventional type of tree
structure, whose leaves contain linear models instead of
the discrete classes of a decision tree (DT). A DT assigns
predictions to discrete classes, whereas an RT predicts
continuous values.
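
To make the idea of linear models at the leaves concrete, the sketch below evaluates a tiny hand-built tree. The split variable, the threshold, and all coefficients are hypothetical and chosen only for illustration; they are not the tree derived in this study.

```python
def predict(node, x):
    """Walk a model tree and evaluate the linear model at the leaf reached.

    node: dict with either a "split" (feature, threshold) plus "left" and
    "right" children, or a "model" (intercept, {feature: coefficient}).
    x: dict mapping feature names to values for one sample.
    """
    while "split" in node:
        feature, threshold = node["split"]
        node = node["left"] if x[feature] <= threshold else node["right"]
    intercept, coefs = node["model"]
    return intercept + sum(c * x[name] for name, c in coefs.items())


# Hypothetical two-leaf tree: each leaf holds its own regression fit.
tree = {
    "split": ("ndvi", 0.1),
    "left": {"model": (0.9, {"ch2": -2.0})},    # wet regime
    "right": {"model": (0.2, {"ndvi": -0.2})},  # vegetated regime
}

print(predict(tree, {"ndvi": 0.0, "ch2": 0.05}))  # uses the left leaf
print(predict(tree, {"ndvi": 0.5, "ch2": 0.30}))  # uses the right leaf
```

Each leaf applies a different regression fit, so the overall prediction is piecewise linear in the inputs rather than a set of discrete classes.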
RT integrates DT with traditional regression
analysis. Like the DT algorithm, the RT algorithm can integrate all
possible candidate predictors, such as the MODIS channel 2
reflectance (CH2) and channel 1 reflectance (CH1), the
reflectance ratio (CH2/CH1) and difference (CH2-CH1)
between MODIS channel 2 and channel 1, NDVI, and the Normalized
Difference Water Index (NDWI). At the same time, it can
determine continuous values, in this case water fraction, and
give accuracy estimates. The NDWI [45], a satellite-derived
index from the Near-Infrared (NIR) and Short Wave Infrared
(SWIR) channels, is also included as one input attribute.
According to Gao [45], NDWI is a good indicator of
vegetation liquid water content and is less sensitive to
atmospheric scattering effects than NDVI. The MODIS 8-day
composite data at 500-m resolution are aggregated to the
1/6 degree resolution of the surface water percentage map.
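
The candidate predictors listed above can be assembled per grid cell as in the sketch below. It assumes NDWI is formed from the 0.86 µm and 1.24 µm reflectances (MODIS bands 2 and 5), following Gao's formulation; that band pairing, the function names, and the sample values are assumptions for illustration.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def predictors(ch1, ch2, ch5):
    """Build the candidate predictor set for one cell.

    ch1: red reflectance (0.648 um), ch2: NIR reflectance (0.858 um),
    ch5: reflectance at 1.24 um (assumed here to be the NDWI partner band).
    """
    return {
        "CH1": ch1,
        "CH2": ch2,
        "ratio": ch2 / ch1 if ch1 > 0 else float("inf"),
        "difference": ch2 - ch1,
        "NDVI": normalized_difference(ch2, ch1),
        "NDWI": normalized_difference(ch2, ch5),
    }


# Hypothetical vegetated cell.
for name, value in predictors(ch1=0.05, ch2=0.30, ch5=0.25).items():
    print(name, round(value, 3))
```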
In this study, the M5P (Wang and Witten, 1997), a
reconstruction of Quinlan's M5 algorithm (Quinlan, 1992) for
inducing trees of regression models, is used to derive water
fraction from MODIS observations. The M5P combines a
conventional decision tree with the possibility of linear
regression functions at the nodes. Techniques devised by
Breiman et al. (1984) for their CART (Classification and
Regression Trees) system are adapted in order to deal with
enumerated attributes and missing values. M5P thus reimplements
Quinlan's M5 algorithm with modifications drawn from CART
and seems to outperform the original.
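
As background, M5-style tree induction is commonly described as choosing the split that maximizes the standard deviation reduction (SDR) of the target values. The sketch below illustrates that criterion for a single numeric attribute. It is a simplified illustration under that assumption, not the M5P implementation used in this study; the function name and sample data are hypothetical.

```python
from statistics import pstdev


def best_split(xs, ys):
    """Return (threshold, sdr) of the best binary split on one attribute.

    SDR = sd(all) - sum(|subset| / |all| * sd(subset)) over both subsets.
    Returns (None, 0.0) when no split reduces the standard deviation.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_sd = pstdev(ys)
    best = (None, 0.0)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical attribute values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * pstdev(left) + len(right) * pstdev(right)) / n
        sdr = total_sd - weighted
        if sdr > best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, sdr)
    return best


# Hypothetical data: water fraction jumps once the attribute exceeds 2.5.
print(best_split([1, 2, 3, 4], [0.0, 0.0, 1.0, 1.0]))
```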