International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B8, 2012
XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia
Figure 1: The state of Maharashtra as obtained from two DMSP-OLS images of 2001. (a) Maharashtra shown using the stable lights dataset. (b) Maharashtra
shown using the radiance calibrated dataset (showing brightness values).
percentage of households with cars, jeeps and vans; percentage
of households with television; percentage of permanent census
houses and percentage of households using electricity as power
source. Maps are produced for the villages in the district of
Pune in the state of Maharashtra. The use of multi-scale data led
to the consideration of issues arising from MAUP and the
ecological fallacy, which are also described in this paper.
2. METHOD
The study uses two types of DMSP-OLS data products: the
stable light data set and the brightness data. The stable light
data was part of the latest average DN data series and was
obtained from the National Geophysical Data Centre (NGDC)
website (National Geophysical Data Centre 2006). In this
image, the data values range from 1 to 63. Background noise in
the data is represented by zero, while areas with no cloud-free
observations are denoted by the value 255. The second
DMSP-OLS image used in the study is the global composite of
brightness data for 2000-2001. It was prepared by NGDC from fixed
gain images taken from satellites F12 to F15.
However, this data contained brightness values ranging from 0
to 653 and was not calibrated to radiance (Tuttle 2008).
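The coding of the stable lights image described above can be handled with a simple mask before any statistics are computed. The following is a minimal sketch, assuming the image has already been read into a NumPy array; the function name `valid_lights` and the `drop_noise` option are hypothetical, and whether zero-valued background pixels should be retained in later statistics is not stated here and is left as a choice.

```python
import numpy as np

NOISE = 0      # background noise in the stable lights product
NO_DATA = 255  # areas with no cloud-free observations

def valid_lights(dn, drop_noise=False):
    """Return a float copy of the image with no-data pixels set to NaN.

    drop_noise=True also masks background (zero) pixels; keeping them
    is the default here, which is an assumption, not the authors' rule.
    """
    out = np.asarray(dn, dtype=float).copy()
    out[out == NO_DATA] = np.nan
    if drop_noise:
        out[out == NOISE] = np.nan
    return out

# Toy 2 x 3 image: lit pixels (1 to 63), background (0) and no data (255).
sample = np.array([[0, 12, 63],
                   [255, 5, 0]])
masked = valid_lights(sample)
```

Using NaN for the no-data pixels lets NaN-aware functions such as `np.nanmean` skip them without further bookkeeping.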
The mean and standard deviation of stable lights and brightness
were calculated for 32 districts and all the taluks in the state of
Maharashtra. There are 35 districts in the state of Maharashtra.
Of these, the districts of Mumbai, Greater Mumbai and Thane
were not included for sample selection as they had very high
values of both mean and standard deviation of brightness and
stable lights compared to others. From the remaining 32
districts, 24 were randomly selected and 8 districts were
withheld for model validation. Although the census accounts for
354 taluks, data was available for only 286 taluks. As a result
the analyses were conducted on the available taluks. 196 taluks
were randomly sampled for model development and the
remaining 90 taluks were withheld for model validation.
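The per-unit statistics and the random hold-out described above can be illustrated as follows. This is a sketch under stated assumptions: each pixel is taken to carry the identifier of the district or taluk it falls in (a hypothetical `zones` raster), the population standard deviation is used because the choice of estimator is not specified, and the function names and the random seed are illustrative only.

```python
import numpy as np

def zonal_stats(values, zones):
    """Mean and standard deviation of pixel values per zone, ignoring NaN."""
    values = np.asarray(values, dtype=float).ravel()
    zones = np.asarray(zones).ravel()
    stats = {}
    for zone_id in np.unique(zones):
        pixels = values[zones == zone_id]
        pixels = pixels[~np.isnan(pixels)]
        if pixels.size:
            # np.std defaults to the population form (ddof=0): an assumption.
            stats[int(zone_id)] = (float(pixels.mean()), float(pixels.std()))
    return stats

def split_units(unit_ids, n_train, seed=0):
    """Randomly split unit ids into a development set and a withheld set."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unit_ids)
    return sorted(shuffled[:n_train].tolist()), sorted(shuffled[n_train:].tolist())

# Toy raster with two zones; NaN marks a masked pixel.
lights = np.array([[10.0, 20.0, 5.0],
                   [30.0, np.nan, 7.0]])
zones = np.array([[1, 1, 2],
                  [1, 1, 2]])
stats = zonal_stats(lights, zones)

# 32 districts: 24 for model development, 8 withheld for validation.
train, held_out = split_units(list(range(32)), n_train=24)
```

The same `split_units` call with 286 identifiers and `n_train=196` would reproduce the taluk split of 196 and 90.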
Five demographic metrics and four socio-economic metrics
were chosen from the census. Ten metrics were shortlisted after
a number of statistical tests and are listed in Table 1.
From the ten census metrics selected for this study, only three
variables were available from the Indian census at the scale of
a village. They are number of households per square kilometre,
total population per square kilometre and total workers per
square kilometre. The models proposed at the district and taluk
levels were used to predict and map, for the villages, the metrics
that are unavailable from traditional census statistics. These metrics
include: number of female literates per square kilometre;
percentage of households with cars, jeeps and vans; percentage
of households with television; percentage of permanent census
houses and percentage of households using electricity as power
source. Maps were produced for the villages in the district of
Pune in the state of Maharashtra. The district of Pune has the
million-plus city of Pune as its district headquarters, along with
some very rural areas that appear dark in the satellite image.
3. RESULTS AND DISCUSSION
3.1 Models proposed at district and taluk levels
Linear regression models and multiple regression models were
proposed. The selected census metrics were chosen as the
dependent variables and mean and standard deviation of
brightness and stable lights obtained from the images were used
as the independent variables. The models were validated using
the withheld districts and taluks. The models that best
predicted the census metrics (within a 25% error margin) for the
highest number of districts and taluks were identified as the
most appropriate models (Roychowdhury et al. 2011b;
Roychowdhury et al. 2010).
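The fit-then-validate procedure described above can be sketched as follows. The example fits an ordinary least squares model on development units and reports the share of withheld units predicted within a relative error margin. Reading the margin as 25% of the census value is an assumption, as are the function names and the synthetic numbers, which are not data from the study.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients to one or more predictor columns."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    return coef[0] + X @ coef[1:]

def share_within_margin(y_true, y_pred, margin=0.25):
    """Fraction of units whose relative error is at most `margin`.

    Assumes every census value in y_true is non-zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_err = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel_err <= margin))

# Synthetic development units: mean light value vs. a census metric.
x_dev = np.array([1.0, 2.0, 3.0, 4.0])
y_dev = np.array([3.0, 5.0, 7.0, 9.0])
coef = fit_ols(x_dev, y_dev)

# Synthetic withheld units used only for validation.
x_val = np.array([5.0, 6.0])
y_val = np.array([11.0, 20.0])
share = share_within_margin(y_val, predict(coef, x_val))
```

Passing a two-column array (for example mean and standard deviation of brightness) to `fit_ols` gives the multiple regression case with the same code.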
The selected census metrics showed positive correlations with
both the mean and standard deviation of brightness and stable
lights. The adjusted r² of these models ranged from 0.8 to 0.97
at the 95% confidence level at the district level. The correlation
coefficients (r²) achieved at the 95% confidence level for all the
census metrics ranged from 0.2 to 0.8 for the taluks. The
adjusted r² values of the models are presented in detail in
previous works by the authors (Roychowdhury et al. 2011b;
Roychowdhury et al. 2010).
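For reference, the adjusted r² reported above penalises r² for the number of predictors in the model. The following sketch uses the standard textbook definition; the function name and the sample values are illustrative and are not results from the study.

```python
import numpy as np

def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted r^2 = 1 - (1 - r^2) * (n - 1) / (n - p - 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1))

# Illustrative values only: five units, one predictor.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
adj = adjusted_r2(observed, fitted, n_predictors=1)
```

With few units, as in the 24 development districts, the adjustment matters more for the multiple regression models than for the single-predictor ones.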