vegetation from man-made objects. Many proposed solutions 
use image edges as matching features. The problem is that 
these edges are usually detected without reference to the 
objects themselves. In addition, satellite images often contain complex natural scenes at a resolution of at most a few meters. For these reasons, edges do not provide the discrimination performance needed. Our approach, therefore, is to use the NDVI for feature extraction. The NDVI allows us to identify the most effective matching features, and the index can reliably detect man-made objects in satellite images.
Our experiments used satellite images acquired by a commercial satellite. With the rapid development of remote sensing, such images are now easy to obtain. Current satellite imaging systems can acquire multi-spectral data over wide areas in a single pass at a resolution of a few meters. However, the positional accuracy of the original alignment is on the order of a 1/25,000 scale map, which usually corresponds to about 10 meters.
Reference and feature pixels are used for matching. The former 
are calculated by projecting map objects onto satellite images 
with their coordinates. The latter are calculated from satellite 
images as follows. The NDVI of an N x M satellite image is given by the discrete function I(x, y), x ∈ Ix = {1, 2, 3, ..., N}, y ∈ Iy = {1, 2, 3, ..., M}, where I(x, y) is the index value. I(x, y) is calculated using the following arithmetic operation:

I(x, y) = (IR(x, y) - R(x, y)) / (IR(x, y) + R(x, y))    (1)
IR(x, y) and R(x, y) are the reflectance values in the near infrared channel and the visible red channel, respectively. I(x, y) is normally used to identify vegetation; therefore, binarizing I(x, y) leads to an image containing only vegetation. We use I(x, y) in the reverse sense, to identify man-made objects.
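As a concrete illustration of Eq. 1, the index can be computed per pixel from the two reflectance channels. The following Python sketch is ours rather than the authors' implementation; the array names and the zero-denominator guard are assumptions.

```python
import numpy as np

def ndvi(ir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Compute I(x, y) = (IR - R) / (IR + R) as in Eq. 1.

    ir, red: reflectance arrays of shape (M, N), e.g. the near infrared and
    visible red channels of a multi-spectral scene (names are assumptions).
    """
    ir = ir.astype(np.float64)
    red = red.astype(np.float64)
    denom = ir + red
    index = np.zeros_like(denom)
    valid = denom != 0          # guard against division by zero (our assumption)
    index[valid] = (ir[valid] - red[valid]) / denom[valid]
    return index
```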
The effective threshold for binarization needs to vary within the image, because the index distribution is uneven. The N x M satellite image is therefore divided into windows of size n x m, so that the image contains (N / n) x (M / m) windows. The window size (n x m) depends on the characteristics of the image. I(x, y) is binarized using a separate threshold for each window. Each threshold is calculated by average filtering over I(x0, y0), where x0 ∈ Ix and y0 ∈ Iy lie within the window region. A pixel (x, y) is identified as a matching feature if I(x, y) is below the adaptive threshold. This binarization is performed for all windows. In addition, isolated pixels are eliminated as features to suppress the obvious errors caused by shadow effects and spectral deviations. The remaining pixels are identified as feature pixels.
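A minimal sketch of this window-wise binarization and isolated-pixel removal is given below, assuming the NDVI is held in a NumPy array and that n and m divide the image dimensions; the use of scipy.ndimage to count neighbours is our own choice rather than the paper's.

```python
import numpy as np
from scipy import ndimage

def extract_feature_pixels(index: np.ndarray, n: int = 20, m: int = 20) -> np.ndarray:
    """Binarize I(x, y) with a per-window mean threshold and drop isolated pixels.

    index: NDVI array; (n, m): window size (the experiments in section 4 use 20 x 20).
    Returns a boolean mask of candidate man-made (feature) pixels.
    """
    rows, cols = index.shape
    features = np.zeros_like(index, dtype=bool)
    for r in range(0, rows, n):
        for c in range(0, cols, m):
            window = index[r:r + n, c:c + m]
            threshold = window.mean()                        # average filtering in the window
            features[r:r + n, c:c + m] = window < threshold  # low NDVI -> man-made candidate
    # Eliminate isolated pixels (no 8-connected neighbour) to suppress errors
    # caused by shadow effects and spectral deviations.
    as_int = features.astype(int)
    neighbours = ndimage.convolve(as_int, np.ones((3, 3), dtype=int), mode="constant") - as_int
    return features & (neighbours > 0)
```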
3. MISMATCH DETERMINATION 
This section describes our approach to determine the mismatch 
between satellite images and vector maps. The general idea is 
to determine the most reliable correspondence between 
projected map objects and real objects shown in satellite 
images by voting. Voting is based on the Generalized Hough Transform (GHT), which is often used to estimate geometric transformation parameters. The GHT can be used when a known figure (i.e., a template) exists in an arbitrary background, and it can account for unknown parallel displacement, rotation, and expansion (Duda & Hart, 1972). Here, it is assumed that only a position shift (i.e., a parallel displacement) needs to be considered, because GIS applications must use satellite images and vector maps as they are. The parallel displacements in the x-direction and y-direction are estimated separately by a one-dimensional GHT.
The displacements are determined as follows. Reference and 
feature pixels are extracted as mentioned above. The scan area is assumed to be larger than the area occupied by the map objects projected onto the satellite image. The differences between the positions of all these pixels on each scan line are calculated, and each calculated value is taken as one vote. The peak corresponds to the largest number of votes obtained over the scan area. Displacement candidates are identified as those with a high voting frequency over all scan lines, where a high voting frequency means that the voting score exceeds a threshold.
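The voting along one axis can be sketched as follows; this is an illustrative Python reading of the one-dimensional GHT described above, and the scan range, threshold, and data layout (one coordinate array per scan line) are assumptions rather than values from the paper.

```python
import numpy as np

def displacement_candidates(feature_pos, reference_pos, max_shift=30, vote_threshold=50):
    """Accumulate 1-D GHT votes for the shift along one axis.

    feature_pos, reference_pos: lists with one 1-D coordinate array per scan line
    (x-coordinates when estimating the x-shift, y-coordinates for the y-shift).
    Returns the offsets whose total vote count exceeds vote_threshold, plus the votes.
    """
    votes = np.zeros(2 * max_shift + 1, dtype=int)
    for feat, ref in zip(feature_pos, reference_pos):
        # Every (feature, reference) position difference on this scan line is one vote.
        diffs = (feat[:, None] - ref[None, :]).ravel()
        in_range = diffs[(diffs >= -max_shift) & (diffs <= max_shift)]
        np.add.at(votes, in_range + max_shift, 1)
    offsets = np.arange(-max_shift, max_shift + 1)
    return offsets[votes > vote_threshold], votes
```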
The final displacement is selected from among the candidates. 
The selection should pay attention to the consistency of pixel 
matching after displacement. Displacement candidates are evaluated using the mean square error of all differences between corresponding pairs of feature pixels and displaced reference pixels. Here, it is assumed that the displacement is correct if the feature pixels are sufficiently consistent with the reference pixels. The mean square error is calculated for each candidate, and the selected displacement is the one with the lowest mean square error. In this way, the displacements in the x- and y-directions are estimated separately.
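A sketch of this selection step, continuing the previous example: for each candidate shift the mean square error between feature pixels and displaced reference pixels is computed, and the shift with the lowest error is kept. Pairing each feature pixel with its nearest displaced reference pixel is our simplifying assumption; the paper only states that corresponding pairs are compared.

```python
import numpy as np

def select_displacement(candidates, feature_pos, reference_pos):
    """Choose the candidate shift with the lowest mean square error along one axis.

    candidates: integer shifts returned by the voting step.
    feature_pos, reference_pos: 1-D arrays of feature and reference pixel
    coordinates along the axis being estimated.
    """
    best_shift, best_mse = None, np.inf
    for shift in candidates:
        displaced = reference_pos + shift
        # Pair each feature pixel with its nearest displaced reference pixel
        # (an assumption) and measure the remaining squared differences.
        nearest = displaced[np.abs(feature_pos[:, None] - displaced[None, :]).argmin(axis=1)]
        mse = np.mean((feature_pos - nearest) ** 2)
        if mse < best_mse:
            best_shift, best_mse = shift, mse
    return best_shift
```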
4. EXPERIMENTAL RESULTS 
We performed experiments to verify our approach; actual 
satellite images and corresponding vector maps were used. 
Fig. 1 shows one part of a typical satellite image (RGB format) 
as acquired by IKONOS (Space Imaging). These images provide multi-spectral data (red, green, blue, and near infrared channels) with 11-bit radiometric depth and 4-meter (per pixel) resolution.
Figure 1. An example of a satellite image (RGB color)
Commercial 1/2,500 scale maps of the same test area were 
used. The maps include topographical data identifying several 
types of man-made objects, such as buildings. An example is 
shown in Fig. 2. This figure contains several layers representing buildings, houses, and roads (man-made objects), which are used to extract the reference pixels.
Following the procedures described in section 2, feature pixels 
regarded as man-made objects were extracted from the satellite 
image. Fig. 3 shows the red channel of the satellite image in 
Fig. 1. Fig. 4 shows the near infrared image. The NDVI I(x, y) was calculated by Eq. 1 using the spectral reflectance values. The result is shown in Fig. 5, where the I(x, y) values have been converted into gray-scale values. The binarization of I(x, y) and the elimination of isolated pixels were performed as described in section 2, using a 20 pixel x 20 pixel window. The resulting extracted features are shown in Fig. 6.
The results in Fig. 6 demonstrate that the feature pixels were extracted comparatively well, and the extracted pixels were quite accurate; the main error types were excessive extraction and extraction mistakes. The results show that the NDVI approach is stable and reliable enough for the classification of man-