GROUND OBJECT RECOGNITION USING COMBINED HIGH RESOLUTION
AIRBORNE IMAGES AND DSM
Qingming Zhan *, Yubin Liang, Chi Wei, Yinghui Xiao
School of Urban Design, Wuhan University, Wuhan 430072, China
Research Center for Digital City, Wuhan University, Wuhan 430072, China
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
qmzhan@whu.edu.cn; lyb.whu@gmail.com; 544496077@qq.com; yhxiaoitc@126.com
* Qingming Zhan, Professor and Associate Dean of the School of Urban Design, Deputy Director of the Research Center for Digital City, Wuhan University
Commission III, WG III/4
KEY WORDS: Ground Object Recognition, High Resolution, LIDAR, DSM, Sparse Representation, L1 Minimization
ABSTRACT:
The research is carried out on the Vaihingen dataset acquired from the ISPRS Test Project on Urban Classification and 3D Building Reconstruction. Four types of ground objects are extracted: buildings, trees, vegetation (grass and low bushes) and road. Spectral information is used to classify the images, and a refinement process is then carried out using the DSM. A novel method based on sparse representation is introduced to extract ground objects from the airborne images. For each pixel we extract its spectral vector and solve a Basis Pursuit problem using l1 minimization. The pixel is assigned the class of the column of the observation matrix that corresponds to the largest positive component of the solution vector. A refinement procedure based on the elevation histogram is then carried out to correct the misclassification of trees/vegetation and buildings/road in the coarse classification.
1. INTRODUCTION
In recent years LiDAR (Light Detection And Ranging) has emerged as a new technology which provides valuable data in various forms and scales for mapping and monitoring land cover features. Its use has increased dramatically due to the availability of high-density LiDAR data as well as high spatial/spectral resolution airborne imagery. However, the data from these different sensors have their own characteristics. LiDAR data directly provide spatial information from which a highly accurate DSM of the scanned objects can be derived. On the other hand, high resolution airborne imagery offers very detailed spectral/textural information about ground objects. Although aerial photography has been used as a mapping tool for a century, the fusion of aerial photography and LiDAR data has only become possible in the past few years due to advances in sensor design and data acquisition/processing techniques (Baltsavias, 1999). Combining these two kinds of complementary datasets is therefore quite promising for improving land cover mapping (Tao and Yasuoka, 2002).
There have been several attempts to fuse LiDAR and high-resolution imagery for land cover mapping, and very promising results have been shown in recent years. Haala and Brenner (1999) combined a LiDAR derived DSM with three-band CIR aerial images, applying unsupervised classification based on the ISODATA (Iterative Self-Organizing Data Analysis Technique) algorithm to the normalized Digital Surface Model (nDSM) and the CIR image. In their experiment, the nDSM was used to classify objects that have different distribution patterns in the elevation direction, while the near-infrared band of the aerial imagery greatly facilitated the separation of trees from buildings, which the low-resolution LiDAR data alone could not achieve. Schenk and Csatho (2002)
exploited the complementary properties of LiDAR and aerial
images to extract semantically meaningful information.
Rottensteiner et al. (2005) used a LiDAR derived DTM and the
Normalised Difference Vegetation Index (NDVI) from
multispectral images to detect buildings in densely built-up
urban areas. The rule-based classification scheme applied
Dempster-Shafer theory to delineate building regions,
combining NDVI and the average relative heights to separate
buildings from other objects. Ali et al. (2005) applied an
automated object-level technique based on a hierarchical decision
tree to fuse high-resolution imagery and LiDAR data. Sohn and
Dowman (2007) presented an approach for the automatic extraction of building footprints from a combination of multispectral imagery and airborne laser scanning data. The method utilized a divide-merge scheme to obtain the recognized building outlines.
A comparison of pixel- and object-level data fusion and
subsequent classification of LiDAR and high-resolution
imagery was carried out by Ali et al. (2009). The results showed that fusion of the color imagery and the DSM generally gave better results than classification of the color imagery alone.
The underlying assumption of the fusion of multisource data is that classification accuracy should improve as more features are incorporated (Tso and Mather, 2001). Image fusion can be performed at the pixel, object or feature, and decision levels (Pohl and van Genderen, 1998; Schistad-Solberg et al., 1994). Pixel-level fusion focuses on merging physical parameters derived from multisource data; it is very sensitive to geo-referencing and pixel spacing, and topological information is often not used in the fusion and subsequent procedures. Object-level image fusion methods usually segment the multisource data into meaningful objects, each of which consists of many data units.
This kind of fusion makes use of both spectral and spatial information: the segmented objects are classified using fuzzy logic or other pattern recognition techniques.
Nowadays LiDAR systems can record multiple returns of each pulse, and multispectral airborne images are often acquired together with the LiDAR data. Their fusion creates new opportunities for land cover mapping; although promising results have been achieved, recognizing ground objects from such measurements is still challenging.
The work presented in this paper combines high resolution airborne images with a LiDAR derived DSM to recognize ground objects. The main steps of the approach are described in the following sections, with emphasis on the sparse representation based classification and the DSM based refinement.
2. METHODOLOGY
2.1 Data Preprocessing
Orientation parameters of the aerial images are used first to guarantee that the images and the DSM share the same spatial reference. The DSM is resampled to a file with the same resolution as the images, and the data are mosaicked over the test area before this preprocessing task is completed.
We combine the images and the DSM over the regions of interest (ROIs); Area 3 of the Vaihingen dataset is used in our experiments. The resulting four 'bands' (the three spectral bands plus the elevation) are contrast-enhanced before classification, as sketched below.
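The band stacking and contrast enhancement can be illustrated with a minimal sketch. It assumes the orthoimage and the DSM are already co-registered rasters of the same size; the file names, the choice of the rasterio library and the 2-98 percentile stretch are illustrative assumptions, not details taken from the original workflow.

# Minimal sketch: stack co-registered image bands and DSM into one
# multi-band cube and contrast-enhance each band with a percentile stretch.
import numpy as np
import rasterio  # assumed raster I/O library; any equivalent would do

def percentile_stretch(band, low=2, high=98):
    """Linearly stretch a band so the given percentiles map to [0, 1]."""
    lo, hi = np.percentile(band, [low, high])
    return np.clip((band - lo) / (hi - lo + 1e-12), 0.0, 1.0)

with rasterio.open("area3_ortho.tif") as src:   # hypothetical file name
    bands = src.read().astype(np.float64)       # shape: (n_bands, rows, cols)
with rasterio.open("area3_dsm.tif") as src:     # hypothetical file name
    dsm = src.read(1).astype(np.float64)        # shape: (rows, cols)

# Stack the spectral bands and the elevation into a single four-band cube,
# then stretch every band independently.
cube = np.concatenate([bands, dsm[np.newaxis]], axis=0)
cube = np.stack([percentile_stretch(b) for b in cube])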
2.2 Ground Object Extraction
Buildings, trees, vegetation (grass and low bushes) and road are the four types of ground objects considered in the extraction. Since these ground objects have distinctive spectral characteristics, a pixel-wise spectral classification is carried out first.
Because spectral information alone is not sufficient, a refinement process based on the DSM is carried out afterwards. The sparse representation technique we use to classify the pixels originates from the seminal work on sparse recovery and compressed sensing by Candès and Tao and by Donoho and Elad. The key idea is that the spectral vector (the R, G and B values) of a pixel can be sparsely represented in terms of training samples of the ground object classes. The resulting Basis Pursuit problem is solved by linear programming.
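To make the classification rule concrete, the following is a minimal sketch under stated assumptions: the observation matrix A holds unit-norm training spectra as columns with known class labels, and Basis Pursuit, min ||x||_1 subject to Ax = y, is rewritten as a linear program via the split x = u - v with u, v >= 0 and solved with SciPy's linprog. The helper name classify_pixel and the toy data are illustrative, not from the original paper.

# Minimal sketch: pixel classification by Basis Pursuit (l1 minimization).
import numpy as np
from scipy.optimize import linprog

def classify_pixel(A, labels, y):
    """Solve min ||x||_1 s.t. Ax = y as a linear program and return the
    label of the column with the largest positive coefficient."""
    m, n = A.shape
    c = np.ones(2 * n)                  # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([A, -A])           # equality constraint: A @ (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    x = res.x[:n] - res.x[n:]           # recover x = u - v
    return labels[int(np.argmax(x))]    # class of largest positive component

# Toy usage: 3 spectral bands, 6 training pixels from 2 classes.
rng = np.random.default_rng(0)
A = rng.random((3, 6))
A /= np.linalg.norm(A, axis=0)          # unit-norm columns (common convention)
labels = np.array(["building"] * 3 + ["tree"] * 3)
y = A[:, 4] + 0.01 * rng.standard_normal(3)   # a noisy "tree" spectrum
print(classify_pixel(A, labels, y))           # expected: tree

Because the number of spectral bands is far smaller than the number of training samples, the system Ax = y is underdetermined and the l1 objective favours a sparse solution concentrated on columns of the correct class.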