International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B7, 2012 
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia 
RANDOM FORESTS-BASED FEATURE SELECTION 
FOR LAND-USE CLASSIFICATION USING LIDAR DATA AND ORTHOIMAGERY 
Haiyan Guan*, Jun Yu, Jonathan Li*** , Lun Luo* 
a GeoSTARS Lab, Department of Geography and Environmental Management, University of Waterloo, 200 University 
Ave. West, Waterloo, ON, Canada N2L 3G1 
b GeoSTARS Group, School of Information Science and Engineering, Xiamen University, 422 Siming Road South, 
Xiamen, Fujian, China 361005 
c China Transport Telecommunication & Information Center, Beijing, China 
KEY WORDS: Lidar, imagery, Random Forests, Classification, Feature selection 
ABSTRACT: 
The development of lidar systems, especially those incorporating high-resolution camera components, has shown great potential for 
urban classification. However, automatically selecting the best features for land-use classification remains challenging. Random 
Forests, a recently developed machine learning algorithm, is receiving considerable attention in the fields of image classification and 
pattern recognition. In particular, it provides a measure of variable importance. This study therefore explores the performance of 
Random Forests-based feature selection for urban areas. First, we extract features from lidar data, including height-based, 
intensity-based GLCM measures; spectral features are obtained from imagery, such as the Red, Green and Blue bands 
and GLCM-based measures. Finally, Random Forests is used to automatically select the optimal and uncorrelated features for 
land-use classification. Lidar data and aerial imagery at 0.5-meter resolution are used to assess the feature selection performance of 
Random Forests in the study area located in Mannheim, Germany. The results demonstrate that Random Forests-based 
feature selection can improve classification performance. 
1. INTRODUCTION 
Urban land cover classification has always been critical due to 
its ability to link many elements of human and physical 
environments. Timely, accurate, and detailed knowledge of 
urban land cover derived from remote sensing data 
is increasingly required by a wide variety of communities. 
This surge of interest has been predominantly driven by 
recent innovations in data, technologies, and theories in urban 
remote sensing. Over the past decades, advances in 
lidar technologies have provided highly accurate, dense 
3-dimensional point clouds for land-use classification in 
combination with imagery. Because lidar data consist of 
unstructured, irregular 3-D points and lack spectral information, 
classification confusion often arises between man-made 
and natural objects. On the other hand, it is difficult to 
obtain land-use information directly from remotely sensed data alone, 
owing to the complexity of landscapes, spectrally identical 
objects, and the abundance of spatial and spectral information. 
Therefore, integrating lidar point clouds with imagery is becoming a 
preferred means for land-use classification. 
Although a plethora of features can be extracted from both 
lidar point clouds and optical imagery, there is no rule or model 
for automatically and objectively selecting proper features 
for the desired classification results. The majority of existing 
research focuses on the development of 
classification methods; little attention is paid to feature 
selection using lidar data and imagery. Subjective selection 
of classification features makes the classification results 
unstable. To this end, Random Forests-based feature selection is 
proposed in this study. 
  
* junli@uwaterloo.ca, phone 1 519-888-4567, ext. 34504 
Random Forests belongs to the family of ensemble classifiers, in 
which multiple classifiers are trained and their results combined 
through a voting process, and can be considered an improved 
version of bagging, a widely used ensemble classifier (Breiman, 
1996). Random Forests is well known for its notable computational 
efficiency. In the field of remote sensing, Random Forests has 
achieved promising classification accuracy for 
hyperspectral (Wang et al., 2009), multispectral (Stumpf and Kerle, 
2011), and multisource data (Gislason et al., 2006). Owing to the 
classification complexity of multisource data, commonly used 
parametric classification methods are inappropriate. Random 
Forests, as a nonparametric classification algorithm, should be of 
great interest for multisource data because it provides an estimate 
of the importance of each individual variable. Moreover, several 
studies have shown the advantages of Random Forests in land cover 
classification; the results indicate that the selected features agree 
with existing physiological knowledge. However, few studies focus on 
urban areas using the fusion of lidar data and aerial images. To this 
end, Random Forests (RF) is applied to feature selection in this study. 
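The voting and variable-importance ideas described above can be illustrated with a minimal sketch. It is a deliberate simplification and not the implementation used in this study: it bags one-level decision stumps rather than growing full trees, and it considers every feature at each split, whereas Random Forests draws a random feature subset. Importance is estimated as the mean drop in out-of-bag accuracy after one feature is shuffled. The toy data, the feature roles (column 0 as a height-like feature, column 1 as an uninformative texture-like feature), and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Find the single-feature rule 'predict 1 if x[f] > t' with fewest errors."""
    best = None  # (errors, feature index, threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            errors = sum((row[f] > t) != bool(label) for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best[1], best[2]


def stump_predict(stump, row):
    f, t = stump
    return int(row[f] > t)


def bagged_importance(X, y, n_trees=25, seed=0):
    """Bag stumps; score each feature by mean out-of-bag accuracy drop."""
    rng = random.Random(seed)
    n, n_feat = len(X), len(X[0])
    stumps, drops = [], [0.0] * n_feat
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]       # bootstrap sample
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]   # out-of-bag samples
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        stumps.append(stump)
        if not oob:
            continue
        base = sum(stump_predict(stump, X[i]) == y[i] for i in oob) / len(oob)
        for f in range(n_feat):
            shuffled = [X[i][f] for i in oob]
            rng.shuffle(shuffled)                        # break feature/label link
            hits = 0
            for i, value in zip(oob, shuffled):
                row = list(X[i])
                row[f] = value
                hits += stump_predict(stump, row) == y[i]
            drops[f] += (base - hits / len(oob)) / n_trees
    return stumps, drops


def vote(stumps, row):
    """Majority vote over all bagged stumps."""
    return Counter(stump_predict(s, row) for s in stumps).most_common(1)[0][0]


# Hypothetical toy data: column 0 separates the classes, column 1 does not.
X = [[float(i), float((i * 7) % 11)] for i in range(40)]
y = [int(i >= 20) for i in range(40)]

stumps, importance = bagged_importance(X, y)
print("importance per feature:", importance)
print("vote for a tall sample:", vote(stumps, [35.0, 3.0]))
```

In this toy setting the uninformative column is never chosen by a stump, so shuffling it leaves the out-of-bag accuracy unchanged, while shuffling the informative column lowers it. Ranking features by such scores is the basis of the feature selection considered in this paper.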
This paper is organized as follows. Section 2 describes the 
basic principles of Random Forests, the lidar data and calibrated 
imagery used in the paper, and the features selected from the lidar 
data and imagery, respectively. Section 3 then discusses variable 
importance, one of the Random Forests measures, for all 
features, the Random Forests-based feature selection, and the 
corresponding classification results obtained with the Maximum 
Likelihood Classifier (MLC). Finally, Section 4 concludes the 
paper.
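The workflow outlined above, ranking features by importance and then classifying with the selected subset, might be sketched as follows. This is an illustrative approximation only: the importance scores, feature names, class names, and pixel values are hypothetical, and the classifier is a simplified Gaussian maximum likelihood rule that assumes equal class priors and independent features (a diagonal covariance), whereas an MLC normally uses a full covariance matrix per class.

```python
import math


def select_top_k(importance, k):
    """Return the k feature names with the largest importance scores."""
    return sorted(importance, key=importance.get, reverse=True)[:k]


def fit_gaussian_mlc(samples, labels):
    """Estimate per-class mean and variance of each feature."""
    model = {}
    for c in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor keeps the variance strictly positive.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                     for col, m in zip(zip(*rows), means)]
        model[c] = (means, variances)
    return model


def classify(model, x):
    """Assign x to the class with the highest Gaussian log-likelihood."""
    def log_lik(means, variances):
        return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                          for xi, m, v in zip(x, means, variances))
    return max(model, key=lambda c: log_lik(*model[c]))


# Hypothetical importance scores, e.g. produced by a Random Forests run.
importance = {"height": 0.40, "intensity": 0.05, "red": 0.20}
selected = select_top_k(importance, 2)

# Hypothetical training pixels with their land-use labels.
pixels = [
    ({"height": 8.0, "intensity": 30.0, "red": 120.0}, "building"),
    ({"height": 9.0, "intensity": 35.0, "red": 110.0}, "building"),
    ({"height": 10.0, "intensity": 28.0, "red": 130.0}, "building"),
    ({"height": 0.1, "intensity": 31.0, "red": 60.0}, "grass"),
    ({"height": 0.3, "intensity": 33.0, "red": 80.0}, "grass"),
    ({"height": 0.2, "intensity": 29.0, "red": 70.0}, "grass"),
]
samples = [[p[name] for name in selected] for p, _ in pixels]
labels = [label for _, label in pixels]

model = fit_gaussian_mlc(samples, labels)
print(selected, classify(model, [9.0, 115.0]))
```

Only the selected columns are passed to the classifier, so the low-ranked feature has no influence on the result.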
	        