Technical Commission VII (B7)

  
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B7, 2012 
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia 
   
  
[Figure: panel (c) Lidar Height-based Features, including Normalized-Var, Anisotropy, P-Deviation-Ang, and Sphericity]

Figure 2. Overview of features from lidar and orthoimagery
3. EXPERIMENTS AND DISCUSSION
To assess the effectiveness of Random Forests in feature
selection, three experiments are conducted. The first focuses on
variable importance by importing all features into Random
Forests; in the second, recursive feature selection with Random
Forests is conducted to search for the most important features
yielding satisfactory classification results; finally, classification
using the features selected by Random Forests is performed.
3.1 Variable importance results
The variable importance computed from the training samples is
displayed in Figure 3 for each feature when all features are fed
into the Random Forests. Variable importance is measured by
the mean decrease in permutation accuracy. As can be seen in the
figure, among the 48 features the most relevant include nDSM,
eigenvalue-based anisotropy, and intensity GLCM measures.
Among the aerial image-based features, GLCM measures such as
Ent., Corr., and Var. are not important for urban classification.
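The mean-decrease-in-permutation-accuracy measure described above can be sketched as follows; the data here is a synthetic stand-in (the actual 48 lidar/orthoimagery features are not reproduced), and the use of scikit-learn is an assumption, not the authors' implementation:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Hypothetical stand-in for the training features (8 instead of 48).
X, y = make_classification(n_samples=200, n_features=8,
                           n_informative=4, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Mean decrease in accuracy when each feature is randomly permuted:
# a large drop means the model relied heavily on that feature.
result = permutation_importance(rf, X, y, n_repeats=10,
                                scoring="accuracy", random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
print(ranking[:3])  # indices of the most relevant features
```

In the paper's setting, features such as nDSM and eigenvalue-based anisotropy would rank near the top of this ordering.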
3.2 Feature selection results 
To eliminate less important and more strongly correlated features,
an iterative backward elimination scheme is used (Diaz-Uriarte and
Alvarez de Andres, 2006). We first compute measures of feature
importance to obtain an initial variable ranking and then
proceed with an iterative backward elimination of the least
important variables. In each iteration the least important
features (by default, 20%) are eliminated, and a new RF is built
by training on the remaining features and assessing OOB errors
on the OOB samples. The iterative procedure proceeds until the
final feature set with the lowest OOB error is determined for the
land-use classification. In this study the number of trees (T) is
set to 100-200, and the number of split variables is 4. Generally,
the default setting of split variables yields a good OOB error
rate. Using OOB errors, the original 48 features are gradually
reduced to 15. Meanwhile, as can be seen in Figure 4, the mean
decrease accuracy increases as the number of features decreases.
The fifteen remaining features include Lidar-NDVI; lidar
height-based measures eigenvalue-Anisotropy, nDSM,
P-Normalized-Var, and Height-Diff; lidar intensity-based
GLCM-Var., -Mean, and -SM; and aerial image-based
GLCM-Homo and -Diss.
Using these feature subsets, ranging from 48 down to 15 features,
maximum likelihood classifiers are used to obtain the
classification results shown in Figure 5. A classification error
matrix (confusion matrix) is an effective way to quantitatively
assess accuracy in that it compares the relationship between
known reference data and the corresponding classification
results (Congalton, 1991). The Kappa coefficient measures the
agreement between the classification result and the reference
data using the major diagonal and the chance agreement (Jensen,
2005). From the Kappa coefficients, the classification accuracy
does not improve as more features are added; on the contrary,
the accuracies decrease. The reason is that many of the
additional features are correlated with one another rather than
carrying significant importance of their own.
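The Kappa coefficient can be computed directly from the confusion matrix, using the major diagonal (observed agreement) and the chance agreement as described above. The labels below are illustrative placeholders, not the study's data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, cohen_kappa_score

# Hypothetical reference labels vs. classification result (3 classes).
ref  = np.array([0, 0, 1, 1, 2, 2, 2, 1, 0, 2])
pred = np.array([0, 1, 1, 1, 2, 2, 0, 1, 0, 2])

cm = confusion_matrix(ref, pred)
n = cm.sum()
p_o = np.trace(cm) / n                        # observed agreement (diagonal)
p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
kappa = (p_o - p_e) / (1 - p_e)

# Cross-check the manual formula against scikit-learn's implementation.
assert np.isclose(kappa, cohen_kappa_score(ref, pred))
print(round(kappa, 3))
```

A Kappa near 1 indicates agreement well beyond chance; comparing Kappa across the 48- to 15-feature classifications is how the accuracy trend in Figure 5 is assessed.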
   
  
  
 
	        