Full text: Papers accepted on the basis of peer-reviewed abstracts (Part B)

In: Wagner W., Székely, B. (eds.): ISPRS TC VII Symposium - 100 Years ISPRS, Vienna, Austria, July 5-7, 2010, IAPRS, Vol. XXXVIII, Part 7B
2.5 Selection of features for the estimation of forest 
attributes 
The k-nearest neighbor (k-nn) method was used to estimate 
forest variables (e.g. Kilkki & Päivinen, 1987; Tokola et al., 
1996). The value of k was set to 5, Euclidean distances were 
used to measure closeness in the feature space and the nearest 
neighbors were weighted with the squared inverse distances. 
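The estimator configured above (k = 5, Euclidean distance, inverse squared distance weights) can be sketched as follows, together with the leave-one-out check and relative RMSE that this section goes on to describe. This is a minimal illustration, not the authors' implementation: the function names, the small constant eps and the toy data are assumptions, and the RMSE is assumed to divide the squared errors by n.

```python
import math


def knn_estimate(target, ref_features, ref_values, k=5, eps=1e-12):
    """Estimate a value as the weighted mean of the k nearest reference plots.

    Closeness is the Euclidean distance in feature space and each neighbor
    is weighted by 1 / d^2. eps is an assumption of this sketch: it avoids
    division by zero when a reference plot has identical features.
    """
    nearest = sorted(
        (math.dist(target, feats), value)
        for feats, value in zip(ref_features, ref_values)
    )[:k]
    weights = [1.0 / (d * d + eps) for d, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


def loo_relative_rmse(features, values, k=5):
    """Leave-one-out relative RMSE (%): each plot is estimated from the others."""
    n = len(values)
    squared_errors = 0.0
    for i in range(n):
        other_features = features[:i] + features[i + 1:]
        other_values = values[:i] + values[i + 1:]
        estimate = knn_estimate(features[i], other_features, other_values, k)
        squared_errors += (values[i] - estimate) ** 2
    rmse = math.sqrt(squared_errors / n)  # assumes the sum is divided by n
    return 100.0 * rmse / (sum(values) / n)


# Toy data invented for illustration only (two features per plot).
plots = [(0.0, 0.1), (0.5, 0.4), (1.0, 0.9), (1.5, 1.6),
         (2.0, 2.1), (2.5, 2.4), (3.0, 3.2)]
volumes = [80.0, 95.0, 120.0, 150.0, 170.0, 185.0, 210.0]
print(round(loo_relative_rmse(plots, volumes), 1))
```

Weighting by the squared inverse distance lets the closest plots dominate the estimate, so an exact match in feature space effectively returns that plot's own value.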
The accuracy of the estimates produced by the k-nn estimator 
was tested via leave-one-out cross-validation on the field plots 
by comparing the estimates of each field plot to the measured 
value (ground truth) of the plot. The accuracy of the estimates 
was measured by the relative root mean square error RMSE 
(Equation 1). 
\[ \mathrm{RMSE}\% = 100 \cdot \frac{\mathrm{RMSE}}{\bar{y}} \qquad (1) \]
where: 
\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \]
$y_i$ = measured value of variable y on plot i 
$\hat{y}_i$ = estimated value of variable y on plot i 
$\bar{y}$ = mean of the observed values 
n = number of plots 
Automatic feature selection was carried out using a simple 
genetic algorithm presented by Goldberg (1989), and 
implemented in the GAlib C++ library (Wall, 1996). The GA 
process starts by generating an initial population of strings 
(chromosomes or genomes), which consist of separate features 
(genes). The strings evolve during a user-defined number of 
iterations (generations). The evolution includes the following 
operations: selecting strings for mating using a user-defined 
objective criterion (the better the string, the more copies it 
gets in the mating pool), letting the strings in the mating pool 
swap parts (crossing over), causing random noise (mutations) in 
the offspring (children), and passing the resulting strings into 
the next generation. 
In the present study, the starting population consisted of 300 
random feature combinations (genomes). The length of the 
genomes corresponded to the total number of features in each 
step, and the genomes contained a 0 or 1 at position i, denoting 
the absence or presence of image feature i. The number of 
generations was 30. The objective variable to be minimized 
during the process was a weighted combination of relative 
RMSEs of k-nn estimates for mean total volume, mean volumes 
of Scots pine, Norway spruce and deciduous species, mean 
diameter and mean height, with total volume having a weight of 
50%, and the remaining variables 10% each. Genomes that were 
selected for mating swapped parts with each other with a 
probability of 80%, producing children. Occasional mutations 
(flipping 0 to 1 or vice versa) were added to the children 
(probability 1%). The strings were then passed to the next 
generation. The overall best genome of the current iteration was 
always passed to the next generation as well. Four successive 
steps (each comprising 30 generations) were taken to reduce the 
number of features to a reasonable minimum. Only features 
belonging to the best genome in each step were included in the 
next step. Feature selection was run separately for both areas 
and each feature extraction unit (field plot, small segments, 
large segments). 
There were 14 (study area 1) or 19 (study area 2) features 
selected into the final Grid sets, 17 or 19 into the Seg350 sets 
and 12 or 17 into the Seg1000 sets. Of the selected features, 
the majority (63-79%) were based on the ALS data. 
3. RESULTS AND DISCUSSION 
In both study areas the features extracted from square grid 
elements worked better in estimating the forest attributes than 
the features extracted from image segments. Furthermore, 
features from image segments derived using a minimum size of 
350 m² performed better in the estimation than features 
extracted from larger segments (minimum size 0.1 ha). 
Estimation accuracy was generally better in study area 2 than 
in study area 1. The main reason is probably the higher number 
of sample plots in study area 2, which gives a higher number of 
potential nearest neighbors for each sample plot in the k-nn 
estimation. The estimation accuracy results for the forest 
attributes used in this study are presented in Tables 2 and 3. 
                          GRID   SEG350   SEG1000
Height                    18.5     22.4      25.5
Diameter                  25.5     27.7      32.0
Total volume              27.8     34.0      36.6
Volume of pine            74.2     77.1      99.9
Volume of spruce          83.9     87.5     103.3
Volume of deciduous sp.   85.3     88.7      93.9
Table 2. Estimation results for the feature sets (relative RMSE, 
%) of study area 1 
                          GRID   SEG350   SEG1000
Height                    12.5     13.9      16.5
Diameter                  19.8     23.1      25.2
Total volume              29.6     32.9      36.6
Volume of pine           125.2    138.5     137.0
Volume of spruce          59.0     61.5      63.8
Volume of deciduous sp.   99.2    113.4     111.3
Table 3. Estimation results for the feature sets (relative RMSE, 
%) of study area 2 
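As a worked illustration of the weighted objective used during feature selection (50% for total volume, 10% for each of the other five variables), the sketch below evaluates that combination on the relative RMSEs listed in Tables 2 and 3. The resulting scores are computed here for illustration only and are not reported in the paper; the dictionary keys and function names are assumptions.

```python
# Weights as stated in the text: 50% total volume, 10% for each other variable.
WEIGHTS = {
    "total_volume": 0.5,
    "pine_volume": 0.1,
    "spruce_volume": 0.1,
    "deciduous_volume": 0.1,
    "diameter": 0.1,
    "height": 0.1,
}

# Row order of Tables 2 and 3.
VARIABLES = ("height", "diameter", "total_volume",
             "pine_volume", "spruce_volume", "deciduous_volume")


def column(values):
    """Map one table column (relative RMSEs, %) to variable names."""
    return dict(zip(VARIABLES, values))


def weighted_objective(rel_rmse, weights=WEIGHTS):
    """Weighted sum of relative RMSEs (%); lower is better."""
    return sum(weights[name] * rel_rmse[name] for name in weights)


# Relative RMSEs (%) copied from Tables 2 and 3.
TABLES = {
    "study area 1": {
        "GRID": column((18.5, 25.5, 27.8, 74.2, 83.9, 85.3)),
        "SEG350": column((22.4, 27.7, 34.0, 77.1, 87.5, 88.7)),
        "SEG1000": column((25.5, 32.0, 36.6, 99.9, 103.3, 93.9)),
    },
    "study area 2": {
        "GRID": column((12.5, 19.8, 29.6, 125.2, 59.0, 99.2)),
        "SEG350": column((13.9, 23.1, 32.9, 138.5, 61.5, 113.4)),
        "SEG1000": column((16.5, 25.2, 36.6, 137.0, 63.8, 111.3)),
    },
}

for area, feature_sets in TABLES.items():
    for set_name, rel_rmse in feature_sets.items():
        print(f"{area} {set_name}: {weighted_objective(rel_rmse):.2f}")
```

Because total volume carries half the weight, a feature set can score well on this criterion even when the species-specific volumes remain poorly estimated.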
There were large differences between the study areas in the 
estimation accuracy of the volumes by tree species group. 
Apparently, the differences were caused by the different tree 
species structure of the two study areas. Typically, the dominant 
tree species had the highest estimation accuracy and the least 
dominant the lowest. On the other hand, the volume of deciduous 
trees had better estimation accuracy compared to the minority
	        