2. PROCESSING CHAIN 
In this section we provide an overview of the proposed 
processing chain (Fig. 1). It can roughly be subdivided into five 
steps: 1) line extraction, 2) projection of all lines to a reference coordinate system, 3) extraction of features, 4) training of the
CRF parameters using ground truth, and 5) classification into 
building and non-building sites. The output is a label image 
showing building and non-building sites. 
First, 3D lines are computed from the optical stereo images 
(section 3.2) and double-bounce lines are segmented in the 
InSAR data (section 3.3). Both line sets are then projected from 
the sensors' coordinate systems to the reference coordinate 
system of the orthophoto. Thereafter, a feature vector is 
computed for each site. In our case, an image site corresponds 
to a square image patch as traditionally used for both computer 
vision (e.g., Kumar and Hebert, 2003) and remote sensing
applications of CRFs (e.g., Zhong and Wang, 2007). In 
addition, we adapt the idea of Kumar and Hebert (2006) and 
compute these features at three different scales. Then, the
parameters of the CRF are trained on a subset of the data using 
ground truth. Subsequently, inference is conducted and the test 
data are classified into building sites and non-building sites (see 
CRF details in section 4). 
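To make the data flow of these five steps concrete, the following Python sketch outlines the chain as a single orchestration function. It is a structural illustration only, under our own naming: the individual steps are passed in as callables, and none of the function or parameter names are taken from the actual implementation.

def run_chain(stereo_pair, insar_scene, orthophoto, ground_truth, train_ids,
              extract_stereo_lines, segment_double_bounce_lines,
              project_lines, compute_features, train_crf):
    """Structural sketch of the five-step chain (hypothetical names).

    Each processing step is injected as a callable, so this function only
    fixes the order of operations and the hand-over of data between steps.
    """
    # 1) Line extraction: 3D lines from the optical stereo pair (section 3.2)
    #    and double-bounce lines from the InSAR data (section 3.3).
    stereo_lines = extract_stereo_lines(stereo_pair)
    db_lines = segment_double_bounce_lines(insar_scene)

    # 2) Projection of both line sets from the sensor coordinate systems
    #    into the reference coordinate system of the orthophoto.
    stereo_lines = project_lines(stereo_lines, orthophoto)
    db_lines = project_lines(db_lines, orthophoto)

    # 3) One feature vector per square image site, computed at three scales.
    features = compute_features(orthophoto, stereo_lines, db_lines,
                                patch_sizes=(10, 15, 20))

    # 4) CRF parameters are trained on a ground-truth subset ...
    crf = train_crf(features[train_ids], ground_truth[train_ids])

    # 5) ... and inference labels every site as building / non-building.
    return crf.predict(features)

The skeleton defers every algorithmic detail to sections 3 and 4; only the sequence of steps and the exchanged data are shown.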
3. FEATURES 
High-resolution multi-spectral orthophotos are widely available, and thus we take an orthophoto as the basic source of
features for building detection. In order to assess the impact of 
height data on the building detection results of the CRF 
framework, we also investigate optical stereo imagery. In very high-resolution aerial imagery, characteristic objects of urban
areas, particularly buildings, become visible in great detail (Fig. 
2(a)). High-resolution SAR data provides complementary 
information. Double-bounce lines occurring at the position where the building wall meets the ground are characteristic features (Thiele et al., 2010). Fig. 3(a) compares the sensor geometries and the projected lines in ground geometry. Disregarding all projection artefacts, the double-bounce line of a flat-roofed building (with vertical walls) is located at the same position as the stereo line representing the roof edge (neglecting overhang). Note that in the orthophoto we use, the roof segment of the building falls over both the double-bounce line and the stereo line, since we are not dealing with a true orthophoto (cf. Fig. 3(b,c)).

Figure 1. Flowchart of the processing chain for building detection
The focus of this research is neither on particularly 
sophisticated features nor on sophisticated feature selection 
techniques, but on assessing the overall suitability of CRFs for building detection with multi-sensor data. Therefore, rather
simple features are selected and feature selection is 
accomplished empirically. 
3.1 Orthophoto features 
We test various combinations of orthophoto features within the CRF framework and choose those that provide the best results; the most suitable features are based on colour, intensity, and gradient. As colour features we take the mean and standard deviation of the red and green channels, each normalized by the length of the RGB vector.
Mean and standard deviation of the hue channel are found to be 
discriminative, too. Furthermore, variance and skewness of the 
gradient orientation histogram of a patch proved to be good 
features. The images are subdivided into square image patches 
and features are calculated within each patch. Of course, the 
choice of patch size is a trade-off. A small patch size is 
desirable in order to detect buildings in detail. However, patches that are too small lead to unstable features, resulting in less reliable estimates of the probability density distributions. We apply a
multi-scale approach to mitigate those shortcomings (Kumar 
and Hebert, 2006). Each feature is calculated for different patch 
sizes and all scales are integrated into the same feature vector. 
We follow this approach and test various numbers of scales and 
scale combinations. Three different scales (10x10, 15x15, and 
20x20 pixels) are found to provide good results. Features of 
large patches integrate over bigger areas, thus excluding, for example, forests or agricultural areas, whereas the small patches provide details.
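The following Python sketch illustrates how such a multi-scale patch feature vector might be assembled with NumPy and Matplotlib's RGB-to-HSV conversion. It is a minimal illustration under our own assumptions: the number of orientation bins, the simplified handling of image borders, and the choice to compute variance and skewness over the normalized histogram bins are not specified in the text and are chosen here for concreteness.

import numpy as np
from matplotlib.colors import rgb_to_hsv

def patch_features(rgb_patch, n_orient_bins=8):
    """Features of one square patch of an RGB orthophoto (values in [0, 1]).

    The bin count and the statistics over histogram bins are assumptions;
    the paper does not give these details.
    """
    rgb = rgb_patch.astype(np.float64)
    norm = np.linalg.norm(rgb, axis=-1) + 1e-9           # length of the RGB vector
    r_n, g_n = rgb[..., 0] / norm, rgb[..., 1] / norm    # normalized red / green

    hue = rgb_to_hsv(rgb)[..., 0]                        # hue channel

    # Gradient orientation histogram of the patch intensity image.
    intensity = rgb.mean(axis=-1)
    gy, gx = np.gradient(intensity)
    orient = np.arctan2(gy, gx)
    hist, _ = np.histogram(orient, bins=n_orient_bins, range=(-np.pi, np.pi))
    hist = hist / max(hist.sum(), 1)
    mu, sigma = hist.mean(), hist.std() + 1e-9
    skewness = np.mean((hist - mu) ** 3) / sigma ** 3

    return np.array([r_n.mean(), r_n.std(), g_n.mean(), g_n.std(),
                     hue.mean(), hue.std(), hist.var(), skewness])

def multi_scale_features(rgb_image, row, col, patch_sizes=(10, 15, 20)):
    """Concatenate the patch features of all scales, centred on one site."""
    vecs = []
    for s in patch_sizes:
        half = s // 2
        patch = rgb_image[max(row - half, 0):row + half,
                          max(col - half, 0):col + half]
        vecs.append(patch_features(patch))
    return np.concatenate(vecs)

A call such as multi_scale_features(ortho, 120, 345) then yields a 24-dimensional vector (8 features at each of the 3 scales) for the site centred at that pixel.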
3.2 Stereo lines 
We extract 3D lines from a pair of aerial images using the pair-wise line matching approach proposed by Ok et al. (2010). At
this point we only briefly summarize the algorithm and refer the 
reader to the reference for further details. The entire algorithm 
consists of four main steps: pre-processing, straight line 
extraction, stereo matching of line pairs, and post-processing. 
Pre-processing comprises smoothing with a multi-level non-linear
colour diffusion filter and colour boosting in order to 
exaggerate colour differences in each image. Next, straight lines 
are extracted in each of the stereo images. A colour Canny edge 
detector is applied to the pre-processed images. Thereafter, 
straight edge segments are extracted from the edge images using 
principal component analysis followed by random sampling 
consensus. Subsequently, a new pair-wise stereo line matching 
technique is applied to establish the line-to-line correspondences
between the stereo images. The pair matches are assigned according to a weighted matching similarity score, which is computed from a total of eight measures.
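As an illustration of the straight-segment fitting step alone (not of the full matching algorithm of Ok et al., 2010), the following sketch combines a simple RANSAC loop with a PCA estimate of the dominant direction of the inlier edge pixels; the thresholds, the iteration count, and the two-point hypothesis sampling are our own assumptions.

import numpy as np

def fit_line_pca(points):
    """Return (centroid, unit direction) of the dominant axis of 2D points."""
    centroid = points.mean(axis=0)
    # Principal component of the centred points = dominant line direction.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, vt[0]

def ransac_line(points, n_iter=200, inlier_tol=1.0, min_inliers=20, rng=None):
    """Fit a straight line to noisy edge pixels while ignoring outliers."""
    rng = np.random.default_rng(rng)
    best_inliers = None
    for _ in range(n_iter):
        # Hypothesize a line from two randomly sampled edge pixels.
        p, q = points[rng.choice(len(points), size=2, replace=False)]
        d = q - p
        length = np.linalg.norm(d)
        if length < 1e-9:
            continue
        normal = np.array([-d[1], d[0]]) / length
        dist = np.abs((points - p) @ normal)       # point-to-line distances
        inliers = dist < inlier_tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers is None or best_inliers.sum() < min_inliers:
        return None
    # Refit the final segment by PCA on all inliers.
    return fit_line_pca(points[best_inliers])

The weighted matching of line pairs across the stereo images, which combines the eight similarity measures, is not reproduced here; see Ok et al. (2010) for the details.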