
In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009 
IMPROVING IMAGE SEGMENTATION USING MULTIPLE VIEW ANALYSIS 
Martin Drauschke, Ribana Roscher, Thomas Läbe, Wolfgang Förstner
Department of Photogrammetry, Institute of Geodesy and Geoinformation, University of Bonn 
martin.drauschke@uni-bonn.de, rroscher@uni-bonn.de, laebe@ipb.uni-bonn.de, wf@ipb.uni-bonn.de 
KEY WORDS: Image Segmentation, Aerial Image, Urban Scene, Reconstruction, Building Detection 
ABSTRACT 
In our contribution, we improve image segmentation by integrating depth information from a multi-view analysis. We assume that the object surface in each region can be represented by a low-order polynomial, and we estimate the best fitting parameters of a plane using those points of the point cloud which are mapped to the specific region. Adjacent image regions which cannot be distinguished geometrically are merged. We demonstrate the approach by finding spatially planar regions in aerial images. Furthermore, we discuss the possibilities of extending our approach towards the segmentation of terrestrial facade images.
1 INTRODUCTION 
The interpretation of images showing building scenes is a challenging task due to the complexity of the scenes and the great variety of building structures. As far as human perception is understood today, humans can easily group visible patterns and use their shape to recognize objects, cf. (Hoffman and Richards, 1984) and (Treisman, 1986). Segmentation, understood as image partitioning, is often the first step towards finding basic image patterns. Early image segmentation techniques are discussed in (Pal and Pal, 1993). Since then, many other algorithms have been proposed within the image analysis community: data-driven approaches often define grouping criteria based on the color contrast between regions or on textural information, whereas model-driven approaches often work well only on simple scenes, e.g. simple building structures with a flat or gabled roof. However, they are limited when analyzing more complex scenes.
Since we are interested in identifying entities of more than two classes, e.g. buildings, roads and vegetation objects, we cannot perform a division of the image into foreground and background as summarized in (Sahoo et al., 1988). Instead, our segmentation scheme partitions the image into several regions. It is very difficult to divide an image into regions if some regions are recognizable by a homogeneous color, others have a significant texture, and others are separable based on saturation or intensity, cf. e.g. (Fischer and Buhmann, 2003) and (Martin et al., 2004). Moreover, such boundaries are often not consistent with geometric boundaries. According to (Binford, 1981), there are seven classes of boundaries depending on illumination, geometry and reflectivity. Therefore, geometric information should be integrated into the segmentation procedure.
Our approach is motivated by the interpretation of building images, aerial and terrestrial, where many surface patches can be represented by low-order polynomials. We assume a multi-view setup with one reference image and its intensity-based segmentation, which is then improved by exploiting the 3D information from the depth image derived from all images. Using the determined orientation data, we are able to map each 3D point to a unique region. Assuming that object surfaces are planar in each region, we can estimate a plane through the selected points. Adjacent regions are merged if their planes are similar. Finally, we obtain an image partition with regions representing dominant object surfaces such as building parts or the ground. We are convinced that the derived regions are much better suited for an object-based classification than the regions of the initial segmentation, because many of them have simple, characteristic shapes.
The paper is structured as follows. In sec. 2 we discuss recent approaches for combining image and point cloud information, mostly with a focus on building reconstruction. In sec. 3 we briefly sketch our approach for deriving a dense point cloud from three images. So far, our approach is semi-automatic due to the manual setting of the point cloud's scale, but we discuss the possibility of automating all its steps. In sec. 4 we present how we estimate the most dominant plane in the dense point cloud, restricted to those points which are mapped to pixels of the same region. The merging strategy is presented in sec. 5. Here we only study the segmentation of aerial imagery and present our results in sec. 6. Adaptations for segmenting facade images are discussed in each step separately. We summarize our contribution in the final section.
2 COMBINING POINT CLOUDS AND IMAGES 
The fusion of imagery with LIDAR data has successfully been applied in the field of building reconstruction. In (Rottensteiner and Jansa, 2002) regions of interest for building extraction are detected in the set of laser points, and planar surfaces are estimated in each region. Then the color information of the aerial image is used to merge adjacent coplanar point cloud parts. Conversely, in (Khoshelham, 2005) regions are extracted from the image data, and the spatial arrangement of the corresponding points of a LIDAR point cloud is used as a property for merging adjacent regions. In (Sohn, 2004) multispectral imagery is used to classify vegetation in the LIDAR point cloud using a vegetation index. The advantage of using LIDAR data is that one works with precisely positioned points and a very small fraction of outliers. The disadvantage is its expensive acquisition, especially for aerial scenes. Hence, we prefer to derive a point cloud from multiple image views of an object.