In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 — Paris, France, 3-4 September, 2009
Figure 5: Results of simple building scenes. Panels: RGB aerial image extracts, initial watershed segmentations, improved watershed segmentations. Again, the gray-shaded regions have been merged on the basis of geometric properties.
in a preprocessing step, e.g. (Zhao and Lei, 2006). A further improvement should be achieved if the whole procedure is repeated, because the MDL-based merged regions are now big enough for their geometric properties to be determined.
The noise of the point cloud, which we derive from semiglobal matching, does not disturb the merging of image regions. In aerial images, we are faced with large and often planar objects. There, our plane estimation is good enough, because we do not have too many outliers. Otherwise, the plane estimation should be done by a robust estimator. If different object parts have been segmented as one region, then the most dominant plane of the combined region often does not represent any of these object parts. This shows that future work needs to focus on an algorithm for detecting multiple planes (e.g. analysis of the best five planes from RANSAC) and on a splitting routine. Furthermore, there are objects such as trees or dormers which violate our assumption of a single planar surface. Therefore, we consider adapting our plane estimation towards extracting general geometric primitives such as planes, cylinders, cones and spheres, cf. (Schnabel et al., 2007).
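The robust estimation and the multi-plane search mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the default tolerance of 0.04 (in the units of the point coordinates), the iteration count and the minimum support are all assumptions chosen for illustration. Planes are written as n·x = d with unit normal n, and dominant planes are extracted one after another by removing the supporting points of each plane found.

```python
import math
import random


def plane_from_points(p, q, r):
    """Plane through three points as (unit normal n, offset d) with n.x = d.

    Returns None if the three points are (numerically) collinear.
    """
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = tuple(c / norm for c in n)
    return n, sum(n[i] * p[i] for i in range(3))


def point_plane_distance(point, normal, d):
    """Unsigned distance of a point to the plane n.x = d (n must be a unit vector)."""
    return abs(sum(normal[i] * point[i] for i in range(3)) - d)


def ransac_plane(points, threshold=0.04, iterations=200, seed=0):
    """Return (normal, d, inlier_indices) of the best-supported plane, or None."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best = None
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate (collinear) sample
        normal, d = plane
        inliers = [i for i, pt in enumerate(points)
                   if point_plane_distance(pt, normal, d) <= threshold]
        if best is None or len(inliers) > len(best[2]):
            best = (normal, d, inliers)
    return best


def dominant_planes(points, max_planes=5, min_support=10, **ransac_args):
    """Extract up to max_planes planes; each entry is (normal, d, support size)."""
    remaining = list(points)
    planes = []
    while len(planes) < max_planes and len(remaining) >= 3:
        best = ransac_plane(remaining, **ransac_args)
        if best is None or len(best[2]) < min_support:
            break
        normal, d, inliers = best
        planes.append((normal, d, len(inliers)))
        used = set(inliers)
        remaining = [pt for i, pt in enumerate(remaining) if i not in used]
    return planes
```

Sequential extraction is only one way to obtain several plane hypotheses per region. A splitting routine would still have to decide which image pixels belong to which plane.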
Figure 6: Facade image and different views of the fitted planes for hand-labeled object parts. Wall components are drawn in yellow, windows in blue and (if opened) in green, balcony parts in magenta. The planes of overhanging building parts are well distinguishable, but the window planes (if not opened) are very close to their surrounding wall parts. The mirroring and light transmission effects in the window sections lead to geometrically unstable plane estimations.
With respect to facade images, we have considerable trouble with our plane estimation. We ascribe this to two major reasons. First, the reconstruction is challenged by homogeneous facades and by mirroring or light-transmitting windows. Both cases lead to too many outliers. Second, the noise of the complete point cloud is too high to distinguish between planes in object space which are parallel but only a few centimeters apart. Fig. 6 shows a facade image and three views of the dominant planes of given annotated objects. In this case, the supporting points may have a distance of 4 cm to the fitted plane. Dominant planes that are more than half a meter apart are clearly separable from each other.
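The separability argument can be made concrete with a small sketch. It assumes, as an interpretation not stated explicitly in the text, that the 4 cm act as a symmetric support tolerance around each plane. The function name and the 5 cm example gap are hypothetical. A point is described by its signed distance along the common normal of two parallel planes A and B, where B lies `gap` meters from A.

```python
def classify_support(signed_distance, gap, threshold=0.04):
    """Decide which of two parallel planes a point supports.

    signed_distance: distance of the point from plane A along the normal (m).
    gap:             distance of plane B from plane A along the normal (m).
    threshold:       assumed support tolerance around each plane (m).
    """
    near_a = abs(signed_distance) <= threshold
    near_b = abs(signed_distance - gap) <= threshold
    if near_a and near_b:
        return "ambiguous"  # tolerance bands overlap, planes not separable here
    if near_a:
        return "A"
    if near_b:
        return "B"
    return "none"


# Window plane 5 cm behind the wall plane: a point 2 cm off the wall supports both.
print(classify_support(0.02, gap=0.05))
# Balcony plane 0.5 m in front of the wall: the same point supports only the wall.
print(classify_support(0.02, gap=0.5))
```

Under this assumption the two tolerance bands overlap whenever the gap is at most twice the tolerance, i.e. 8 cm here, which is consistent with window and wall planes being hard to tell apart while planes half a meter apart are not.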
7 CONCLUSION AND OUTLOOK
We presented a novel approach for improving image segmentations of aerial imagery by combining the initial watershed segmentation with information from a 3D point cloud derived from two or three views. For each region, we estimate the most dominant plane, and only the plane parameters are used to trigger the merging of regions. With respect to building extraction, our algorithm achieves satisfactory results, because the ground and major building structures are better segmented.
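How plane parameters alone could trigger a merge is sketched below. The decision rule, the tolerances of 5 degrees and 0.04 m, and the function names are assumptions for illustration, not the criterion used in this work. Each region carries a plane n·x = d with unit normal n. Two adjacent regions are merged when their normals are nearly parallel and their offsets nearly equal, and merges are propagated with a union-find structure.

```python
import math


def planes_agree(n1, d1, n2, d2, max_angle_deg=5.0, max_offset=0.04):
    """True if two planes n.x = d (unit normals) are nearly identical."""
    dot = sum(a * b for a, b in zip(n1, n2))
    if dot < 0:
        # (n, d) and (-n, -d) describe the same plane: align the orientation.
        dot, d2 = -dot, -d2
    angle = math.degrees(math.acos(min(1.0, dot)))
    return angle <= max_angle_deg and abs(d1 - d2) <= max_offset


def merge_regions(planes, adjacent_pairs, **tolerances):
    """Group region ids whose planes agree across adjacent pairs.

    planes:         dict region_id -> (unit normal, offset d)
    adjacent_pairs: iterable of (region_id, region_id) sharing a boundary
    """
    parent = {region: region for region in planes}

    def find(region):
        while parent[region] != region:
            parent[region] = parent[parent[region]]  # path halving
            region = parent[region]
        return region

    for a, b in adjacent_pairs:
        if planes_agree(*planes[a], *planes[b], **tolerances):
            parent[find(a)] = find(b)

    groups = {}
    for region in planes:
        groups.setdefault(find(region), []).append(region)
    return sorted(sorted(group) for group in groups.values())
```

A call such as `merge_regions({1: ((0, 0, 1), 0.0), 2: ((0, 0, -1), -0.01)}, [(1, 2)])` groups both regions, since the two parameter sets describe almost the same plane with opposite normal orientation. The sketch compares the initial planes only and does not re-estimate a plane after each merge.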
In the next steps, we want to search for multiple planes in each region and to implement a splitting routine, so that regions can be either merged or split. Once we have such a reliable function, we would start the region merging using the MDL criterion based on the image intensities. We could then search for geometric descriptions in all image regions, not only in the big ones. Furthermore, our approach