where each particle stands for an urban object. Prior knowledge
about building shapes is used to model these particles. Arefi et
al. (Arefi et al., 2008) extracted above-ground objects from LI-
DAR data. Then, 3D buildings are reconstructed by hierarchical
fitting of minimum bounding rectangles (MBR) and a RANSAC-based
straight-line fitting algorithm. Kada and McKinley (Kada
and McKinley, 2009) introduced an approach for the automatic
reconstruction of 3D building models that uses existing
building ground plans and LIDAR DSMs. Using building
footprints they decomposed the building shape into sets of non-
intersecting cells, and for each cell the rooftop is reconstructed
by checking the normal directions of the DSM. Tournaire et al.
(Tournaire et al., 2010) developed a stochastic geometry based
algorithm to detect building footprints from DSM data
with a resolution finer than 1 m. They fitted rectangles
to the buildings using an energy function and prior knowledge
about buildings. To minimize the energy function, they used a
Reversible Jump Markov Chain Monte Carlo (RJMCMC) sampler
coupled with a simulated annealing algorithm, which leads
to an optimal configuration of objects. Maas (Maas, 1999) used
maximum slope values to determine the best-fitting roof-type
shapes to generate 3D building models. Valero et al. (Valero et
al., 2008) developed a feature extraction and classification based
method to classify building roofs into two classes: flat roofs
and gable roofs. They estimated ridge-line positions
based on skeletons of the building ground plans. As input to the
support vector machine (SVM) classifier, they used the difference
between the average roof outline height and the average ridge-line
height as the first feature, and the norm of the orthorectified
image gradient as the second feature. In all of the studies reviewed
above, good results are generally achieved using very high resolution
DSMs (better than 1 m spatial resolution), which are typically
generated from airborne images or laser scanning data. However,
enhancement of buildings in low resolution urban DSM data generated
from satellite images is still an open research problem. Moreover,
most previous approaches require manually extracted
building outlines or ground plans as input.
In order to bring an automated solution to this problem, in previ-
ous work we have proposed a novel technique for obtaining 3D
city representations by applying a building shape and rooftop-
type detection approach to DSMs (Sirmacek et al., 2012). We
started by applying local thresholding to raw DSMs in order to
extract high urban objects which can indicate building locations.
We have extracted building shapes from regions which are ob-
tained from a thresholding result by using a binary active shape
growing algorithm. This methodology depends on growing rect-
angular shapes in elongated segments which are detected in bi-
nary masks obtained by thresholding the DSM. After extracting
the building shapes, we generated 3D models by understanding
the building rooftop-types. Herein, we follow a similar approach
to reconstruct 3D city models; however, for active shape growing
we propose a novel approach which uses 3D information in
calculating the shape fitting criteria. Using this new method, we increase
the robustness of complex building shape extraction, which in turn
increases the robustness of 3D reconstruction. Besides introducing a
new methodology, our experiments also provide insight into the
applicability of DSMs obtained from different sensors.
2 DETECTING POSSIBLE BUILDING SEGMENTS
FROM DSMS
In this step, we detect approximate building locations
from the DSM before extracting building shapes. If a digital
terrain model (DTM) of the region is available, it can be used
to calculate a normalized digital elevation model (nDEM). In an
nDEM, ground height is referenced to zero, so it provides
only building heights, independent of the
height of the terrain.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia
If an nDEM could be calculated, we could
simply threshold it with a constant value in order to obtain high
objects which can represent buildings or trees. In our study, we
segment high objects directly from the DSMs by applying local
thresholding. Therefore, the algorithm can also be used for
regions which do not have corresponding DTM data. In local
thresholding, a sliding window of 100 x 100 pixels is moved over
the DSM, and a new threshold value is calculated for each region
under the sliding window. This window size is chosen by considering
the approximate building sizes in the DSMs of the study
region. However, the thresholding result does not differ significantly
with slight changes of window size or
of input image resolution. Therefore, we can use the same window
size for our input DSMs with different geometric resolutions.
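The local thresholding step can be illustrated with a minimal sketch. The text does not state how the per-window threshold is computed or how the window is moved, so the sketch below makes two labeled assumptions: the window is applied as non-overlapping tiles, and the threshold of each tile is its mean height. The function name `local_threshold` and the toy data are hypothetical, and pure Python lists are used only to keep the example self-contained.

```python
def local_threshold(dsm, window=100):
    """Return a binary mask: 1 where a DSM cell is above its window's threshold.

    dsm    : list of equal-length rows of heights
    window : side length of the square window in pixels (100 in the text)

    Assumptions (not specified in the text): windows are non-overlapping
    tiles, and the per-window threshold is the mean height of the tile.
    """
    rows, cols = len(dsm), len(dsm[0])
    mask = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, window):
        for c0 in range(0, cols, window):
            r1, c1 = min(r0 + window, rows), min(c0 + window, cols)
            values = [dsm[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            threshold = sum(values) / len(values)
            for r in range(r0, r1):
                for c in range(c0, c1):
                    mask[r][c] = 1 if dsm[r][c] > threshold else 0
    return mask


# Toy 4 x 8 DSM: two tiles on different terrain levels, each with a raised block.
dsm = [
    [10, 10, 10, 10, 50, 50, 50, 50],
    [10, 18, 18, 10, 50, 58, 58, 50],
    [10, 18, 18, 10, 50, 58, 58, 50],
    [10, 10, 10, 10, 50, 50, 50, 50],
]
mask = local_threshold(dsm, window=4)
```

In this toy case, a single global threshold at the overall mean height would mark the whole right-hand tile as high, whereas the per-window threshold isolates the raised block in each tile, which is why no DTM is needed.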
After applying local thresholding to the DSM (D(x, y)), we obtain
a binary image (Bp(x, y)) where high objects are labeled
with value 1. We apply labeling to Bp(x, y) to obtain its connected
components (Sonka et al., 1999). Here each connected
component represents a building segment. If the size of a connected
component is less than R pixels, we discard it, since such
small regions generally correspond to tree clusters. Considering
the geometric resolutions of the input DSMs, we set R to
100, since buildings cannot be smaller than this
in our input DSMs. However, this value should be set by considering
the minimum building size in the study region before
running the algorithm on the DSMs. Fig. 1(a) and (b)
show a sub-part of D(x, y) and the resulting Bp(x, y) thresholding
result, respectively. Unfortunately, due to the low resolution
or trees surrounding the building, the thresholding result
does not directly represent the building shape. However, it gives
an idea about the approximate shape of the building.
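The labeling and size-filtering step described above can be sketched as follows. This is an illustrative flood-fill implementation, not the labeling routine used in the study; the function name `building_segments` is hypothetical, and 8-connectivity is an assumption because the text does not state which connectivity is used. The parameter `min_size` plays the role of R.

```python
from collections import deque


def building_segments(mask, min_size=100):
    """Label connected components of a binary mask and drop small ones.

    Returns a list of components, each a list of (row, col) pixels, keeping
    only components with at least `min_size` pixels (R in the text).
    Assumption: 8-connectivity; the text does not state the connectivity.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    segments = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Components smaller than R pixels are discarded (likely trees).
            if len(component) >= min_size:
                segments.append(component)
    return segments


# Toy mask: one 2 x 3 block (6 pixels) and one isolated pixel.
mask = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
segments = building_segments(mask, min_size=4)
```

With `min_size=4`, only the 6-pixel block survives and the isolated pixel is discarded, mirroring how small tree clusters are removed with R = 100 on real DSMs.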
Figure 1: (a) A sub-part of the original Worldview2 satellite
DSM (D(x, y)), (b) After applying local thresholding (sub-part
of Bp(x, y)), (c) Skeleton of the building in the same sub-part of
Bp(x, y), (d) Detected building shape.
In the next step, we use the detected approximate segments to
understand building complexity and to run our 3D active shape
growing method.
3 EXTRACTING BUILDING SHAPES
In a previous study, Sirmacek and Unsalan (Sirmacek and Unsalan,
2010) proposed an automatic rectangular binary active shape
growing approach (called box-fitting). First, they used color
invariant features to extract possible building rooftop segments.
The mass centers of the rectangular segments are taken as seed
points (approximate building centers). The seed-point locations
are used to grow a virtual active rectangular shape based on an
energy criterion. In previous studies (Sirmacek et al., 2010;
Sirmacek et al., 2012), we used this binary active shape
growing approach to detect complex building shapes from the binary
Bp(x, y) approximate building segment mask. First, we
decided whether the building segment is complex or not. If
there are inner yards (holes) inside a segment, we consider
it a complex shape. We make this decision by computing
the Euler number of the binary building segment (Horn, 1986). If a
buildir
segme
tected
We di
pixels
divide:
els. W
seed-p
ilarly t
20 pix
vided
detail.
For de
metho
growil
active
tion. 1
descril
active
the he
pixel c
erative
if the €
ineque
buildii
our af
we as;
tected.
culate
resent
value.
all 0 €
radian
2012).
appro:
time.
estimz
buildij
of rec
angul:
box-fi
be fot
even i
other ;
these
the pa
E
For «
ity bei
We sir
with {
Ba(a
and ei
disk s
tions
Final
mask.
The d
1.(d).
walls