For instance, to detect windows on simple buildings, (Han and Zhu, 2005) integrate rules that produce patterns in image space. In particular, their approach couples a bottom-up detection of rectangles with top-down prediction hypotheses derived from the grammar rules, and a Bayesian framework validates the process. (Alegre and Dellaert, 2004) look for rectangular regions with homogeneous appearance by computing the radiometric variance. (Müller et al., 2007) extract an irreducible region that summarizes the façade through its periodicity in the vertical and horizontal directions; their results are convincing on façades that actually contain a regular window grid pattern or exhibit suitable perspective effects. (Ripperda, 2008) fixes her grammar rules according to prior knowledge: she computes beforehand the distribution of façade elements from a set of façade images.
These approaches either rely on a model that is too restrictive, dedicated to simple façade layouts, or are too specialized for a particular kind of architecture. They would thus hardly cope with common Parisian façades such as Haussmannian buildings or other complex architectures with balconies or decorative elements.
Our process works exclusively on a single calibrated street-level image. Although we could have, we deliberately did not introduce additional information such as 3D data (point clouds, etc.), because for some applications, such as indexing, image retrieval and localization, only a single photograph acquired with a mobile phone may be available.
Figure 1: Our algorithm recursively confronts data with models (input image, vanishing point extraction, then planar, cylindric or unknown model); if a region does not match any proposed model, we split it.
2 OUR MODEL-BASED SEGMENTATION STRATEGY
Most of the aforementioned approaches provide good results on relatively simple single buildings. Only a few of them have addressed very complex façade networks such as those encountered in European cities, where architectural diversity and complexity are high. Our work is upstream of most of these approaches: we do not try to extract semantic information, but simply propose a façade segmentation framework that could be helpful to them. This framework must first separate a façade from its background and neighboring façades, and then identify intra-façade regions that fit specific elementary texture models. These regions must be robust to changes in scale or point of view.
Our strategy relies on horizontal and vertical image contour alignments. We thus first need to rectify images in the façade plane: vertical and horizontal directions in the real world respectively become vertical and horizontal in the image. To do so, we extract vanishing points, which provide an orthogonal basis in object space used to resample the image as required.
Regarding segmentation, the core of our approach relies on a recursive split process and a model-based analysis of each subdivided region. We do not intend to directly match a model to the whole façade; instead, we build a tree of rectangular regions by recursively confronting the data with some basic models. If a region does not match any of them, it is split and the two sub-regions are analyzed in turn, as illustrated by the decision tree in Figure 1. Our models are based on simple radiometric criteria: planes and generalized cylinders. Such objects are representative of frequent façade elements like window panes, wall backgrounds or cornices.
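To make the radiometric criterion concrete, the following minimal Python sketch illustrates one plausible planar test, assuming a region is accepted when a least-squares plane fitted to its grey levels leaves a small residual; the threshold max_rms is purely illustrative and not a value taken from this paper.

import numpy as np

def matches_planar_model(region, max_rms=4.0):
    # Hypothetical planar test: fit I(x, y) ~ a*x + b*y + c to the grey
    # levels and accept the region if the RMS residual is small.
    # max_rms is an illustrative threshold, not a value from the paper.
    h, w = region.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, region.astype(float).ravel(), rcond=None)
    residuals = region.astype(float).ravel() - A @ coeffs
    return float(np.sqrt(np.mean(residuals ** 2))) <= max_rms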
We start the process with the whole image region. We test whether its texture matches our planar model. If it does, the process stops: we have recognized a planar and radiometrically coherent region in the image. Otherwise, we test whether it matches our generalized cylinder model and, in the same manner, the process stops if the cylinder model fits. Otherwise the region is not considered homogeneous (in the sense of our models) and is split into two sub-regions. The process recursively analyzes these two sub-regions exactly as it does the large region. We thus build a segmentation tree whose leaves are planar or generalized cylinder models. The following sections explain each step of this algorithm.
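The recursion can be sketched as follows, under simplifying assumptions: matches_planar_model is the test sketched above, matches_cylinder_model interprets a generalized cylinder as grey levels varying along one image direction only, and the split in the middle of the longer side is an illustrative stand-in rather than the criterion actually used.

import numpy as np

def matches_cylinder_model(region, max_rms=4.0):
    # Hypothetical generalized-cylinder test: the grey levels depend on one
    # image direction only, so the region is well explained by its mean
    # column profile or its mean row profile.
    region = region.astype(float)
    col = np.tile(region.mean(axis=0), (region.shape[0], 1))
    row = np.tile(region.mean(axis=1)[:, None], (1, region.shape[1]))
    rms = min(np.sqrt(np.mean((region - col) ** 2)),
              np.sqrt(np.mean((region - row) ** 2)))
    return rms <= max_rms

def segment(region, min_size=16):
    # Recursively confront a rectangular region with the basic models.
    # Leaves of the returned tree are planar, cylinder or (for very small
    # regions) unknown; min_size and the middle-of-the-longer-side split
    # are illustrative stand-ins.
    if matches_planar_model(region):
        return ('planar', region)
    if matches_cylinder_model(region):
        return ('cylinder', region)
    h, w = region.shape
    if min(h, w) < min_size:
        return ('unknown', region)
    if h >= w:
        return ('split', segment(region[:h // 2, :], min_size),
                segment(region[h // 2:, :], min_size))
    return ('split', segment(region[:, :w // 2], min_size),
            segment(region[:, w // 2:], min_size))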
3 RECTIFICATION PROCESS
3.1 Extracting Vanishing Points
Our rectification process relies on the vanishing points detected by the method of (Kalantari et al., 2008). They project relevant image segments onto the Gaussian sphere: each image segment is associated with a point on the sphere. Their algorithm relies on the fact that each circle of this 3D point distribution gathers the points associated with the same vanishing point in the image. They then estimate the best set of circles, i.e. the one containing the highest number of points. The most representative circles are assumed to provide the main façade directions: the vertical direction and several horizontal ones. Figure 2 (upper right) shows some detected edges that support the main vanishing points: segments associated with the same direction are drawn in the same color.
3.2 Multi-planar Rectification Process
We rectify our image in each plane defined by a couple of one
of the horizontal vanishing points and the vertical one. We then
project the image onto the plane. Figure 2 bottom right shows a
rectification result. Figure 2 bottom left shows rectified edges on
the façade plane.
The intrinsic calibration parameters are assumed to be known. The rectified image is resampled in grey levels only, but even with this restriction the result already opens up some interesting perspectives.
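As an illustration of the resampling step, the sketch below builds a rotation-only homography mapping the original image to a fronto-parallel view of the plane spanned by one horizontal vanishing direction and the vertical one. It assumes the known intrinsic matrix K and is only one plausible way to implement the rectification described above, not necessarily the exact procedure used here.

import numpy as np

def rectifying_homography(K, vp_horizontal, vp_vertical):
    # Build the rotation-only homography that maps the original image to a
    # fronto-parallel view of the facade plane defined by one horizontal
    # and the vertical vanishing point (homogeneous image coordinates).
    Kinv = np.linalg.inv(K)
    d_h = Kinv @ np.asarray(vp_horizontal, dtype=float)  # horizontal 3D direction
    d_v = Kinv @ np.asarray(vp_vertical, dtype=float)    # vertical 3D direction
    d_h /= np.linalg.norm(d_h)
    d_v /= np.linalg.norm(d_v)
    n = np.cross(d_h, d_v)                               # facade plane normal
    n /= np.linalg.norm(n)
    d_v = np.cross(n, d_h)                               # re-orthogonalise the basis
    R = np.vstack([d_h, d_v, n])                         # new camera axes as rows
    return K @ R @ Kinv

The resulting 3x3 matrix can be applied with any projective warp, e.g. cv2.warpPerspective(image, H, (width, height)).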