
CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation 
(a) Prior building models ($\Phi_{i,j}$): $i$ determines the shape of the
footprint and $j$ the roof type.
(b) The family $\Phi_{1,j}$, which has a rectangular footprint ($i = 1$).
(c) Building's main height $h_m$ and roof height $h_r(x, y)$.
Figure 2: Hierarchical Grammar-Based 3D Prior Models. The
case of building modeling: the building's footprint is determined
implicitly from $E_{2D}$; $h_m$ and $h_r(x, y)$ are recovered for
every point ($E_{3D}$), and thus all the different roof types $j$ are
modeled.
most appropriate model and then determine the optimal set of
parameters aiming to recover the scene's geometry (Figure 1). The
proposed objective function consists of two segmentation terms
that guide the selection of the most appropriate typology and a
third DEM-driven term that is conditioned on the typology.
Such a prior-based recognition process can segment both
rural and urban regions (similarly to (Matei et al., 2008)) and can
also overcome detection errors caused by misleading
low-level information (such as shadows or occlusions), which is
a common scenario in remote sensing data.
Our goal was to develop a single generic framework (with no
step-by-step procedures) that can efficiently handle the
extraction of multiple 3D buildings, whether or not their number
or shape is known a priori. In addition, since multiple aerial
images are unavailable for most sites, our goal was to
provide a solution even with the minimum available data, such as a
single panchromatic image and an elevation map (produced either
with classical photogrammetric multi-view stereo techniques or
from LIDAR or INSAR sensors). This contrasts with approaches that
were designed to process multiple aerial images or multispectral
information and cadastral maps (as in (Suveg and Vosselman,
2004),(Rottensteiner et al., 2007),(Sohn and Dowman, 2007)),
data which greatly ease scene classification. Performing multiview
stereo, using simple geometric representations such as 3D lines and
planes, or merging data from ground sensors was not our interest
here. Moreover, contrary to (Zebedin et al., 2008), the variational
framework proposed here does not require as input dense
height data, dense image matching processes, or a priori given
3D line segments or a rough segmentation.
2 MODELING TERRAIN OBJECTS WITH 3D PRIORS 
Numerous 3D model-based approaches have been proposed in the
literature. Statistical approaches (Paragios et al., 2005) aim to
describe variations between the different prior models by measuring
the distribution of the parameter space. These models can
represent buildings with rather repetitive structure and limited
complexity. To overcome this limitation, methods using
generic, parametric, polyhedral and structural models have been
considered (Jaynes et al., 2003),(Kim and Nevatia, 2004),(Suveg
and Vosselman, 2004),(Dick et al., 2004),(Wilczkowiak et
al., 2005),(Forlani et al., 2006),(Lafarge et al., 2007). The main
strength of these models is their expressive power for
complex architectures. On the other hand, inference between the
models and observations is rather challenging because of the high
dimension of the search space. Consequently, only a small number
of such models can be considered. More recently, procedural
modeling of architecture was introduced, together with vision-based
reconstruction (Muller et al., 2007) using mostly facade views.
Such a method recovers 3D structure using an L-system grammar
(Muller et al., 2006), which is a powerful and elegant tool for
content creation. Despite the promise of such an approach, the
inferential step that derives the model parameters remains a
challenging problem, especially when the grammar is coupled
with the building detection procedure.
Hierarchical representations are a natural choice for addressing
complexity while still recovering representations of acceptable
resolution. Focusing on buildings, our models involve two
components, the type of footprint and the type of roof (Figure 2).
First, we structure the space of prior models by assigning the
same pointer $i$ to all models that belong to the family with the
same footprint. Thus, all buildings that can be modeled with a
rectangular footprint have the same index value $i$. Then,
for every family (i.e. every $i$) the different types of building tops
(roofs) are modeled by the pointer $j$ (Figure 2b). Under this
hierarchy $\Phi_{i,j}$, the priors database can model building types
ranging from simple to very complex and can easily be enriched
with more complex structures. Such a formulation is desirably
generic but forms a huge search space. Therefore, appropriate
attention must be paid when structuring the search step.
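The two-level indexing described above can be pictured as a small lookup structure. The following Python sketch is only an illustration and not the paper's implementation: apart from the rectangular family with $i = 1$, the footprint labels and all index numbers are hypothetical placeholders, and the function names `build_prior_database` and `family` are invented for this example.

```python
from itertools import product

# Footprint families, indexed by i. Only "rectangular" (i = 1) is named in
# the text; the other two entries are hypothetical placeholders.
FOOTPRINTS = {1: "rectangular", 2: "L-shaped", 3: "T-shaped"}

# Roof types, indexed by j. The labels are roof types mentioned in the text;
# the numbering is a hypothetical choice made for this sketch.
ROOFS = {1: "flat", 2: "shed", 3: "gable"}


def build_prior_database(footprints, roofs):
    """Return {(i, j): (footprint_label, roof_label)} for every (i, j) pair."""
    return {
        (i, j): (footprints[i], roofs[j])
        for i, j in product(footprints, roofs)
    }


def family(database, i):
    """Return all prior models that share the footprint index i."""
    return {key: value for key, value in database.items() if key[0] == i}


PRIORS = build_prior_database(FOOTPRINTS, ROOFS)
RECTANGULAR_FAMILY = family(PRIORS, 1)
```

The number of candidate models is the product of the number of footprints and the number of roof types, which illustrates why the search over the priors has to be structured carefully as the database grows.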
Given the set of footprint priors, we assume that the observed
building is a homographic transformation of the footprint. Given
the variation in the expressiveness of the grammar and the
degrees of freedom of the transformation, we can now focus on the
3D aspect of the model. In such a context, only the building's main
height $h_m$ and the building's roof height $h_r(x, y)$ at every
point need to be recovered. The proposed typology for such a task
is shown in Figure 2. It refers to the rectangular case, but all the
other families can be defined analogously. More complex footprints,
which usually have more than one roof type, are decomposed into
simpler parts, which can then be recovered similarly. Given an
image $\mathcal{I}(x, y)$ on a bounded domain $\Omega \subset \mathbb{R}^2$
and an elevation map $\mathcal{H}(x, y)$ (which can be seen either as
an image or as a triangulated point cloud), let us denote by $h_m$
the main building height and by $P_m$ the horizontal building plane
at that height. We proceed by modeling all building roofs (flat,
shed, gable, etc.) as a combination of four inclined planes. We
denote by $P_1$, $P_2$, $P_3$ and $P_4$ these four roof planes and by
$\omega_1$, $\omega_2$, $\omega_3$ and $\omega_4$, respectively, the
four angles between the horizontal plane $P_m$ and each inclined
plane (Figure 2). Every point of the roof lies on exactly one of
these inclined planes, and its distance from the horizontal plane is
the minimum compared with the distances given by the other three
planes.
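The minimum-distance rule for the four roof planes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' formulation: the footprint is taken to be an axis-aligned rectangle `[0, width] x [0, length]`, each inclined plane is assumed to rise from one footprint edge at the level of the horizontal plane, and all four angles are assumed to lie strictly between 0 and 90 degrees (the handling of zero angles is not covered here). The function names are hypothetical.

```python
import math


def roof_height(x, y, width, length, angles):
    """Roof height above the main-height plane at footprint point (x, y).

    Assumes an axis-aligned rectangular footprint [0, width] x [0, length]
    and four inclined planes, each rising from one footprint edge.
    `angles` holds the four plane angles in radians, ordered as the edges
    x = 0, x = width, y = 0, y = length.
    """
    if not (0 <= x <= width and 0 <= y <= length):
        raise ValueError("point lies outside the footprint")
    if any(not (0 < a < math.pi / 2) for a in angles):
        raise ValueError("this sketch assumes angles strictly in (0, pi/2)")
    w1, w2, w3, w4 = angles
    # Height of each inclined plane above the horizontal plane at (x, y):
    # distance to the plane's base edge times the tangent of its angle.
    candidates = [
        x * math.tan(w1),
        (width - x) * math.tan(w2),
        y * math.tan(w3),
        (length - y) * math.tan(w4),
    ]
    # The roof surface follows the plane that is closest to the
    # horizontal plane at this point.
    return min(candidates)


def building_height(x, y, main_height, width, length, angles):
    """Total height: constant main height plus the roof height at (x, y)."""
    return main_height + roof_height(x, y, width, length, angles)


# Example: a 10 x 20 footprint with four equal 45-degree planes.
EXAMPLE_ANGLES = (math.radians(45),) * 4
EXAMPLE_HEIGHT = building_height(5, 10, 6.0, 10, 20, EXAMPLE_ANGLES)
```

Under these assumptions a building is described by five numbers (one main height and four angles), and taking the pointwise minimum of the four plane heights yields the roof surface.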
With such a grammar-based description, the five unknown
parameters to be recovered are the main height $h_m$ (which has a
constant value for every building) and the four angles $\omega$. In
this way all but two types of building tops/roofs can be modeled.
For example, if all angles are different we have a totally asymmetric
roof (Figure 2b, $\Phi_{1,5}$); if two opposite angles are zero we have a
	        