Full text: XVIIIth Congress (Part B3)

ted with real-world 
used as opposed to 
xact matches to the 
/pothesis generation 
based object model 
tructural parametric 
the resolution of the 
exible symbolic and 
us constructed. Ad- 
nt relations between 
dary; these are pre- 
olution levels of the 
to translation, rota- 
. Therefore, objects 
ion levels, yet more 
vs fast and efficient 
hly compressed data 
| recognition at any 
rticularly with large 
nages since the size 
asoning operations. 
are recognized at a 
scene, they can be 
ing local operations 
ing a portable stan- 
performed using a 
‚thus making it fully 
s. Experiments in- 
D data: (i) natural 
vering 10 x 20 km² 
| suburban area of 
(iii) various sets of 
cquisition sensor of 
1e same feature de- 
ees were used for all 
dent generic models 
ecognize. In the fol- 
'imental results that 
(ith various types of 
RAL TERRAIN 
‚JENES 
isory elevation data 
various other pho- 
pically stored in the 
)EM) format. How- 
ata of such outdoor 
ning and planning. 
of sensory informa- 
ppropriate compres- 
haracteristics of the 
of the topographic 
the terrain. There- 
oy irregular triangu- 
ic mesh coarsening 
ain features. Such 
ene and for various 
mous visual naviga- 
ation systems (GIS) 
re. 
Using the nearly-planar patches as modeling primitives, the
detected local surface topographic features can be used to
automatically segment the region into collections of nearly
coplanar triangular patches. These patches are grouped using
generic models to describe interesting global scene features
(e.g., hills, valleys, mountains, and plains), which provide a
more abstract representation of the scene suitable for various
reasoning tasks. The original DEM data consists of 1200 x 1200
elevation measurements sampled regularly at 1 m intervals
around the westernmost section of Lake Ontario. In Figure 3,
the American shore is on the left side of the figure and the
Canadian shore is on the right side. Figure 3 (top) illustrates
an initial regular sub-sampling of the original triangular mesh
representation of the above-mentioned scene (used solely for
experimental reasons). The topographic mesh coarsening and the
scene feature detection and grouping reduce the storage
requirements by several orders of magnitude. The resulting mesh
is irregular in nature, with more points concentrated around
interesting regions of high feature density and far fewer
points in flat regions.
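
The segmentation into nearly coplanar triangular patches can be sketched as a region-growing pass over the mesh. This is only an illustrative sketch, not the authors' algorithm: the function names, the 10-degree angular tolerance, and the choice to compare each neighbour against the seed triangle's normal are all assumptions, and triangles are assumed to be non-degenerate.

```python
import math
from collections import defaultdict, deque


def unit_normal(a, b, c):
    """Unit normal of triangle (a, b, c), flipped so that its z component is >= 0."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    length = math.sqrt(sum(x * x for x in n))  # zero only for degenerate triangles
    n = tuple(x / length for x in n)
    return n if n[2] >= 0 else tuple(-x for x in n)


def group_patches(vertices, triangles, max_angle_deg=10.0):
    """Group edge-connected triangles whose normals stay within
    max_angle_deg (hypothetical tolerance) of the seed triangle's normal."""
    normals = [unit_normal(*(vertices[i] for i in tri)) for tri in triangles]

    # Map every undirected edge to the triangles that share it.
    edge_to_tris = defaultdict(list)
    for t, (i, j, k) in enumerate(triangles):
        for edge in ((i, j), (j, k), (k, i)):
            edge_to_tris[tuple(sorted(edge))].append(t)
    neighbours = defaultdict(set)
    for tris in edge_to_tris.values():
        for t in tris:
            neighbours[t].update(u for u in tris if u != t)

    cos_limit = math.cos(math.radians(max_angle_deg))
    label = [-1] * len(triangles)
    patches = []
    for seed in range(len(triangles)):
        if label[seed] != -1:
            continue
        label[seed] = len(patches)
        patch, queue = [seed], deque([seed])
        while queue:  # breadth-first growth from the seed
            t = queue.popleft()
            for u in neighbours[t]:
                dot = sum(p * q for p, q in zip(normals[seed], normals[u]))
                if label[u] == -1 and dot >= cos_limit:
                    label[u] = label[seed]
                    patch.append(u)
                    queue.append(u)
        patches.append(patch)
    return patches


# A flat unit square (two triangles) joined to a 45-degree ramp (two triangles).
vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (2, 0, 1), (2, 1, 1)]
triangles = [(0, 1, 2), (0, 2, 3), (1, 4, 5), (1, 5, 2)]
print(group_patches(vertices, triangles))  # [[0, 1], [2, 3]]
```

Comparing against the seed normal rather than the previous neighbour keeps a gently curving surface from drifting into one large patch; a fitted plane per patch would be a natural refinement.
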
Adopting the nearly-planar patches described above as modeling
primitives, this scene was subsequently reduced to
approximately 40 such patches. Generic models can be constructed
for various global scene features to detect important symbolic
entities. Figure 3 (bottom) shows the symbolic scene features
identified using generic models based on collections of
nearly-planar patches. Several topological representations of
this symbolic scene description can be formed (e.g., topological
graphs or entity-relationship diagrams) to support practical
symbolic reasoning and planning tasks.
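
One possible form of such a topological graph is a patch adjacency map, in which two patches are linked when they share at least one mesh edge. The sketch below is an assumption about how this could be built; the function name, the patch labels, and the adjacency criterion are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def patch_adjacency(triangles, patch_of):
    """Return {patch: set of patches that share at least one mesh edge with it}.

    triangles: list of (i, j, k) vertex-index triples.
    patch_of:  patch_of[t] is the (hypothetical) label of the patch owning triangle t.
    """
    edge_to_patches = defaultdict(set)
    for t, (i, j, k) in enumerate(triangles):
        for edge in ((i, j), (j, k), (k, i)):
            edge_to_patches[frozenset(edge)].add(patch_of[t])

    graph = defaultdict(set)
    for patches in edge_to_patches.values():
        for p in patches:
            # Also registers isolated patches, with an empty neighbour set.
            graph[p].update(patches - {p})
    return dict(graph)


# Two triangles labelled "plain" joined along one edge to two labelled "slope".
triangles = [(0, 1, 2), (0, 2, 3), (1, 4, 5), (1, 5, 2)]
patch_of = ["plain", "plain", "slope", "slope"]
print(patch_adjacency(triangles, patch_of))
```
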
  
Figure 3: A triangular mesh representation of the terrain (top)
with the detected nearly-planar patches (bottom).
  
International Archives of Photogrammetry and Remote Sensing. Vol. XXXI, Part B3. Vienna 1996 
5 RECOGNITION OF MAN-MADE STRUCTURE IN AERIAL IMAGES
For the experimental validation of our terrain model, we use
a set of geographic elevation data captured by aerial imaging
and covering an area of roughly 240 x 240 meters at a one-meter
resolution. This sensory data covers a plain with several
buildings of similar heights. For the experiments, we select
subregions with various resolutions and sizes, thereby
generating test data that covers a wide variety of scenes. For
simplicity and without loss of generality, we use rectangular
subsections of the terrain for the individual experiments.
This choice neither biases nor affects the validity of the
results. Figure 4 depicts the original range image of the entire
region. In the grey-scale format (Figure 4-top), the darker the
pixel, the higher its corresponding elevation. Figure 4 also
illustrates the detected houses (Figure 4-bottom) in a
triangular mesh representing the full details of the scene.
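
The selection of rectangular subregions at different resolutions, and the grey-scale convention in which darker pixels denote higher elevations, can be illustrated as follows. This is a minimal sketch under stated assumptions: the elevation grid is a plain list of rows, the function names are hypothetical, and the linear mapping to 8-bit grey levels is an assumption rather than the mapping used for Figure 4.

```python
def subregion(grid, row0, col0, rows, cols, step=1):
    """Crop a rows x cols window starting at (row0, col0),
    keeping every step-th sample in both directions."""
    return [row[col0:col0 + cols:step] for row in grid[row0:row0 + rows:step]]


def to_grey(grid):
    """Map elevations linearly to 0..255 so that the highest point is
    black (0) and the lowest is white (255)."""
    low = min(min(row) for row in grid)
    high = max(max(row) for row in grid)
    span = (high - low) or 1.0  # avoid division by zero on flat terrain
    return [[round(255 * (high - z) / span) for z in row] for row in grid]


# A small synthetic 4 x 4 elevation grid (values in meters, made up for illustration).
grid = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 6.0, 6.0, 0.0],
    [0.0, 6.0, 6.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
coarse = subregion(grid, 0, 0, 4, 4, step=2)  # every second sample
window = subregion(grid, 1, 1, 2, 3)          # full-resolution 2 x 3 window
print(coarse, window, to_grey(window))
```
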
Our building model is expressed in terms of a set of
nearly-planar patches as described earlier. Some of these
(corresponding to side walls) are far from horizontal and enclose
other raised patches (corresponding to roofs) which are close
to horizontal. Such a flexible model is very generic and can
also extract numerous other objects, such as buses and vans, if
they exist in the aerial scene. Therefore, we use some domain
knowledge and contextual constraints on the object's dimensions
to exclude spurious objects. If we were to recognize outdoor
vehicles using this generic model, only the parameters of the
constraints defining the model would need to be changed.
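
The idea of one generic model whose constraint parameters alone decide what is recognized can be sketched as below. Everything specific here is an assumption made for illustration: the `Patch` and `Constraints` types, the tilt thresholds, and all numeric ranges are hypothetical, and the test of whether the walls actually enclose the roof is left out (the walls are assumed to be the patches already found around it).

```python
import math
from dataclasses import dataclass


@dataclass
class Patch:
    normal: tuple      # unit normal (nx, ny, nz)
    elevation: float   # mean elevation, meters
    length: float      # longer footprint side, meters
    width: float       # shorter footprint side, meters


@dataclass
class Constraints:     # all values below are hypothetical
    max_roof_tilt_deg: float
    min_wall_tilt_deg: float
    height_m: tuple
    length_m: tuple
    width_m: tuple


def tilt_deg(patch):
    """Angle between the patch plane and the horizontal plane, in degrees."""
    return math.degrees(math.acos(min(1.0, abs(patch.normal[2]))))


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def matches_model(roof, walls, ground_elevation, c):
    """True if a near-horizontal raised patch with steep surrounding
    patches satisfies the dimensional constraints c."""
    height = roof.elevation - ground_elevation
    return (bool(walls)
            and tilt_deg(roof) <= c.max_roof_tilt_deg
            and all(tilt_deg(w) >= c.min_wall_tilt_deg for w in walls)
            and in_range(height, c.height_m)
            and in_range(roof.length, c.length_m)
            and in_range(roof.width, c.width_m))


# Same model, two parameter sets: only the constraint values differ.
BUILDING = Constraints(20.0, 60.0, (2.5, 40.0), (5.0, 100.0), (4.0, 60.0))
VEHICLE = Constraints(20.0, 60.0, (1.0, 4.0), (3.0, 15.0), (1.5, 3.0))

house_roof = Patch((0.0, 0.0, 1.0), 106.0, 12.0, 8.0)
van_roof = Patch((0.0, 0.0, 1.0), 102.2, 5.0, 2.0)
wall = Patch((1.0, 0.0, 0.0), 103.0, 12.0, 6.0)

print(matches_model(house_roof, [wall], 100.0, BUILDING))  # True
print(matches_model(van_roof, [wall], 100.0, BUILDING))    # False
print(matches_model(van_roof, [wall], 100.0, VEHICLE))     # True
```
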
We use a single generic model covering houses, apartment
buildings, and large hangars. When such a structure is
recognized, a set of derived parameters (e.g., height, area of
floor plan, volume, area of enclosing surface) is computed. These
parameters are used to distinguish the different objects using
domain-specific knowledge (e.g., the average size of a house
compared to a high apartment building). The derived parameters
are also used to reconstruct the detected objects' geometry if
required. Table 5 provides the results obtained for the scene in
question. The house labels are consistent with those in
Figure 4-top. The values reported here were computed using
the high-resolution mesh sampled at 2-meter intervals. The
center of gravity (C.O.G.) of each detected house is reported
in meters with respect to the origin at the top left corner of
Figure 4. Surfaces are given in square meters.
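
The derived parameters listed above can be computed from a footprint polygon and two elevations, as in the following sketch. The formulas are assumptions, since the text does not state how the parameters are derived: the structure is treated as a flat-roofed prism, the center of gravity is taken as the 2-D centroid of the footprint, and the enclosing surface is taken as walls plus roof. The function name and the sample coordinates are hypothetical.

```python
import math


def derived_parameters(footprint, ground_z, roof_z):
    """Derived parameters of a flat-roofed prism (an assumed simplification).

    footprint: list of (x, y) vertices of a simple polygon, in meters.
    """
    n = len(footprint)
    twice_area = cx = cy = perimeter = 0.0
    for i in range(n):
        x0, y0 = footprint[i]
        x1, y1 = footprint[(i + 1) % n]
        cross = x0 * y1 - x1 * y0           # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)

    area = abs(twice_area) / 2.0
    height = roof_z - ground_z
    return {
        "height": height,
        "floor_area": area,
        "volume": area * height,
        "enclosing_surface": perimeter * height + area,  # walls + roof
        "cog": (cx / (3.0 * twice_area), cy / (3.0 * twice_area)),
    }


# A 12 m x 8 m rectangular footprint with a roof 6 m above the ground
# (illustrative values, not taken from Table 5).
params = derived_parameters([(10, 20), (22, 20), (22, 28), (10, 28)], 100.0, 106.0)
print(params)
```

For this rectangle the sketch gives a floor area of 96 square meters, a volume of 576 cubic meters, an enclosing surface of 336 square meters, and a center of gravity at (16, 24).
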
We verified the capability of our man-made structure
recognition system at various resolution levels of the available
sensory range data. To this end, we applied the topographic mesh
coarsening algorithm mentioned earlier to the original dense
mesh representing this scene. After several coarsening
iterations, the number of vertices and triangles representing
the same scene decreases significantly. The coarser mesh
was then used to identify the same houses using the same
model described above. The houses were detected with high
accuracy (within the allowable errors in mesh coarsening),
yielding results almost identical to those for the houses
detected in the original dense mesh. Figure 6 illustrates a
close-up of the coarse mesh in the neighborhood of the houses
labeled 10 and 11 in Figure 4. It is clear from the table at
the bottom of Figure 6 that the results obtained from such
a coarser mesh are nearly identical to those obtained from
the original high-resolution mesh. The maximum error in the
location of a house's center of gravity is only 0.2 m. This
amounts to about 0.083%, which is consistent with 0.2 m taken
relative to the roughly 240 m extent of the scene.