Full text: Proceedings (Part B3b-2)

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B3b. Beijing 2008 
2. SYSTEM OVERVIEW 
Figure 1 shows the main components of our system. The
epipolar images are generated from the aerial images by an
epipolar resampling process. We obtain the disparity map
between the epipolar pairs by stereo matching, using area-based
matching with a non-parametric technique. From the disparity
map, we generate the DEM as a 3D terrain model. The building
location information extracted from the disparity map is used to
remove unnecessary line segments extracted in the low-level
process. After the 2D lines are generated, perceptual grouping is
applied to the filtered line segments in order to obtain
structural relationship features such as parallel line segment
pairs and U-shapes. These are used to generate rooftop
hypotheses. Among the generated hypotheses, the candidate
rooftops are selected by searching for closed cycles in an undirected
graph. Finally, we retrieve 3D buildings by applying 3D
triangulation to each line segment of the detected rooftops.
Figure 1. System Overview 
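The closed-cycle search mentioned above can be illustrated with a minimal sketch. The text here does not specify how the graph is built, so the sketch assumes (hypothetically) that nodes are integer identifiers of line-segment junctions and that edges are the line segments joining them. The function name find_closed_cycles is illustrative and not taken from the paper.

```python
def find_closed_cycles(edges):
    """Enumerate simple cycles (3 or more nodes) of a small undirected graph.

    edges: iterable of (a, b) pairs with comparable node ids (assumed ints).
    Each cycle is reported once, starting at its smallest node.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    cycles = []

    def extend(path, visited):
        start, last = path[0], path[-1]
        for nxt in sorted(adj[last]):
            if nxt == start and len(path) >= 3:
                # Keep only one of the two traversal directions.
                if path[1] < path[-1]:
                    cycles.append(list(path))
            elif nxt > start and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in sorted(adj):
        extend([start], {start})
    return cycles
```

For example, a four-sided outline with one dangling segment, find_closed_cycles([(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]), yields the single cycle [0, 1, 2, 3]. Exhaustive enumeration like this is only practical for small graphs.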
3. BUILDING REGION EXTRACTION 
3.1 Stereo Matching 
To obtain an accurate disparity map, we employ a multi-resolution
scheme, also referred to as hierarchical or pyramid processing. At
each resolution level, the correspondence problem is solved
by first computing the census-transformed images and then applying
Hamming distance correlation to the transformed images. The
census transform maps the local region surrounding a pixel
to a bit string representing which pixels have lesser intensities. For
example, in a window surrounding a pixel, if a particular pixel's
value is less than the centre pixel's, the corresponding position in
the bit string is set to 1; otherwise it is set to 0. The
two census-transformed images are then compared using a
similarity metric based on the Hamming distance, which is the
number of bits that differ between the two correlation window bit
strings. The Hamming distance (Banks, 1997) is summed over
the window:
$$\sum_{(u,v) \in W} \mathrm{Hamming}\bigl(I'_1(u, v),\, I'_2(x + u, y + v)\bigr) \qquad (1)$$
where $I'_1$, $I'_2$ represent the census transforms of $I_1$ and $I_2$, and
$W$ is the correlation window.
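As an illustration of the census transform and the windowed Hamming cost, here is a minimal pure-Python sketch for a single resolution level. Several details are assumptions rather than the paper's choices: the 3x3 default window, the bit ordering, treating out-of-image neighbours as 0 bits, the horizontal-only search over a disparity d, and the winner-takes-all selection. All function names are illustrative.

```python
def census_transform(img, radius=1):
    """Census-transform a 2D list of intensities into integer bit strings.

    A neighbour contributes bit 1 if its intensity is less than the centre
    pixel's, otherwise 0. Neighbours outside the image give 0 (assumption).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dx == 0 and dy == 0:
                        continue
                    bits <<= 1
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and img[yy][xx] < img[y][x]:
                        bits |= 1
            out[y][x] = bits
    return out


def hamming(a, b):
    """Number of differing bits between two census bit strings."""
    return bin(a ^ b).count("1")


def census_cost(c1, c2, x, y, d, radius=1):
    """Sum of Hamming distances over the window centred at (x, y) in c1
    and at (x + d, y) in c2. Positions outside the image are skipped."""
    h, w = len(c1), len(c1[0])
    total = 0
    for v in range(-radius, radius + 1):
        for u in range(-radius, radius + 1):
            y1, x1 = y + v, x + u
            x2 = x1 + d
            if 0 <= y1 < h and 0 <= x1 < w and 0 <= x2 < w:
                total += hamming(c1[y1][x1], c2[y1][x2])
    return total


def best_disparity(c1, c2, x, y, max_disp, radius=1):
    """Winner-takes-all: first disparity in [0, max_disp] with lowest cost."""
    costs = [census_cost(c1, c2, x, y, d, radius) for d in range(max_disp + 1)]
    return costs.index(min(costs))
```

In a pyramid scheme, the disparity found at a coarse level would typically seed a narrower search range at the next finer level. That step is omitted here.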
3.2 Suspected Building Region Extraction 
It is usually difficult to separate the objects of interest from the
collection of 2D line segments obtained in low-level feature extraction.
The boundaries of the objects of interest, the buildings, can be partly
occluded by vegetation, shadows, and other objects. In the rooftop
hypothesis process, these fragmented boundaries and the
presence of roads, vehicles, etc. can produce false hypotheses,
including unwanted rooftops and wrongly shaped rooftops. This
causes not only significant computational effort in processing
but also wrong final results. To solve this problem, the system
should be able to detect line segments that are within or near
buildings in the image. Here, we use suspected building regions,
which are extracted from the disparity map. The suspected building
regions are areas whose pixel values differ from those of the
surrounding area. The difference in pixel values between a
suspected building region and its surrounding area indicates a
difference in elevation, and hence the presence of higher
objects such as buildings, trees, etc. in those regions. In other
words, these regions can give us information about where the
buildings are located.
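A minimal sketch of how suspected building regions might be used to filter line segments, under stated assumptions: regions are taken to be 4-connected groups of pixels whose disparity exceeds a fixed threshold, and a segment is kept if either endpoint lies within a small pixel margin of a region. The threshold, the margin, the endpoint-only test and the function names are all illustrative.

```python
from collections import deque


def suspected_regions(disparity, threshold):
    """Label 4-connected groups of pixels whose disparity exceeds threshold.

    Returns (labels, count); labels[r][c] is 0 for background pixels and
    1..count for pixels belonging to a region.
    """
    rows, cols = len(disparity), len(disparity[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if disparity[r][c] > threshold and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and labels[ny][nx] == 0
                                and disparity[ny][nx] > threshold):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def segment_near_region(segment, labels, margin=1):
    """True if an endpoint of segment ((x0, y0), (x1, y1)) lies within
    `margin` pixels of any labelled region."""
    rows, cols = len(labels), len(labels[0])
    for x, y in segment:
        for ny in range(max(0, y - margin), min(rows, y + margin + 1)):
            for nx in range(max(0, x - margin), min(cols, x + margin + 1)):
                if labels[ny][nx] != 0:
                    return True
    return False
```

Line segments for which segment_near_region returns False would be discarded before perceptual grouping.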
The goal of the stereo matching process is to find a match between
the pixels in the first (reference) image R and the second (wrap) image W
such that the pixel located at (i, j) in the R image and the pixel
located at (i+I(i, j), j+J(i, j)) in the W image view the same
point in object space, i.e., W(i+I(i, j), j+J(i, j)) -> R(i, j), where
I(i, j) is the horizontal disparity map and J(i, j) is the vertical disparity
map. The index i (column index) is measured along scan lines
and the index j (row index) is measured across scan lines. In
this paper, we use epipolar resampled images, so J(i, j) = 0
for all i and j, and the relation reduces to W(i+I(i, j), j) ->
R(i, j).
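The reduced relation can be read as a per-row lookup: for each reference pixel (i, j), the matching pixel in W is found I(i, j) columns away on the same row. A minimal sketch, assuming integer disparities and images stored as lists of rows (so an element is addressed as image[j][i]); the function name is illustrative.

```python
def resample_to_reference(W, I):
    """Build an image aligned with R by sampling W at (i + I(i, j), j).

    W: second image as a list of rows; I: integer horizontal disparity map
    of the same shape. Positions that fall outside W are left as None.
    """
    rows, cols = len(I), len(I[0])
    out = [[None] * cols for _ in range(rows)]
    for j in range(rows):          # j: row index, across scan lines
        for i in range(cols):      # i: column index, along scan lines
            k = i + I[j][i]
            if 0 <= k < cols:
                out[j][i] = W[j][k]
    return out
```

With a correct disparity map, the result should agree with R wherever the scene point is visible in both images.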
For the correspondence problem, there are two popular
approaches. The first is Normalized Cross Correlation
(NCC), a typical area-based matching metric, and
the second is a non-parametric technique based on the census
transform (Zabih, 1998). We employ the census transform because
it preserves edges and is computationally simple.
The suspected building regions can be extracted using a simple height
threshold technique. Their boundaries are extracted by
convolving the disparity map with a Laplacian-of-Gaussian (LoG)
filter and then applying connected component analysis to obtain the
coordinates of the zero-crossing pixels in the convolution output. The
LoG operator, or convolution kernel, is defined as:
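$$\mathrm{LoG}(x, y) = \Delta G_\sigma(x, y) = \frac{\partial^2}{\partial x^2} G_\sigma(x, y) + \frac{\partial^2}{\partial y^2} G_\sigma(x, y) = -\frac{1}{\pi\sigma^4}\left[1 - \frac{x^2 + y^2}{2\sigma^2}\right] e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (2)$$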
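A minimal sketch of LoG filtering followed by zero-crossing detection on a disparity map. It samples the standard closed form of the Laplacian of a normalized 2D Gaussian. The rest is assumption: the kernel radius, subtracting the kernel mean so that flat areas give a near-zero response, replicated image borders, and a simple neighbour sign test used in place of the connected component analysis described above. Function names are illustrative.

```python
import math


def log_kernel(sigma, radius):
    """Sample the LoG closed form on a (2*radius+1) x (2*radius+1) grid.

    The kernel mean is subtracted so the samples sum to ~0 (a common
    practical adjustment, assumed here, not stated in the paper).
    """
    size = 2 * radius + 1
    k = [[0.0] * size for _ in range(size)]
    for v in range(-radius, radius + 1):
        for u in range(-radius, radius + 1):
            q = (u * u + v * v) / (2.0 * sigma * sigma)
            k[v + radius][u + radius] = (
                -(1.0 - q) * math.exp(-q) / (math.pi * sigma ** 4)
            )
    mean = sum(map(sum, k)) / (size * size)
    return [[val - mean for val in row] for row in k]


def convolve(img, kernel):
    """2D filtering with replicated borders (the kernel is symmetric, so
    correlation and convolution coincide)."""
    rows, cols = len(img), len(img[0])
    radius = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for v in range(-radius, radius + 1):
                yy = min(max(y + v, 0), rows - 1)
                for u in range(-radius, radius + 1):
                    xx = min(max(x + u, 0), cols - 1)
                    acc += kernel[v + radius][u + radius] * img[yy][xx]
            out[y][x] = acc
    return out


def zero_crossings(resp, eps=1e-6):
    """Return (x, y) pixels whose response has the opposite sign to the
    right or lower neighbour; magnitudes below eps are ignored."""
    rows, cols = len(resp), len(resp[0])
    found = set()
    for y in range(rows):
        for x in range(cols):
            a = resp[y][x]
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < rows and nx < cols:
                    b = resp[ny][nx]
                    if (a > eps and b < -eps) or (a < -eps and b > eps):
                        found.add((x, y))
    return found
```

On a disparity map containing a vertical step (low values on the left, high on the right), the response is positive just left of the step and negative just right of it, so the zero crossing marks the step location.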
	        