The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B1. Beijing 2008
4 Calculating a normalized digital elevation model (nDEM)
5 Creating true orthophotos
6 Classification
7 Object extraction
8 Object modeling
9 Representing the object models through geometric
primitives and exporting in a suitable 3D format
3.1 Preprocessing of the raw imagery
The Ikonos images are accompanied by rational polynomial
coefficients (RPCs) describing the sensor model, orbit, and
attitude data. These 80 coefficients together with 10 scale and
offset parameters describe rational polynomial functions
linking the geographical coordinates latitude, longitude and
height above WGS84 ellipsoid with the pixel coordinates of
each image (Jacobsen et al., 2005, Grodecki et al., 2004).
Unfortunately, the absolute positioning accuracy of the Ikonos
RPCs is only about 10 to 50 m. The preprocessing step therefore
has to establish a consistent relative orientation of the two
images. To this end, the two stereo images undergo an image
matching process that delivers corresponding points in the two
images. Given the pixel coordinates of each point pair in both
images and the requirement that both points of a pair share the
same absolute height, one of the RPCs can be corrected by
minimizing the residuals so that it fits the other
(Lehner et al., 2007). In the case of Ikonos images this
correction is mostly only a simple shift.
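The structure of such a sensor model and of a shift-only correction can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of Lehner et al. (2007): the dictionary keys, the function names and the monomial ordering are hypothetical (real RPC files prescribe their own term order), and the shift is estimated here simply as the mean residual between projected and matched pixel positions, which is the least-squares solution for a pure translation.

```python
import itertools
import numpy as np

# Exponent triples (i, j, k) with i + j + k <= 3: the 20 monomials of a cubic
# polynomial in three variables. Four such polynomials (numerator and
# denominator for row and column) give the 80 coefficients.
# ASSUMPTION: this ordering is illustrative, not that of any RPC file format.
EXPONENTS = [e for e in itertools.product(range(4), repeat=3) if sum(e) <= 3]

def cubic(coeffs, p, l, h):
    """Evaluate a 20-term cubic in normalized latitude, longitude, height."""
    return sum(c * p**i * l**j * h**k for c, (i, j, k) in zip(coeffs, EXPONENTS))

def rpc_project(rpc, lat, lon, height):
    """Map geographic coordinates to (row, col) with a rational polynomial model."""
    # Normalize the inputs with their offset and scale parameters.
    p = (lat - rpc["lat_off"]) / rpc["lat_scale"]
    l = (lon - rpc["lon_off"]) / rpc["lon_scale"]
    h = (height - rpc["h_off"]) / rpc["h_scale"]
    # Each image coordinate is a ratio of two cubic polynomials.
    row = cubic(rpc["row_num"], p, l, h) / cubic(rpc["row_den"], p, l, h)
    col = cubic(rpc["col_num"], p, l, h) / cubic(rpc["col_den"], p, l, h)
    # De-normalize the outputs (5 offsets + 5 scales = 10 parameters in total).
    return (row * rpc["row_scale"] + rpc["row_off"],
            col * rpc["col_scale"] + rpc["col_off"])

def estimate_shift(projected, measured):
    """Least-squares pure translation between two sets of pixel coordinates."""
    residuals = np.asarray(measured, dtype=float) - np.asarray(projected, dtype=float)
    return residuals.mean(axis=0)
```

In such a setup the estimated shift would be added to the row and column offsets of the RPC that is to be corrected.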
In addition, a pan-sharpened image pair is generated from the
pan channel and the quarter-resolution multispectral channels.
3.2 Creating the digital surface model (DSM)
The most crucial step in the processing chain is the generation
of a good digital surface model from the optical VHR stereo
image pair. For this task, various methods were analyzed and
rated for their usability with such imagery. The four
evaluated methods were:
• Dynamic line warping, “DLW” (Krauß et al., 2005)
• Semi global matching, “SGM” (Hirschmüller, 2005)
• GraphCut (Collins, 2004)
• Standard (Lehner and Gill, 1992)
As a first approach to generating the DSM from a stereo image
pair, the so-called “standard” approach was analyzed. It
was developed for the generation of digital surface models
from images of the DLR three-line scanner camera MOMS
(MOMS, 1998) flown on the MIR space station. The method is
based on classical area-based matching relying on extracted
interest points and an optimized region growing. In urban
situations containing many steep edges and relatively large
incidence angles - as used in the standard stereo products of
the satellite imagery providers - only a small number of usable
3D points remains due to large occlusions.
Therefore, dense stereo approaches such as dynamic line warping
and semi-global matching were also implemented and analyzed for
inclusion in the automatic processing chain. Such dense stereo
methods, however, depend on strict epipolar geometry. A good
overview of a selection of such algorithms is given on the
Stereo Vision Research Page of Middlebury College
maintained by Daniel Scharstein and Richard Szeliski
(Scharstein and Szeliski, 2008).
Dynamic line warping applies a speech recognition algorithm
based on dynamic programming to co-registered image lines in
epipolar direction. Corresponding epipolar lines of the two
stereo images are correlated with each other, and the local
distortions along the lines are calculated, which yield the
local parallaxes. Because the images are correlated only line
by line, the method lacks inter-line information, which
results in streaking effects along the epipolar lines.
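The line-by-line idea can be illustrated with a small dynamic-programming alignment of two scanlines. This is a minimal sketch in the spirit of dynamic time warping, not the DLW implementation of Krauß et al. (2005): the function name, the absolute-difference matching cost, the fixed occlusion penalty and the sign convention of the disparity are all assumptions made for illustration.

```python
import numpy as np

def scanline_dp(left, right, occlusion_cost=10.0):
    """Align two epipolar lines by dynamic programming.

    Returns one disparity per left pixel (left index minus right index);
    pixels without a partner in the other line are marked NaN.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)
    # Backtracking table: 0 = match, 1 = skip a left pixel, 2 = skip a right pixel.
    move = np.zeros((n + 1, m + 1), dtype=int)
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = (
                cost[i - 1, j - 1] + abs(left[i - 1] - right[j - 1]),
                cost[i - 1, j] + occlusion_cost,
                cost[i, j - 1] + occlusion_cost,
            )
            move[i, j] = int(np.argmin(candidates))
            cost[i, j] = candidates[move[i, j]]
    # Walk back along the cheapest path and read off the local parallaxes.
    disparity = np.full(n, np.nan)
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return disparity
```

Since every line pair is solved on its own, nothing ties the result of one line to that of its neighbours, which is where the streaking described above comes from.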
Figure 4. Top (left to right): digital surface models calculated
with the methods DLW and SGM, bottom: calculated with
GraphCut and Standard
Semi-global matching is an extension of this approach. Here,
not only two single image lines are compared; the energy
function used by the dynamic programming also integrates
additional information from the whole image (“semi
global”). This extension leads to far fewer streaking effects
but also to increased processing time.
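The cost aggregation along a single path can be sketched as follows. This is a minimal sketch that follows the path recurrence published by Hirschmüller (2005); the function names and the penalty values p1 and p2 are illustrative assumptions, and a full implementation would sum such paths over several directions across the whole image instead of only forward and backward along one line.

```python
import numpy as np

def aggregate_path(cost, p1=1.0, p2=8.0):
    """Aggregate matching costs along one path.

    cost has shape (pixels along the path, disparities). A disparity change
    of one level is penalized with p1, any larger jump with p2.
    """
    cost = np.asarray(cost, dtype=float)
    agg = np.empty_like(cost)
    agg[0] = cost[0]
    for x in range(1, cost.shape[0]):
        prev = agg[x - 1]
        prev_min = prev.min()
        # Costs of the neighbouring disparities d-1 and d+1 (inf at the borders).
        minus = np.concatenate(([np.inf], prev[:-1])) + p1
        plus = np.concatenate((prev[1:], [np.inf])) + p1
        best = np.minimum(np.minimum(prev, minus), np.minimum(plus, prev_min + p2))
        # Subtracting prev_min keeps the aggregated values bounded.
        agg[x] = cost[x] + best - prev_min
    return agg

def sgm_line(cost, p1=1.0, p2=8.0):
    """Sum a forward and a backward path and pick the cheapest disparity."""
    cost = np.asarray(cost, dtype=float)
    total = aggregate_path(cost, p1, p2) + aggregate_path(cost[::-1], p1, p2)[::-1]
    return total.argmin(axis=1)
```

A pixel whose raw matching costs are ambiguous then inherits the disparity supported by its neighbours along the paths.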
The GraphCut algorithm describes the “matching space” (all
3D points of the scene) as a discrete mathematical graph
with nodes at each (x,y,z) coordinate and
rectangular edges connecting these nodes. The calculation of a
“maximum flow” through this graph yields the corresponding
“minimum cut”, which represents the sought surface. This
method cannot generate sub-pixel height levels, so
a finer height resolution requires a more complex
graph and hence a sharply increasing processing time.
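The relation between maximum flow and minimum cut can be illustrated on a toy graph. This is a minimal sketch using the Edmonds-Karp algorithm, not the formulation of the cited GraphCut method: the graph layout (one chain of nodes per pixel column, with matching costs as edge capacities along the height axis and small capacities between neighbouring columns as smoothness terms), the node names and all cost values are invented for illustration.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph.

    Returns the flow value and the set of nodes on the source side of a
    minimum cut.
    """
    # Residual graph: copy the capacities and add missing reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No path left: the nodes still reachable form the source side.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then push that flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

def toy_graph():
    """Two pixel columns a and b with three height levels each (invented costs)."""
    inf = float("inf")
    return {
        "s": {"a0": inf, "b0": inf},
        "a0": {"a1": 4}, "a1": {"a2": 1, "b1": 1}, "a2": {"a3": 5, "b2": 1},
        "b0": {"b1": 1}, "b1": {"b2": 4, "a1": 1}, "b2": {"b3": 5, "a2": 1},
        "a3": {"t": inf}, "b3": {"t": inf},
    }
```

In this toy graph the cut separates a1 from a2 and b0 from b1, so column a receives height level 1 and column b height level 0. Because each height level needs its own layer of nodes, halving the height step doubles the number of layers, which is the cost growth described above.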
The analysis of all algorithms with respect to a given ground
truth defined by a laser DSM of the Munich scene leads to the
following ranking of the methods (Pentenrieder, 2008):