International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August – 01 September 2012, Melbourne, Australia
EVALUATION OF PENALTY FUNCTIONS FOR SEMI-GLOBAL MATCHING
COST AGGREGATION
Christian Banz, Peter Pirsch, and Holger Blume
Institute of Microelectronic Systems
Leibniz Universität Hannover, Hannover, Germany
{banz,pirsch,blume}@ims.uni-hannover.de
KEY WORDS: Stereoscopic, Quality, Matching, Vision, Reconstruction, Camera, Disparity Estimation, Semi-Global Matching
ABSTRACT:
The stereo matching method semi-global matching (SGM) relies on consistency constraints during the cost aggregation which are
enforced by so-called penalty terms. This paper proposes new penalty functions for SGM and evaluates four penalty functions.
Due to mutual dependencies between matching cost and penalty function, two types of matching cost calculation, census and rank
transform, are considered. Performance is measured using images from the Middlebury benchmark, both in their original form and
degraded by radiometric changes and noise. The two best performing penalty functions are inversely proportional and negatively
linear to the intensity gradient and perform comparably, with 6.05 % and 5.91 % average error, respectively. The experiments also
show that adaptive penalty terms are mandatory when dealing with difficult imaging conditions. Consequently, for highest algorithmic
performance in real-world systems, selection of a suitable penalty function and thorough parametrization with respect to the
expected image quality are essential.
1 INTRODUCTION
Calculating depth information by stereo matching (disparity esti-
mation) is a common image processing task in many remote sens-
ing applications. Typical applications of range cameras based
on stereo imaging include advanced driver assistance systems,
robotics, and keyhole surgery assistance systems. Crucial aspects
for real-world suitability are accuracy and density of the depth
map, which are especially difficult to achieve in untextured
areas. These requirements are further impacted by noise and dif-
ficult lighting conditions. Naturally, all of these effects occur in
real-world scenarios.
The semi-global matching algorithm (SGM) (Hirschmüller, 2008)
is among the top-performing algorithms in the ongoing Middle-
bury benchmark (Scharstein and Szeliski, 2012). The benchmark
originated from the studies in (Scharstein and Szeliski, 2002)
comparing state-of-the-art stereo methods using a controlled set
of test images with complex scene structure and varying texture.
It has also been shown that SGM is able to effectively deal with
the aforementioned issues (Hirschmüller and Scharstein, 2009).
Several combinations of matching cost functions and stereo meth-
ods were evaluated using original and degraded test images (e. g.
noise, exposure differences).
Furthermore, it has recently been shown that SGM can be im-
plemented in real-time on a variety of platforms. For example,
an FPGA implementation (Banz et al., 2011b) and a GPU imple-
mentation (Banz et al., 2011a) both reach over 60 fps for VGA
images with 128 pixel disparity range. The high algorithmic per-
formance and real-time capability make SGM very attractive for a
wide range of applications including low power embedded vision
systems and desktop systems with off-the-shelf hardware.
Of major relevance to the performance are the smoothness con-
straints that are imposed by SGM during the cost aggregation
step. These constraints are adapted to the image content by means
of so-called penalty functions which penalize abrupt changes in
the depth information when, according to image content, a change
of objects is unlikely. Therefore, the choice of penalty functions
has a significant influence on the algorithmic performance and
robustness. Despite the many surveys on SGM, the influence of
the penalty functions has not yet been investigated.
In this paper, new penalty functions for the cost aggregation step
of SGM are proposed and evaluated. Due to the mutual depen-
dency of matching cost function and penalty function, two match-
ing cost functions for initial correspondence hypothesis are con-
sidered. These are based on the rank transform and the census
transform (Zabih and Woodfill, 1994), both of which are often
used in systems for disparity estimation due to their good perfor-
mance and efficient implementation possibilities. Each penalty
function is parametrized for both matching cost functions us-
ing the established data sets with ground truth disparities from
(Scharstein and Szeliski, 2002) with and without additional con-
trolled radiometric changes of intensity similar to (Hirschmüller
and Scharstein, 2009) as well as noise. Evaluation is performed
in terms of, firstly, accuracy and density of the disparity map and,
secondly, the insensitivity to the degraded input images.
Section 2 reviews algorithmic background on semi-global match-
ing and disparity estimation. Section 3 details the methodology,
experiments and results for the different test sets. Conclusions
are drawn in Section 4.
2 STEREO MATCHING
It is important to distinguish between the initial similarity measure
(matching costs) between two pixels in the base and match
image (or left and right image, respectively) and the aggregation
method that uses these costs. In this work, rank transform and
census transform (Zabih and Woodfill, 1994) are considered as
matching cost functions and semi-global matching (Hirschmüller,
2008) is used for cost aggregation. Final disparity selection is
performed by a winner-take-all (WTA) approach.
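The winner-take-all selection can be illustrated with a minimal sketch. It assumes a hypothetical aggregated cost volume stored as nested Python lists, cost_volume[y][x][d]. The function name, the data layout, and the tie-breaking rule (smallest disparity wins) are assumptions made for illustration only and are not taken from the paper.

```python
def winner_take_all(cost_volume):
    """Pick, for each pixel, the disparity with the lowest aggregated cost.

    cost_volume[y][x][d] is the (hypothetical) aggregated cost of assigning
    disparity d to the pixel at column x, row y.
    """
    disparity_map = []
    for row in cost_volume:
        disparity_row = []
        for costs in row:
            # Index of the minimum cost; on ties, min() keeps the first
            # (i.e. smallest) disparity index it encounters.
            best_d = min(range(len(costs)), key=costs.__getitem__)
            disparity_row.append(best_d)
        disparity_map.append(disparity_row)
    return disparity_map


# Toy cost volume: one image row, three pixels, four disparity candidates.
toy_costs = [[[9, 4, 7, 8], [3, 5, 6, 2], [5, 5, 1, 1]]]
toy_disparities = winner_take_all(toy_costs)
```

Because WTA looks only at the per-pixel minimum, the quality of the resulting disparity map depends entirely on the preceding matching cost and aggregation steps.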
2.1 Rank Transform
Matching costs $C(\mathbf{p}, d)$ based on the rank transform (RT) of the
base and match image, $R_b$ and $R_m$, are calculated as
$$C(\mathbf{p}, d) = \left| R_b(p_x, p_y) - R_m(p_x - d, p_y) \right| \qquad (1)$$
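As a rough illustration of this matching cost, the sketch below computes a rank transform and then takes the absolute difference between the rank value at p in the base image and at the pixel shifted by d in the match image. It assumes the common definition of the rank transform following Zabih and Woodfill (1994), namely the number of pixels in a local window that are darker than the center pixel. The window radius, the clipping of the window at image borders, and all function names are assumptions for illustration, not the authors' implementation.

```python
def rank_transform(image, radius=1):
    """Count, for each pixel, the window pixels darker than the center.

    The window is clipped at the image border (an assumption; other
    border treatments are possible).
    """
    height, width = len(image), len(image[0])
    ranks = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    if image[ny][nx] < center:
                        count += 1
            ranks[y][x] = count
    return ranks


def rank_matching_cost(rank_base, rank_match, px, py, d):
    """Absolute rank difference between base pixel (px, py) and match pixel (px - d, py)."""
    if px - d < 0:
        raise ValueError("disparity shifts the pixel outside the match image")
    return abs(rank_base[py][px] - rank_match[py][px - d])


# Toy single-row images; the match row is the base row shifted by one pixel.
base_row = [[5, 9, 2, 7, 4]]
match_row = [[9, 2, 7, 4, 6]]
rank_base = rank_transform(base_row)
rank_match = rank_transform(match_row)
costs_at_x3 = [rank_matching_cost(rank_base, rank_match, 3, 0, d) for d in range(3)]
```

In this toy case the cost at the true shift of one pixel is lower than at the neighbouring disparity candidates, which is the behaviour the aggregation step relies on.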