Beijing 2008
METHODS FOR IMAGE FUSION QUALITY ASSESSMENT
- A REVIEW, COMPARISON AND ANALYSIS
Yun Zhang
Department of Geodesy and Geomatics Engineering
University of New Brunswick
Fredericton, New Brunswick, Canada
Email: YunZhang@UNB.ca
Commission VII, WG VII/6
KEY WORDS: Remote Sensing, Digital, Comparison, Fusion, Accuracy
ABSTRACT:
This paper evaluates and analyzes seven frequently used image fusion quality assessment methods to determine whether
they can provide convincing image quality or similarity measurements. The seven indexes are Mean Bias (MB), Variance
Difference (VD), Standard Deviation Difference (SDD), Correlation Coefficient (CC), Spectral Angle Mapper (SAM), Relative
Dimensionless Global Error (ERGAS), and Q4 Quality Index (Q4), which were also used in the IEEE GRSS 2006 Data Fusion
Contest. Four testing images are generated to evaluate the indexes. Visual comparison and digital classification demonstrate that the
four testing images have the same quality for remote sensing applications; however, the seven evaluation methods provide different
measurements indicating that the four images have varying qualities. The image fusion quality evaluation by Alparone et al. (2004)
and that by the IEEE GRSS 2006 Data Fusion Contest (Alparone et al., 2007) are also analyzed. Significant discrepancies between the
quantitative measurements, visual comparison and final ranking have been found in both evaluations. The inconsistency between the
visual evaluations and quantitative analyses in the above three cases demonstrates that the seven quantitative indicators cannot provide
reliable measurements for quality assessment of remote sensing images.
1. INTRODUCTION
Image fusion, especially the fusion between low resolution
multispectral (MS) images and high resolution panchromatic
(Pan) images, is important for a variety of remote sensing
applications, because most remote sensing sensors, such as
Landsat 7, SPOT, Ikonos, QuickBird, GeoEye-1, and
WorldView-2, simultaneously collect low resolution MS and
high resolution Pan images. To effectively fuse the MS and Pan
images, numerous image fusion techniques have been
developed with varying advantages and limitations. However,
evaluating image fusion quality in a way that yields
convincing results has remained a challenging topic
among image fusion researchers and users of image fusion
products.
In research publications, the widely used image fusion quality
evaluation approaches can be grouped into two main categories:
(1) Qualitative approaches, which involve visual
comparison of the colour between original MS and fused
images, and the spatial detail between original Pan and
fused images.
(2) Quantitative approaches, which involve a set of
pre-defined quality indicators for measuring the spectral
and spatial similarities between the fused image and the
original MS and/or Pan images.
Because qualitative approaches (visual evaluations) may
contain subjective factors and may be influenced by personal
preference, quantitative approaches are often required to verify
the correctness of the visual evaluation.
For quantitative evaluation, a variety of fusion quality
assessment methods have been introduced by different authors.
The quality indexes/indicators introduced include, for example,
Standard Deviation (SD), Mean Absolute Error (MAE), Root
Mean Square Error (RMSE), Sum Squared Error (SSE) based
Index, Agreement Coefficient based on Sum Squared Error
(SSE), Mean Square Error (MSE) and Root Mean Square Error,
Information Entropy, Spatial Distortion Index, Mean Bias Error
(MBE), Bias Index, Correlation Coefficient (CC), Warping
Degree (WD), Spectral Distortion Index (SDI), Image Fusion
Quality Index (IFQI), Spectral Angle Mapper (SAM), Relative
Dimensionless Global Error (ERGAS), Q Quality Index (Q),
and Q4 Quality Index (Q4) (e.g., Wald et al., 1997; Buntilov
and Bretschneider, 2000; Li, 2000; Wang et al., 2002; Piella and
Heijmans, 2003; Wang et al., 2004; Alparone et al., 2004;
Willmott and Matsuura, 2005; Wang et al., 2005; and Ji and
Gallo, 2006). However, it is not easy for a quantitative
method to provide convincing measurements either, and the
authors of the quantitative evaluation papers have not yet
agreed on a commonly accepted evaluation method.
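As a rough illustration of what such indicators measure, the sketch below computes four of the indexes named above (a mean bias, RMSE, CC and SAM) from their common textbook definitions. The function names, the sign convention of the mean bias (reference minus fused) and the toy pixel values are assumptions made for illustration only; they are not the definitions or data used in this paper.

```python
import math


def mean_bias(reference, fused):
    """Mean of the reference band minus mean of the fused band (assumed sign convention)."""
    return sum(reference) / len(reference) - sum(fused) / len(fused)


def rmse(reference, fused):
    """Root mean square error between two bands of equal length."""
    squared = [(r - f) ** 2 for r, f in zip(reference, fused)]
    return math.sqrt(sum(squared) / len(squared))


def correlation_coefficient(reference, fused):
    """Pearson correlation coefficient between two bands of equal length."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_f = sum(fused) / n
    cov = sum((r - mean_r) * (f - mean_f) for r, f in zip(reference, fused))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_f = sum((f - mean_f) ** 2 for f in fused)
    return cov / math.sqrt(var_r * var_f)


def spectral_angle(ref_pixel, fused_pixel):
    """Angle in radians between two spectral vectors (one value per band)."""
    dot = sum(r * f for r, f in zip(ref_pixel, fused_pixel))
    norm = math.sqrt(sum(r * r for r in ref_pixel)) * math.sqrt(
        sum(f * f for f in fused_pixel)
    )
    # Clamp to [-1, 1] to guard against floating-point round-off before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)


# Hypothetical toy data: one band flattened to a list of pixel values.
reference_band = [10.0, 20.0, 30.0, 40.0]
offset_band = [value + 5.0 for value in reference_band]  # constant brightness offset

# Hypothetical three-band pixel and the same pixel with doubled brightness.
reference_pixel = [10.0, 20.0, 30.0]
scaled_pixel = [2.0 * value for value in reference_pixel]

if __name__ == "__main__":
    print("MB  :", mean_bias(reference_band, offset_band))
    print("RMSE:", rmse(reference_band, offset_band))
    print("CC  :", correlation_coefficient(reference_band, offset_band))
    print("SAM :", spectral_angle(reference_pixel, scaled_pixel))
```

In this toy case a constant brightness offset changes the mean bias and the RMSE while CC stays at 1, and a pure scaling of a pixel spectrum leaves SAM at essentially 0, which hints at why different indexes can disagree about the same image pair.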
In the practice of image fusion quality evaluation, it has been
commonly noticed by researchers that the evaluation results can
be affected (1) by the display conditions of the images when
qualitative (visual) evaluation is conducted, and (2) by the
selection of quantitative indicators (indexes) when quantitative
assessment is performed.
• For visual evaluations, if a comparison is not
conducted under the same visualization condition, i.e. if
the images are not stretched and displayed under the
same condition, the comparison will not provide reliable
results. For example, an original MS image usually
appears dark when no histogram stretching is applied,
and it appears significantly different when different
stretches are applied (examples can be found in Figure
1). These different appearances are not caused by the
quality difference, but just by the conditions of the