All content in this area was uploaded by Ismail Avcibas on Dec 11, 2014
STATISTICAL ANALYSIS OF IMAGE QUALITY MEASURES
İsmail Avcıbas, Bülent Sankur
Department of Electrical and Electronic Engineering, Bogaziçi University, İstanbul, Turkey
avcibas@busim.ee.boun.edu.tr sankur@boun.edu.tr
ABSTRACT
In this paper we conduct a statistical analysis of the sensitivity
and consistency behavior of objective image quality measures.
We categorize the quality measures and compare them for still
image compression applications. The measures have been
categorized into pixel difference-based, correlation-based,
edge-based, spectral-based, context-based and HVS-based
(Human Visual System-based) measures. The mutual
relationships between the measures are visualized by plotting
their Kohonen maps. Their consistency and sensitivity to
coding as well as additive noise and blur are investigated via
analysis of variance (ANOVA) of their scores. Measures based
on the HVS, on the phase spectrum and on multiresolution mean
square error are found to be the most discriminative with
respect to coding artifacts.
1 INTRODUCTION
Image quality measures are figures of merit used for the
evaluation of imaging systems or of coding/processing
techniques. Our main goal in this study is to investigate
the statistical discriminative power of several quality
measures to distortion due to compression, additive
noise and blurring. We determine the commonalities
between these measures with a view to ultimately extract
and combine a subset of measures which will satisfy
most of the image quality desiderata [1,2,3].
We consider in this paper six categories of distortion
measures, namely: a) pixel difference-based, b)
correlation-based, c) edge-based, d) spectral-based, e)
context-based and f) HVS-based (Human Visual
System-based) measures. We compute numerical scores for
30 of these measures. In our comparisons of the image
quality measures for compression, we used two well-known
compression algorithms: the DCT-based JPEG [6] and the
wavelet-based Set Partitioning in Hierarchical Trees
(SPIHT) algorithm [7].
2 IMAGE QUALITY MEASURES
The formulae of the image quality measures are given in
Table A. We denote the multispectral components of an
image at the pixel position i, j and in band k as
$C_k(i,j)$, where $k = 1, \dots, K$ and $i, j = 1, \dots, N$.
The boldface symbols $\mathbf{C}(i,j)$ and
$\hat{\mathbf{C}}(i,j)$ indicate the multispectral pixel
vectors of the original and distorted image, respectively.
$\mathbf{C}$ itself denotes a generic K-band image.
2.1 Measures Based on Pixel Difference
In the pixel difference-based measures D1-D8 in Table
A, the first four correspond to, respectively, the Mean
Square Error, the error in the L*a*b* space, the Minkowski
measure and the Maximum Difference. The noise-prone
nature of the maximum difference metric can be
mitigated by ranking the pixel vector differences in
descending order and by considering the r.m.s. of the
largest r ones (D5). D6 is the Czekanowski distance, D7 is
the difference over a neighborhood, and D8 is the
multispectral multiresolution distance measure.
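As a rough illustration of three of these measures, the following sketch computes D1, D4 and D5 for a toy image stored as nested lists. It assumes that the pixel vector difference is the Euclidean norm and that images are indexed as image[i][j][k]; the function names and the sample values are hypothetical and not taken from the paper.

```python
import math

def pixel_vector_errors(orig, dist):
    """Euclidean norm of the pixel vector difference at every position.

    Images are nested lists indexed as image[i][j][k] (row, column, band).
    """
    return [
        math.sqrt(sum((c - d) ** 2 for c, d in zip(p, q)))
        for row_o, row_d in zip(orig, dist)
        for p, q in zip(row_o, row_d)
    ]

def mean_square_error(orig, dist):
    """D1: squared vector error averaged over all pixels and bands."""
    bands = len(orig[0][0])
    errors = pixel_vector_errors(orig, dist)
    return sum(e ** 2 for e in errors) / (bands * len(errors))

def max_difference(orig, dist):
    """D4: largest band-averaged absolute difference over all pixels."""
    return max(
        sum(abs(c - d) for c, d in zip(p, q)) / len(p)
        for row_o, row_d in zip(orig, dist)
        for p, q in zip(row_o, row_d)
    )

def sorted_max_difference(orig, dist, r):
    """D5: r.m.s. of the r largest pixel vector differences (assumed Euclidean)."""
    largest = sorted(pixel_vector_errors(orig, dist), reverse=True)[:r]
    return math.sqrt(sum(e ** 2 for e in largest) / len(largest))

# Toy 2x2 image with K = 2 bands; the values are made up for illustration.
original = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
distorted = [[[10, 20], [33, 44]], [[50, 60], [70, 86]]]

print(mean_square_error(original, distorted))
print(max_difference(original, distorted))
print(sorted_max_difference(original, distorted, r=2))
```

Averaging only the r largest errors keeps the sensitivity of the maximum difference to local damage while making it less dependent on a single outlying pixel.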
2.2 Correlation Based Measures
Three correlation measures often referred to in the
literature [9] are: Structural Content C1, Normalized
Cross Correlation C2, and Image Fidelity C3. A variant
of correlation based measures can be obtained by
considering the absolute mean and variance statistics,
C4, C5, of the angles between the pixel vectors of the
original and coded images.
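The sketch below illustrates C1 and C2 for a single band and the angle statistics C4 and C5 for a list of pixel vectors. It assumes flat Python lists as input and follows the expressions of Table A; the function names are hypothetical, and for a K-band image C1 and C2 would additionally be averaged over the bands.

```python
import math

def structural_content(orig, dist):
    """C1 for a single band given as flat lists of pixel values."""
    return sum(c * c for c in orig) / sum(d * d for d in dist)

def normalized_cross_correlation(orig, dist):
    """C2 for a single band given as flat lists of pixel values."""
    return sum(c * d for c, d in zip(orig, dist)) / sum(c * c for c in orig)

def pixel_angles(orig_vectors, dist_vectors):
    """Angle in radians between corresponding multispectral pixel vectors."""
    angles = []
    for p, q in zip(orig_vectors, dist_vectors):
        dot = sum(a * b for a, b in zip(p, q))
        norms = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
        # Clamp to [-1, 1] to guard against rounding; zero vectors are not handled.
        angles.append(math.acos(max(-1.0, min(1.0, dot / norms))))
    return angles

def angle_mean_and_spread(orig_vectors, dist_vectors):
    """C4 (mean absolute angle) and C5 (standard deviation of the angles)."""
    angles = pixel_angles(orig_vectors, dist_vectors)
    mean = sum(abs(t) for t in angles) / len(angles)
    spread = math.sqrt(sum((t - mean) ** 2 for t in angles) / len(angles))
    return mean, spread

# Made-up data: one band of four pixels, and two 2-band pixel vectors.
band = [1, 2, 3, 4]
coded_band = [1, 2, 3, 5]
print(structural_content(band, coded_band))
print(normalized_cross_correlation(band, coded_band))
print(angle_mean_and_spread([(1, 0), (1, 1)], [(1, 0), (1, 0)]))
```

For identical images C1 and C2 both equal 1 and the angle statistics are zero, so deviations from these values indicate distortion.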
2.3 Edge Quality Measures
In the perception of scene contents, edges play the
dominant role. We first used the distance measure
introduced by Pratt (E1). The second measure is the edge
stability measure (E2), which is defined as the
consistency, $Q(i,j)$, of edge evidences across different
scales in both the original and coded images. The third
measure that we consider in this category is based on
surface properties, i.e. mean and Gaussian curvatures (E3).
2.4 Spectral Distance Measures
The distortions occurring in the phase and magnitude
spectra, $\varphi(u,v) = \arctan(\Gamma(u,v))$ and
$M(u,v) = |\Gamma(u,v)|$, are indicated in the spectral
magnitude S1, spectral phase S2 and combined S3 measures
($\Gamma_k(u,v)$ and $\hat{\Gamma}_k(u,v)$ denote the k'th
band spectra). Alternatively, the spectral distortion can be
calculated over the transforms taken over the blocks of the
image bands. A global quality metric can be obtained by
finding a statistic of the block-based distortions.
Distortion measures S4, S5, S6 have been obtained as the
median of the block distortions (magnitude, phase and
combined penalty figures $J_M^l$, $J_\varphi^l$ and $J^l$,
respectively). Block sizes of 8 to 32 have been found
adequate.
2.5 Context Measures
Most compression algorithms and computer vision tasks are
based on the neighborhood information of the pixels. In this
sense, any loss of contextual information could be a good
measure of image quality. Such statistical information lies
in the context probability, that is, the p.m.f. ($p$) of
pixels in a neighborhood. Changes in the context
probabilities can be used to track loss in quality.
The high-dimensional (at least 9-12) p.m.f. is estimated
by judicious use of kernel estimation and cluster
analysis for multispectral images [8]. Once the p.m.f.'s
are obtained, the rate distortion-based measure $R(p)$ (X1)
and various f-divergence-based measures X2, X3, X4 can be
obtained for image quality measurement.
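To make the f-divergence idea concrete, here is a minimal sketch that evaluates three such divergences between two discrete p.m.f.'s. The integrals are replaced by sums over a finite alphabet, and the identification of X2 with the variational distance and X3 with the Hellinger distance is an assumption made for this example; X4 is taken as a generalized Matusita distance with parameter r. The function names and sample p.m.f.'s are hypothetical.

```python
import math

def variational_distance(p, q):
    """Half the sum of absolute differences between two p.m.f.'s (assumed X2)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def hellinger_distance(p, q):
    """Half the sum of squared differences of square roots (assumed X3)."""
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def matusita_distance(p, q, r=2):
    """Generalized Matusita distance with r >= 1 (assumed X4)."""
    return sum(abs(a ** (1 / r) - b ** (1 / r)) ** r for a, b in zip(p, q))

# Made-up context p.m.f.'s of an original and a coded image.
p_orig = [0.5, 0.5]
p_coded = [0.25, 0.75]

print(variational_distance(p_orig, p_coded))
print(hellinger_distance(p_orig, p_coded))
print(matusita_distance(p_orig, p_coded, r=2))
```

All three divergences vanish when the two p.m.f.'s coincide and grow as coding changes the neighborhood statistics.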
2.6 Human Visual System Based (HVS) Measures
The incorporation of even a simplified HVS model into
objective measures reportedly [1, 2, 10] leads to a better
correlation with the subjective ratings. Let the human
visual system be modeled as a band-pass filter [9] with a
transfer function in polar coordinates

$$H(\rho) = \begin{cases} 0.05\, e^{\rho^{0.554}}, & \rho < 7 \\ e^{-9\left[\left|\log_{10}\rho - \log_{10} 9\right|\right]^{2.3}}, & \rho \ge 7 \end{cases}$$

where $\rho = (u^2 + v^2)^{1/2}$, u and v being the spatial
frequencies. Both the original and coded images are
preprocessed via this filter to simulate the HVS effect.
The operation of multiplying the DCT of the image by the
spectral mask above and then applying the inverse DCT is
denoted by the $U\{\cdot\}$ operator in H1-H3.
Some possible measures for multispectral images are
given as H1, H2 and H3. The multiscale model H4 is too
detailed to be described here [10], but it includes channels
which account for perceptual phenomena such as color,
contrast, color-contrast and orientation selectivity. From
these channels, features are extracted, and an aggregate
measure of similarity is then formed as a weighted linear
combination of the feature differences.
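The band-pass transfer function can be sketched in a few lines. The code below assumes the piecewise form 0.05 exp(rho^0.554) for rho < 7 and exp(-9 |log10(rho) - log10(9)|^2.3) for rho >= 7, with an absolute value inside the bracket so that the result stays real for 7 <= rho < 9. The helper hvs_mask and its freq_per_index parameter, which maps a transform index to a spatial frequency, are hypothetical.

```python
import math

def hvs_transfer(rho):
    """Band-pass HVS transfer function H(rho) for a radial frequency rho >= 0."""
    if rho < 7:
        return 0.05 * math.exp(rho ** 0.554)
    # Absolute value is an assumption; it keeps the power real below rho = 9.
    return math.exp(-9 * abs(math.log10(rho) - math.log10(9)) ** 2.3)

def hvs_mask(n, freq_per_index=1.0):
    """n x n spectral mask with rho = freq_per_index * sqrt(u^2 + v^2)."""
    return [
        [hvs_transfer(freq_per_index * math.hypot(u, v)) for v in range(n)]
        for u in range(n)
    ]

for rho in (0, 3, 7, 9, 20):
    print(rho, hvs_transfer(rho))
```

Under this form the response peaks at rho = 9 and the two branches nearly meet at rho = 7. Applying the U operator would then amount to multiplying the DCT coefficients of each band by such a mask and inverting the transform.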
3 RESULTS AND STATISTICAL ANALYSIS
3.1 Data and Methods
In our comparison of image quality measures we used
ten multispectral satellite images compressed at five
different bit rates with the DCT-based JPEG and the
wavelet-based SPIHT compressors. The selected bit
rates were 0.35, 0.65, 0.85, 1.30 and 1.95 bits/pixel,
which were experimentally determined to reflect five
categories of quality.
3.2 Discriminative Power of The Measures
Since the image quality scores overall were approximately
normally distributed, Analysis of Variance (ANOVA)
could be used to analyze the results. Using both box
plots and ANOVA F-tests, we have compared the groups
of compressed images at different bit rates. Box plots of
scores with a sharp slope and little overlap, and high
F scores in the ANOVA tests, are both indicative of a good
measure of quality. The ANOVA results of the 30 image
quality measures, grouped by category, for the pooled data
obtained from the JPEG and SPIHT compression algorithms are
given in Table 1.
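As an illustration of how an F score separates groups, the sketch below computes a one-way ANOVA F statistic with the bit rate as the only factor. This is a simplification of the two-way design used for Table 1, and the scores are made up for the example; the function name is hypothetical.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up scores of one quality measure at three bit rates.
scores_by_bit_rate = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_f(scores_by_bit_rate))
```

A measure whose scores change strongly with bit rate and vary little across images at a fixed bit rate yields a large F value, which corresponds to box plots with little overlap.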
This analysis aims to identify quality measures that are
sensitive in a consistent way to image quality variations
due to compression (F scores with respect to bit rate,
BR-F) and to the type of coder employed (F scores
with respect to compressor type, CT-F). The main
findings from the ANOVA and box plots are as
follows:
1) In each of the six categories the following measures
were found to be the most sensitive: Multiresolution
Distance measure (D8) in the pixel difference group;
Image Fidelity measure (C3) in the image similarity
group; Edge Stability measure (E2) in the edge
distortion group; Weighted Spectral Distance measure
(S3) in the spectral distortions group; Matusita distance
(X4) in the context group; $L_2$ error with HVS filtering
(H3) in the human visual system group.
Table 1. Two-way ANOVA results for different bit rates (BR-F) and types of coder (CT-F).

| Measure | BR-F | CT-F | Measure | BR-F | CT-F |
|---|---|---|---|---|---|
| D1 | 18.0 | 1.22 | E3 | 2.9 | 0.07 |
| D2 | 20.1 | 1.37 | S1 | 21.5 | 0.09 |
| D3 | 18.9 | 0.33 | S2 | 180.8 | 6.41 |
| D4 | 20.9 | 0.05 | S3 | 212.9 | 5.75 |
| D5 | 19.3 | 0.19 | S4 | 22.3 | 0.27 |
| D6 | 27.4 | 0.49 | S5 | 101.2 | 1.44 |
| D7 | 10.2 | 0.24 | S6 | 101.2 | 1.40 |
| D8 | 125.5 | 47.81 | X1 | 9.4 | 0.94 |
| C1 | 10.2 | 1.82 | X2 | 7.8 | 0.25 |
| C2 | 19.5 | 0.61 | X3 | 7.6 | 0.26 |
| C3 | 25.3 | 1.79 | X4 | 10.9 | 0.47 |
| C4 | 19.1 | 0.51 | H1 | 34.5 | 11.88 |
| C5 | 8.7 | 0.01 | H2 | 33.6 | 23.17 |
| E1 | 4.1 | 0.07 | H3 | 497.9 | 187.3 |
| E2 | 9.0 | 0.23 | H4 | 331.0 | 228.9 |
2) In particular, if one ranks all the 30 measures with
respect to their F statistics, the three most reliable image
quality measures appear to be the Weighted Spectral
Distance, the HVS filtered $L_2$ measure and the
Multiresolution Distance Measure.
3.3 Relationship Between Quality Measures
We have investigated the correlation between quality
measures to explore how similarly they respond to
compression artifacts, and how they are positioned in a
self-organizing map. Self-Organizing Maps (SOM) can
be useful for the visual assessment of similarities and
correlation present in these measures.
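A simple way to quantify how similarly two measures respond is the Pearson correlation between their score vectors. The following sketch assumes that each measure is represented by its scores over the same set of coded images; the labels and values are invented for illustration and are not data from this study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long score vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

# Hypothetical scores of three measures on the same four coded images.
scores = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
}

names = sorted(scores)
for first in names:
    print(first, [round(pearson(scores[first], scores[other]), 3) for other in names])
```

Measures with a correlation magnitude close to 1 carry largely redundant information, which is the kind of similarity that the map is intended to reveal visually.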
The SOM of quality measures is obtained by processing
vector measurements on different images. More
specifically, for each measure we calculate a
10-dimensional vector (since ten images were considered).
Furthermore, there are five such vectors, one for each bit
rate considered. The main conclusions from the
observation of the SOMs and the correlations are:
1) The clustering tendency in pixel difference-based
measures and correlation measures in the lower right
corner of Figure 1, (D3, D6, D7, D4, D5 and C1, C2,
C4), is not surprising, since they are similar to each
other. The smaller the pixel differences are, the higher
the correlation between the uncompressed and the
compressed images should be.
2) Similarly, spectral magnitude measures are correlated
with pixel difference or correlation-based measures (S1,
S4, D3, D1, C2), as seen in the lower right of the map; this
is to be expected from Parseval's energy preservation theorem.
3) Spectral phase-based measures (S2, S3, S5) are located
in the upper right corner of the map and are drastically
different from the other measures.
4) The multiresolution distance measure, D8, is well
correlated with HVS-based measures (H1, H4, H2), since
the idea behind this measure is to mimic image
comparison by eye more closely, by assigning larger
weight to low resolution components and less to the
detailed high frequency components.
Figure 1. SOM map of image quality measures for the
pooled data obtained from JPEG and SPIHT
compression algorithms.
5) The second category of measures highly correlated
with HVS-based measures is the context probability-based
measures (H1, H2, H4, X2, X3, X4). The reason is that
the interband correlation and the contextual information
are both taken into account in the computation of these
measures.
6) The proximity between the Pratt measure (E1) and the
maximum difference measures (D4, D5) is meaningful,
since the maximum distortions in reconstructed images
are expected to be near the edges. The constrained
maximum distance or sorted maximum distance
measures can be used in coder designs to preserve the
two dimensional features, such as edges, in reconstructed
images.
7) The spectral phase measures are observed to stand
apart from almost any other measure. It is also known
that phase information in images is more important
under certain circumstances. We have observed that the
spectral phase measures possess high sensitivity and
discriminative power to coding artifacts. In fact, the
highest F and Q scores in Table 1 were found with the
spectral measures. Thus spectral phase measures
deserve more attention in the design of compression
algorithms.
4 CONCLUSIONS
In this work we have presented collectively the major
image quality measures in their multispectral version and
classified them into six categories. The Kohonen map of
the measures, built with their performance figures at
various compression ratios as feature vectors, has been
useful in identifying measures that behave similarly, and
conversely in identifying the ones that are sensitive to
different distortion artifacts in compressed images.
Furthermore, statistical investigation of the 30 different
measures using two-way ANOVA has revealed that
HVS-based measures (H3, H4) are the most sensitive to
coding artifacts while being less dependent on image
variety. Other measures that are close competitors to
HVS-based measures are the multiresolution distance
measure (D8) and the spectral phase-based measures
(S2, S3).
In conclusion, the multiresolution distance measure and/or
the spectral phase measures deserve more attention in
the design of compression algorithms.
5 REFERENCES
[1] A. B. Watson, Ed., Digital Images and Human Vision,
Cambridge, MA: MIT Press, 1993.
[2] M. P. Eckert, A. P. Bradley, “Perceptual quality metrics
applied to still image compression”, Signal Processing,
70, 177-200, 1998.
[3] I. Avcibas, B. Sankur, K. Sayood, “Statistical Evaluation
of Quality Measures In Image Compression”, J. of
Electronic Imaging, (in review).
[4] C. Rencher, Methods of Multivariate Analysis, New York,
Wiley, 1995.
[5] T. Kohonen, Self-Organizing Maps. Springer-Verlag,
1995.
[6] G. K. Wallace, “The JPEG Still Picture Compression
Standard”, IEEE Trans. Cons. El., 38(1), 18-34, 1992.
[7] A. Said, W. Pearlman, “A New Fast and Efficient Image
Codec Based on Set Partitioning in Hierarchical Trees”,
IEEE Trans. Circuits and Systems for Video Technology,
6(3), 243-250, 1996.
[8] K. Popat, R. Picard, “Cluster Based Probability Model
and Its Application to Image and Texture Processing”,
IEEE Trans. Image Process., 6(2), 268-284, 1997.
[9] M. Eskicioglu, P. S. Fisher, “Image Quality Measures and
Their Performance”, IEEE Trans. Commun., 43(12),
2959-2965, 1995.
[10] T. Frese, C. A. Bouman and J. P. Allebach, “Methodology
for Designing Image Similarity Metrics Based on Human
Visual System Models”, Proceedings of SPIE/IS&T
Conference on Human Vision and Electronic Imaging II,
Vol. 3016, San Jose, CA, 472-483, 1997.
Appendix A
Table A: Expressions for the distortion measures.
Pixel Difference Based Measures
$$D1 = \frac{1}{K}\frac{1}{N^2}\sum_{i,j=0}^{N-1}\left\|\mathbf{C}(i,j)-\hat{\mathbf{C}}(i,j)\right\|^2$$
$$D2 = \frac{1}{N^2}\sum_{i,j=0}^{N-1}\left[\Delta L^*(i,j)^2+\Delta a^*(i,j)^2+\Delta b^*(i,j)^2\right]$$
$$D3 = \frac{1}{K}\sum_{k=1}^{K}\left\{\frac{1}{N^2}\sum_{i,j=0}^{N-1}\left|C_k(i,j)-\hat{C}_k(i,j)\right|^{\gamma}\right\}^{1/\gamma}$$
$$D4 = L_\infty = \max_{i,j}\frac{1}{K}\sum_{k=1}^{K}\left|C_k(i,j)-\hat{C}_k(i,j)\right|$$
$$D5 = \sqrt{\frac{1}{r}\sum_{m=1}^{r}\Delta_m^2\left(\mathbf{C}-\hat{\mathbf{C}}\right)}$$
$$D6 = \frac{1}{N^2}\sum_{i,j=0}^{N-1}\left(1-\frac{2\sum_{k=1}^{K}\min\left(C_k(i,j),\hat{C}_k(i,j)\right)}{\sum_{k=1}^{K}\left(C_k(i,j)+\hat{C}_k(i,j)\right)}\right)$$
$$D7 = \frac{1}{2(N-w)^2}\sum_{i,j=0}^{N-w}\left[d^2\left(\mathbf{C}(i,j),\hat{\mathbf{C}}_w(i,j)\right)+d^2\left(\hat{\mathbf{C}}(i,j),\mathbf{C}_w(i,j)\right)\right]$$
$$D8 = \frac{1}{K}\sum_{k=1}^{K}\sum_{r=1}^{R} d_r^k, \qquad d_r = \frac{1}{2^r}\,\frac{1}{2^{2r-2}}\sum_{i,j=1}^{2^{r-1}}\left|g_{ij}-\hat{g}_{ij}\right|$$
Correlation Based Measures
$$C1 = \frac{1}{K}\sum_{k=1}^{K}\left(\sum_{i,j=0}^{N-1}C_k(i,j)^2 \Big/ \sum_{i,j=0}^{N-1}\hat{C}_k(i,j)^2\right)$$
$$C2 = \frac{1}{K}\sum_{k=1}^{K}\left(\sum_{i,j=0}^{N-1}C_k(i,j)\,\hat{C}_k(i,j) \Big/ \sum_{i,j=0}^{N-1}C_k(i,j)^2\right)$$
$$C3 = 1-\frac{1}{K}\sum_{k=1}^{K}\left(\sum_{i,j=0}^{N-1}\left[C_k(i,j)-\hat{C}_k(i,j)\right]^2 \Big/ \sigma_k^2\right)$$
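$$C4 \equiv \mu_\theta = \frac{1}{N^2}\sum_{i,j=0}^{N-1}\left|\Theta_{ij}\right|, \qquad \Theta_{ij} = \cos^{-1}\frac{\left\langle\mathbf{C}(i,j),\hat{\mathbf{C}}(i,j)\right\rangle}{\left\|\mathbf{C}(i,j)\right\|\,\left\|\hat{\mathbf{C}}(i,j)\right\|}$$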
$$C5 = \left[\frac{1}{N^2}\sum_{i,j=0}^{N-1}\left(\Theta_{ij}-\mu_\theta\right)^2\right]^{1/2}$$
Edge Based Measures
$$E1 = \frac{1}{\max\{n_d, n_t\}}\sum_{i=1}^{n_d}\frac{1}{1+a\,d_i^2}$$
$$E2 = \frac{1}{N^2}\sum_{i,j=0}^{N-1}\left(Q(i,j)-\hat{Q}(i,j)\right)^2$$
$$E3 = \frac{1}{K}\frac{1}{N^2}\sum_{k=1}^{K}\sum_{i,j=0}^{N-1}\left(S_k(i,j)-\hat{S}_k(i,j)\right)^2$$
Spectrum Based Measures
$$S1 = \frac{1}{N^2}\sum_{u,v=0}^{N-1}\left|M(u,v)-\hat{M}(u,v)\right|^2$$
$$S2 = \frac{1}{N^2}\sum_{u,v=0}^{N-1}\left|\varphi(u,v)-\hat{\varphi}(u,v)\right|^2$$
$$S3 = \frac{1}{N^2}\left(\lambda\sum_{u,v=0}^{N-1}\left|\varphi(u,v)-\hat{\varphi}(u,v)\right|^2+(1-\lambda)\sum_{u,v=0}^{N-1}\left|M(u,v)-\hat{M}(u,v)\right|^2\right)$$
$$S4 = \operatorname{median}_{l}\, J_M^l, \qquad S5 = \operatorname{median}_{l}\, J_\varphi^l, \qquad S6 = \operatorname{median}_{l}\, J^l$$
Context Based Measures
$$X1 = D\left(p \,\Vert\, \hat{p}_1\right)-D\left(p \,\Vert\, \hat{p}_2\right) = R(\hat{p}_1)-R(\hat{p}_2)$$
$$X2 = \frac{1}{2}\int\left|p-\hat{p}\right|\,d\lambda$$
$$X3 = \frac{1}{2}\int\left(\sqrt{p}-\sqrt{\hat{p}}\right)^2 d\lambda$$
$$X4 = \int\left|p^{1/r}-\hat{p}^{1/r}\right|^{r} d\lambda, \qquad r \ge 1$$
HVS Based Measures
$$H1 = \frac{1}{K}\sum_{k=1}^{K}\left(\sum_{i,j=0}^{N-1}\left|U\{C_k(i,j)\}-U\{\hat{C}_k(i,j)\}\right| \Big/ \sum_{i,j=0}^{N-1}\left|U\{C_k(i,j)\}\right|\right)$$
$$H2 = \frac{1}{K}\sum_{k=1}^{K}\left(\sum_{i,j=0}^{N-1}\left[U\{C_k(i,j)\}-U\{\hat{C}_k(i,j)\}\right]^2 \Big/ \sum_{i,j=0}^{N-1}\left[U\{C_k(i,j)\}\right]^2\right)$$
$$H3 = \frac{1}{K}\sum_{k=1}^{K}\left\{\frac{1}{N^2}\sum_{i,j=0}^{N-1}\left|U\{C_k(i,j)\}-U\{\hat{C}_k(i,j)\}\right|^2\right\}^{1/2}$$
$$H4 = \sum_{i=1}^{102}\omega_i\, d_i$$