IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 10, NO. 1, JANUARY 2008
Video Error Concealment Using Spatio-Temporal
Boundary Matching and Partial Differential Equation
Yan Chen, Student Member, IEEE, Yang Hu, Oscar C. Au, Senior Member, IEEE, Houqiang Li, and
Chang Wen Chen, Fellow, IEEE
Abstract—Error concealment techniques are very important for
video communication since compressed video sequences may be
corrupted or lost when transmitted over error-prone networks. In
this paper, we propose a novel two-stage error concealment scheme
for erroneously received video sequences. In the first stage, we
propose a novel spatio-temporal boundary matching algorithm
(STBMA) to reconstruct the lost motion vectors (MV). A well
defined cost function is introduced which exploits both spatial and
temporal smoothness properties of video signals. By minimizing the
cost function, the MV of each lost macroblock (MB) is recovered
and the corresponding reference MB in the reference frame is
obtained using this MV. In the second stage, instead of directly
copying the reference MB as the final recovered pixel values, we
use a novel partial differential equation (PDE) based algorithm to
refine the reconstruction. We minimize, in a weighted manner, the
difference between the gradient field of the reconstructed MB in
current frame and that of the reference MB in the reference frame
under given boundary condition. A weighting factor is used to
control the regulation level according to the local blockiness degree.
With this algorithm, the annoying blocking artifacts are effectively
reduced while the structures of the reference MB are well preserved.
Compared with the error concealment feature implemented in
the H.264 reference software, our algorithm is able to achieve
significantly higher PSNR as well as better visual quality.
Index Terms—Error concealment, H.264, motion compensation,
partial differential equation.
I. INTRODUCTION
WITH the explosive growth of the Internet and the wire-
less network, video services over these networks are
becoming more and more popular. However, these band-lim-
ited and error-prone channels are unreliable for transmission
of video signals, especially for compressed video transmis-
sion. Although the latest video coding standards such as
Manuscript received September 24, 2006; revised August 8, 2007. The work
of Y. Chen and O. Au was supported in part by the Innovation and Technology
Commission of the Hong Kong Special Administrative Region, China under
Project GHP/033/05. The work of Y. Hu and H. Li was supported by NSFC Gen-
eral Program under Contract 60572067, NSFC General Program under Contract
60672161, and 863 Program under Contract 2006AA01Z317. The associate ed-
itor coordinating the review of this manuscript and approving it for publication
was Dr. Wenjun (Kevin) Zeng.
Y. Chen and O. C. Au are with the Department of Electronic and Com-
puter Engineering, Hong Kong University of Science and Technology, Kowloon,
Hong Kong, China (e-mail: eecyan@ust.hk; eeau@ust.hk).
Y. Hu and H. Li are with the Department of Electronic Engineering and Infor-
mation Science, University of Science and Technology of China, Hefei 230026,
China (e-mail: yanghu@ustc.edu; lihq@ustc.edu).
C. W. Chen is with the Department of Electrical and Computer Engineering,
Florida Institute of Technology, Melbourne, FL 32901 USA (e-mail:
cchen@fit.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2007.911223
H.261/263/264 and MPEG 1/2/4 can achieve good compres-
sion performance, they also make the compressed video signals
extremely vulnerable to transmission errors. One packet loss
or even one bit error can make the whole slice undecodeable,
which would severely degrade the visual quality of the received
video sequences. A wide range of techniques have been devel-
oped to tackle this problem. Compared with other mechanisms
such as forward error correction (FEC) scheme and automatic
retransmission request (ARQ), error concealment has the ad-
vantages of neither consuming extra bandwidth as in FEC nor
introducing retransmission delay as in ARQ.
Many existing error concealment techniques have made use
of the inherent correlation among spatially and/or temporally
adjacent data to alleviate the influence of the decoding errors.
Spatial approaches exploit the correlation between neighboring
pixels in the same frame. They interpolate the lost coefficients
from the spatially adjacent data. Temporal approaches, on the
other hand, restore the missing area by exploiting temporal corre-
lation between neighboring frames. An important issue with this
approach is to recover the motion information of the lost blocks.
As a result, a large amount of research has focused on recovery
of motion vectors (MV). In [1], Haskell and Messerschmitt pre-
sented some simple methods for lost MV recovery. They took
zero MV, the MV of the collocated block in the reference frame,
and the average or the median of the MVs from the spatially
adjacent blocks as the candidate MVs for the lost blocks. The
well known Boundary Matching Algorithm (BMA) proposed
in [2] recovered the lost MV from the candidate MVs which
minimized the total variation between the internal boundary and
the external boundary of the reconstructed block. A variation of
this approach has been adopted in the H.26L (the early version
of H.264) test model and was described in detail in [3]. Some
more sophisticated approaches have also been proposed to better
estimate the lost MVs. For example, Zheng et al. [4] proposed
to recover the lost MVs by using Lagrange interpolation for-
mula while Lie and Gao proposed to find the lost MVs by jointly
optimizing the boundary distortion of the whole slice through
dynamic programming. In order to reduce the complexity, they
adopted a suboptimal alternative enhanced with an adaptive
Kalman-filtering algorithm [5], [6]. More recently, hybrid algo-
rithms have been studied that explored both spatial and temporal
correlations to obtain better recovery of the lost data. In [7], Chen
et al. proposed a priority-driven region matching algorithm to
exploit the spatial and temporal information. The lost area was
recovered region-by-region and a priority term was defined to deter-
mine the restoration order. Atzori et al. proposed a concealment
scheme [8] which first replaced the lost block using BMA, and
then applied a mesh-based warping procedure to better match
the block content with the correctly received surrounding areas.
The aforementioned MV recovery algorithms try to recover
the lost MVs from candidates by enforcing the spatial and/or
temporal smoothness property of the image/video signals. How-
ever, they fail to avoid introducing visible blocking artifacts
in the recovered area, especially under the circumstances of
sudden scene changes as well as fast and complex movement.
Moreover, since transport prioritization has been increasingly
adopted in layered coding, which would transmit the MVs and
other important data with more protection, the MVs may be cor-
rectly received even when the motion compensated residue is
lost. For example, in the data partitioned slice of the emerging
H.264 standard, the coded data is placed in three separate Data
Partitions (A, B, and C), each of which contains a subset of the
coded slice. The MVs are contained in Partition A which could
be given higher priority during transmission. In this case, in-
stead of the recovery of lost MVs, the critical problem becomes
the recovery of the lost motion compensated residue or the re-
duction of the annoying blocking artifacts.
As far as blocking artifacts, i.e., visible discontinuities at
block boundaries, are concerned, one may readily turn to the
post-processing techniques that have been developed to remove
blocking effect due to low bit rate video encoding. This type of
artifacts is visually quite similar to the blocking effects caused
by imperfect lost data reconstruction. Several approaches have
been proposed to alleviate such artifacts, most of which are based
on low pass filtering, AC prediction, projection onto convex
sets (POCS) [9] or, more recently, diffusion. As the Gaussian (or
low pass) filter failed to preserve lines and edges, Perona and
Malik [10] proposed to use anisotropic diffusion as an alternative
scheme. The anisotropic diffusion scheme was implemented
via a partial differential equation (PDE) and can successfully
preserve the structural information. Yang and Hu [11] applied
a biased anisotropic diffusion scheme to remove the blocking
effect. Although they claimed to unify artifact removal and
lost block concealment in one framework and would process
them at the same time, their concealment method was exactly
the same as the maximally smooth recovery method proposed
in [12], which would give a blurred recovered block, as has been
pointed out in [13]. More recently, Gothandaraman et al. [14]
proposed to use the method of total variation as an alternative
to biased anisotropic diffusion, and later Alter et al. [15] presented
a deblocking algorithm with weighted total variation. In
all these schemes, the deblocking problem has been formulated
as an energy minimization problem in which the gradient of
the recovered block, either in the weighted L2-norm (as in
anisotropic diffusion) or in the weighted L1-norm (as in total
variation), would be minimized. Due to the minimization of the
gradient of the recovered block, these methods would produce an
unsatisfactory, blurred interpolation. In a recent work that dealt
with the image editing tasks, Perez et al. proposed a guided in-
terpolation mechanism [16]. Instead of minimizing the gradient
of the unknown function, they introduced a guidance field and
minimized the difference between the gradient of the unknown
function and the guidance field. This mechanism successfully
overcame the blurring problem while ensuring the compliance
of the filled-in image with the surrounding background.
In this paper, we propose a novel two-stage error conceal-
ment scheme for video signals which are compressed in slice
mode with some slices lost during transmission on error-prone
channels.1 In the first stage, we propose a novel MV recovery
algorithm, spatio-temporal boundary matching algorithm
(STBMA), to recover the lost MV for each macroblock (MB)
in the lost slices. It works by minimizing a distortion function
which exploits both spatial and temporal smoothness properties
of the video signals. With the recovered MV, we could find the
reference MB in the reference frame for each lost MB. Inspired
by the work in [16], in the second stage, instead of replacing
the lost MB with the corresponding reference MB as most
previous error concealment schemes have done, we propose a
novel PDE-based algorithm to refine the reconstruction. The
proposed PDE-based algorithm could effectively reduce the
blocking artifacts, and meanwhile well preserve the structure
of the reference MB. It works by minimizing, in a weighted
manner, the difference between the gradient field of the recon-
structed MB in current frame and that of the reference MB
in the reference frame under given boundary condition. The
weighting factor produces an anisotropic regulation scheme
which determines the level of regulation according to the de-
gree of local blockiness. Both spatial and temporal correlations
are well-exploited in the proposed scheme. The experimental
results show that the proposed two-stage error concealment
scheme is able to achieve not only higher PSNR but also better
visual quality when compared with the error concealment
feature implemented in the H.264 reference software.
The rest of this paper is organized as follows. We describe the
proposed algorithm in detail in Section II. Then, we present the
experimental results in Section III to verify the performance of
the proposed scheme. We conclude this paper in Section IV with
a summary of our algorithm.
II. PROPOSED ERROR CONCEALMENT SCHEME
In this section, we describe the proposed spatio-temporal
error concealment scheme in detail. We first introduce the
spatio-temporal boundary matching algorithm (STBMA) for
MV recovery. Then we present the PDE-based algorithm for
lost block reconstruction.
Due to the correlation among adjacent video signals in both
spatial and temporal domains, a reasonable criterion for choosing a
good candidate MV is to examine whether the MV can preserve
spatial and temporal continuities of the signals. Motivated by
this intuition, we introduce a novel boundary matching distor-
tion function, in which both spatial and temporal smoothness
properties are well exploited. The MV of each lost MB is recov-
ered through minimizing the distortion function. The corre-
sponding MB indicated by the recovered MV in the reference
frame is used as the reference MB for the lost MB.
Most previous works recover the lost MB by simply copying
the corresponding reference MB from the reference frame.
However, in this case, the boundaries of the reconstructed
MB are usually not compatible with the spatially surrounding
pixels. Therefore, instead of directly copying the reference
MB, we first compute the gradient field of the reference MB
in the reference frame and then refine the reconstruction by
minimizing the difference between the gradient field of the
reconstructed MB and that of the reference MB.
1This paper is an extension of our previous work [17] and [18].
Like other existing schemes, we also assume that the erroneous
MBs have been detected and a MB-based status map of a frame is
available to specify the position of the lost MBs. According to
the status map, all correctly received MBs are decoded first and
then the lost MBs are concealed using the proposed algorithm.
In the following, we only consider scalar image functions, since
the concealment problem can be solved in the same way for each
color component separately, i.e., the Y, U, and V components of video
signals.
A. Motion Vector Recovery and Motion Compensation
1) Motion Compensation Using Correct MVs: In H.264, if
the data partitioned slice is adopted and the partition with im-
portant data is transmitted with higher priority, the motion vec-
tors may be correctly decoded although the motion compensated
residues are lost. In this case, the reference MB can be easily
located using the correctly decoded MVs according to the fol-
lowing equation:
$$ \hat{f}_t(x, y) = f_{t-1}(x + mv_x,\, y + mv_y) \qquad (1) $$
where $\hat{f}_t(x, y)$ stands for the reference value for the pixel at
location $(x, y)$ in frame $t$, while $f_{t-1}$ is the reference
frame and $(mv_x, mv_y)$ is the correctly decoded motion vector.
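As an illustration of (1), motion compensation with a correctly decoded MV reduces to a block fetch from the reference frame. Below is a minimal NumPy sketch; the function and variable names are ours, and it assumes an integer-pel MV and a single color component:

```python
import numpy as np

def motion_compensate(ref_frame, x0, y0, mv, M=16):
    """Copy the M x M reference MB pointed to by a decoded motion vector.

    ref_frame : 2-D array holding the reference frame f_{t-1}.
    (x0, y0)  : top-left corner of the lost MB in the current frame.
    mv        : (mv_x, mv_y) integer-pel motion vector.
    """
    mvx, mvy = mv
    # Rows index y, columns index x; fetch the displaced M x M block.
    return ref_frame[y0 + mvy : y0 + mvy + M, x0 + mvx : x0 + mvx + M].copy()
```

The copied block serves as the starting point that the second-stage PDE refinement later adjusts.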
2) Spatio-Temporal Boundary Matching Algorithm (STBMA):
If the MVs are lost together with the motion compensated residues,
MV recovery algorithms should be employed. In the H.264 reference
software, the classic boundary matching algorithm (BMA) is uti-
lized to recover the lost MV from the candidate MV set; the winning MV min-
imizes the side match distortion $D_{side}$ between the internal and ex-
ternal boundaries of the reconstructed MB [3]. Here, as shown in
Fig. 1, internal boundaries stand for the boundary pixels of the
MB while external boundaries stand for the surrounding pixels
in the corresponding spatially neighboring MBs. $D_{side}$ is defined
as the sum of absolute differences between the internal bound-
aries of the candidate block in the reference frame and the ex-
ternal boundaries of the lost block in current frame:

$$
D_{side} = a_N \sum_{i=0}^{M-1} \big| f_{t-1}(x_0+i+mv_x,\, y_0+mv_y) - f_t(x_0+i,\, y_0-1) \big|
 + a_S \sum_{i=0}^{M-1} \big| f_{t-1}(x_0+i+mv_x,\, y_0+M-1+mv_y) - f_t(x_0+i,\, y_0+M) \big|
 + a_W \sum_{j=0}^{M-1} \big| f_{t-1}(x_0+mv_x,\, y_0+j+mv_y) - f_t(x_0-1,\, y_0+j) \big|
 + a_E \sum_{j=0}^{M-1} \big| f_{t-1}(x_0+M-1+mv_x,\, y_0+j+mv_y) - f_t(x_0+M,\, y_0+j) \big| \qquad (2)
$$

Fig. 1. Illustration of the boundary matching relationship.

where $f_t$ stands for current frame and $f_{t-1}$ is the
corresponding reference frame, the subscripts $N$, $S$, $W$, and $E$ are
short for North, South, West, and East, respectively, as shown
in Fig. 1, $M$ is the size of the MB (e.g., $M = 16$ in H.264), $(x_0, y_0)$ is
the location of the top-left pixel in current lost block, and $(mv_x, mv_y)$
is the candidate MV, which could be the zero MV or an MV of the
neighboring adjacent blocks. $a_N = 1$ if the north neighboring
MB in current frame is available, otherwise $a_N = 0$; $a_S$, $a_W$, and
$a_E$ are defined analogously. The winning reconstructed MV
is the one which minimizes $D_{side}$.
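The BMA side-match search can be sketched as follows. This is an illustrative NumPy version with names of our choosing; a side is simply skipped when its neighbor falls outside the frame, standing in for the availability flags:

```python
import numpy as np

def bma_distortion(cur, ref, x0, y0, mv, M=16):
    """Side-match distortion: sum of |internal boundary of the candidate
    block in the reference frame - external boundary of the lost MB in
    the current frame| over the available sides (integer-pel only)."""
    mvx, mvy = mv
    cand = ref[y0 + mvy : y0 + mvy + M, x0 + mvx : x0 + mvx + M]
    d = 0.0
    if y0 - 1 >= 0:                        # north side available
        d += np.abs(cand[0, :] - cur[y0 - 1, x0 : x0 + M]).sum()
    if y0 + M < cur.shape[0]:              # south
        d += np.abs(cand[-1, :] - cur[y0 + M, x0 : x0 + M]).sum()
    if x0 - 1 >= 0:                        # west
        d += np.abs(cand[:, 0] - cur[y0 : y0 + M, x0 - 1]).sum()
    if x0 + M < cur.shape[1]:              # east
        d += np.abs(cand[:, -1] - cur[y0 : y0 + M, x0 + M]).sum()
    return d

def recover_mv(cur, ref, x0, y0, candidates, M=16):
    """Pick the candidate MV that minimizes the side-match distortion."""
    return min(candidates, key=lambda mv: bma_distortion(cur, ref, x0, y0, mv, M))
```

The STBMA of this paper keeps the same candidate search but replaces the cost above with the weighted spatio-temporal distortion introduced next.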
From (2), we can see that BMA utilizes the smoothness prop-
erty between adjacent pixels to recover the lost MV. However,
since only the spatial smoothness property is considered, it may
not be able to select the best one from the candidate MVs. In
this paper, we present a more general side match distortion func-
tion $D$ which considers both spatial and temporal smoothness
properties of the video signals. $D$ is defined as a weighted
average of two terms: temporal side match distortion $D_t$
and spatial side match distortion $D_s$:

$$ D = \alpha \times D_t + (1 - \alpha) \times D_s \qquad (3) $$

where the weighting factor $\alpha$ is a real number between 0 and 1;
$D_t$ and $D_s$ are defined as follows.
The temporal term $D_t$ is utilized to measure how well
the candidate MV can keep temporal continuity. We observe
that the neighbors of current MB are similar to the neighbors of
the reference MB in the reference frame. Therefore, we define
$D_t$ as the average difference between the external bound-
aries of the candidate reference block in the reference frame and
those of the lost block in current frame:

$$ D_t = \frac{1}{N_{out}} \sum_{(x,y) \in B_{out}} \big| f_{t-1}(x + mv_x,\, y + mv_y) - f_t(x, y) \big| \qquad (4) $$

where $B_{out}$ is the set of available external boundary pixels of the
lost MB and $N_{out}$ is its size. With this definition, a good candidate
MV should give a small $D_t$.
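The temporal term can be sketched as follows. This is a hedged sketch with illustrative names, assuming an integer-pel MV and a one-pixel external ring whose sides are clipped at the frame border:

```python
import numpy as np

def temporal_distortion(cur, ref, x0, y0, mv, M=16):
    """D_t sketch: mean absolute difference between the external
    boundary ring of the candidate reference block and that of the
    lost MB, over whichever of the four sides exist in the frame."""
    mvx, mvy = mv
    diffs = []
    if y0 - 1 >= 0:        # north ring row
        diffs.append(np.abs(ref[y0 - 1 + mvy, x0 + mvx : x0 + mvx + M]
                            - cur[y0 - 1, x0 : x0 + M]))
    if y0 + M < cur.shape[0]:   # south ring row
        diffs.append(np.abs(ref[y0 + M + mvy, x0 + mvx : x0 + mvx + M]
                            - cur[y0 + M, x0 : x0 + M]))
    if x0 - 1 >= 0:        # west ring column
        diffs.append(np.abs(ref[y0 + mvy : y0 + mvy + M, x0 - 1 + mvx]
                            - cur[y0 : y0 + M, x0 - 1]))
    if x0 + M < cur.shape[1]:   # east ring column
        diffs.append(np.abs(ref[y0 + mvy : y0 + mvy + M, x0 + M + mvx]
                            - cur[y0 : y0 + M, x0 + M]))
    return float(np.concatenate(diffs).mean())
```

A perfect temporal match (identical frames, zero MV) yields zero distortion, matching the intent of (4).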
According to the spatial smoothness property of video sig-
nals, the structures in the lost MBs should be compatible with those
of the available spatially neighboring MBs. Therefore, recov-
ering the lost MB using a good candidate MV in some sense
means introducing few structural mismatches at the boundaries.
Here, $D_s$ is utilized to choose such a good MV from the can-
didate MVs. We define $D_s$ as the average change of the
Laplacian estimator along the tangent direction, which measures
the continuity of the isophotes at the boundaries, as shown in (5).
With such a definition, a good candidate MV should give a small
$D_s$. A similar term is utilized to generate the updating in-
formation for iterative diffusion in the task of image inpainting
[19]. Here, we use it as part of the cost function to select the best
MV from the candidate MV set:

$$ D_s = \frac{1}{N_{in}} \sum_{(x,y) \in B_{in}} \left| \frac{\nabla(\triangle f)}{|\nabla(\triangle f)|} \cdot \frac{\nabla^{\perp} f}{|\nabla^{\perp} f|} \right| \cdot |\nabla f| \qquad (5) $$

where the symbols $M$, $(x_0, y_0)$, and $(mv_x, mv_y)$
have the same meanings as those defined in
(1) (see Fig. 1), $B_{in}$ is the set of internal boundary pixels of the
lost MB and $N_{in}$ is its size, $\nabla$ is the gradient
operator, $\nabla^{\perp}$ is the normal oper-
ator whose direction is orthogonal to the gradient direction, and
$\triangle$ is the Laplacian operator.
In a typical (and our) implementation, these relevant operators
can be calculated as follows:

$$ \nabla f(x,y) = \tfrac{1}{2} \big( f(x+1,y) - f(x-1,y),\; f(x,y+1) - f(x,y-1) \big) $$
$$ \nabla^{\perp} f(x,y) = \tfrac{1}{2} \big( -(f(x,y+1) - f(x,y-1)),\; f(x+1,y) - f(x-1,y) \big) $$
$$ \triangle f(x,y) = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y) \qquad (6) $$
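One plausible discretization of these operators, with central differences for the gradient and the standard 4-neighbor Laplacian assumed (names are ours):

```python
import numpy as np

def operators(f):
    """Gradient, its 90-degree rotation (tangent to isophotes), and the
    4-neighbor Laplacian of a 2-D image, via central differences.
    Returns (grad, perp, lap); grad and perp have shape (2, H, W)."""
    fy, fx = np.gradient(f)            # np.gradient: axis 0 (rows) first
    grad = np.stack((fx, fy))          # (d/dx, d/dy)
    perp = np.stack((-fy, fx))         # rotated 90 degrees from grad
    lap = (np.roll(f, 1, 0) + np.roll(f, -1, 0)
           + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)
    return grad, perp, lap
```

Normalizing `grad(lap)` and `perp` per pixel and averaging their absolute inner product, weighted by the gradient magnitude, over the internal boundary pixels would give the $D_s$ of (5); the wrap-around at image borders introduced by `np.roll` would be masked out in a full implementation.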
Since current block is totally lost, when computing $D_s$,
we first use the candidate reference block to replace current lost
block. In (5), $\nabla(\triangle f)/|\nabla(\triangle f)|$ stands for the normalized
gradient of the Laplacian estimator, and $\nabla^{\perp} f/|\nabla^{\perp} f|$
is the normalized vector along the tangent direction. If the
structures across the boundaries are perfectly matched, the two
terms should be orthogonal to each other and the inner product
should be zero. However, if there are some mismatches, the
absolute value of the inner product of the two terms tends
to be large, which would make $D_s$ large. Besides, we
multiply the inner product by the gradient magnitude $|\nabla f|$ for
every pixel in (5). There are two reasons for doing this: firstly,
through multiplying by $|\nabla f|$, the range of $D_s$ (notice that
$D_s \le 1$ without multiplying by $|\nabla f|$) would tend to
be compatible with that of $D_t$. Secondly, for the pixels at
the internal boundaries as shown in Fig. 1, $|\nabla f|$ stands for
the brightness change across the boundaries, which reflects the
blockiness degree to some extent. According to our observation,
severe blockiness tends to have large $|\nabla f|$ while slight
blockiness usually has small $|\nabla f|$. Therefore, if $|\nabla f|$ is
small, even if the absolute value of the inner product is large, it
is still possible that the reference block is a good candidate. On
the contrary, if $|\nabla f|$ is large, even if the absolute value of the
inner product is small, there is still a chance that the reference
block is a bad candidate. So, it is reasonable and necessary to
consider the term $|\nabla f|$.
In Figs. 2 and 3, we show two examples to demonstrate the
characteristics of $D_s$: one is a synthetic image (Fig. 2) and
the other is a sub-image cut from the foreman sequence (Fig. 3).
Due to space limitation, it is difficult to illustrate the whole
MB, and we are only interested in $D_s$ on the boundary of
the MB. In Figs. 2 and 3, we use a small part of the lost MB and
its neighbor MB to explain the effect of $D_s$. We assume
that the upper 4 × 8 pixels are from the correctly received MB
and the bottom 4 × 8 pixels are from the lost MB. For each
example, there are three candidate reconstructions, as shown
in (a-1), (b-1), and (c-1). Obviously (a-1) is the best choice
with perfect structure matching while (b-1) and (c-1) have
different extents of mismatching at the boundary. We will now
show that $D_s$ can automatically select (a-1) as the best
Fig. 2. Example of $D_s$. (Synthetic image with the upper 4 × 8 pixels from the correctly received MB and the bottom 4 × 8 pixels from the lost MB; the thick black solid line is the boundary): (a-1)(b-1)(c-1) three candidate reconstructions with perfect structure matching, close structure mismatching, and far structure mismatching, respectively; (a-2)(b-2)(c-2) the vector field $(\nabla^{\perp} f)/(|\nabla^{\perp} f|) \times |\nabla f|$ of the three candidates; (a-3)(b-3)(c-3) the vector field $(\nabla(\triangle f))/(|\nabla(\triangle f)|)$ of the three candidates; (a-4)(b-4)(c-4) displaying the vector fields in the second and third rows together ($D_s$ of (a-1)(b-1)(c-1) are 0, 28.466, and 20.436, respectively).
candidate for both examples. In Figs. 2 and 3, (a-2)(b-2)(c-2)
illustrate the vector fields $(\nabla^{\perp} f)/(|\nabla^{\perp} f|) \times |\nabla f|$ of the
corresponding candidates, and (a-3)(b-3)(c-3) represent the vector
fields $(\nabla(\triangle f))/(|\nabla(\triangle f)|)$. We put the two vector fields together
in (a-4)(b-4)(c-4) to see their inner product. As shown in
Fig. 3. Example of $D_s$. (True image with the upper 4 × 8 pixels from the correctly received MB and the bottom 4 × 8 pixels from the lost MB.): (a-1)(b-1)(c-1) three candidate reconstructions with perfect structure matching, close structure mismatching, and far structure mismatching, respectively; (a-2)(b-2)(c-2) the vector field $(\nabla^{\perp} f)/(|\nabla^{\perp} f|) \times |\nabla f|$ of the three candidates; (a-3)(b-3)(c-3) the vector field $(\nabla(\triangle f))/(|\nabla(\triangle f)|)$ of the three candidates; (a-4)(b-4)(c-4) displaying the vector fields in the second and third rows together ($D_s$ of (a-1)(b-1)(c-1) are 5.5166, 12.51, and 20.283, respectively).
(5), we only care about the inner product for the pixels at the
internal boundary of the lost MB. Therefore, we only focus on
the vectors in the black rectangles in (a-4), (b-4), and (c-4).
For the first candidate of the synthetic image, as shown in
Fig. 2(a-4), since $\nabla(\triangle f)/|\nabla(\triangle f)|$ is either orthogonal
to $\nabla^{\perp} f/|\nabla^{\perp} f|$ or equals zero, the inner product is
zero, which leads to a zero $D_s$. However, for the second
and third candidates of the synthetic image, as shown in
Fig. 2(b-4) and (c-4), $\nabla(\triangle f)/|\nabla(\triangle f)|$ is not orthog-
onal to $\nabla^{\perp} f/|\nabla^{\perp} f|$ at some points on the boundary,
which results in a nonzero inner product. And their $D_s$
are 28.466 and 20.436, respectively. As we mentioned above, a
better candidate should produce a smaller $D_s$. Therefore, for
the synthetic image, the first candidate would be selected as ex-
pected. Similar to the synthetic image, for the three candidates
of the true image shown in Fig. 3(a-1), (b-1), and (c-1), we can
see that the inner product of the first candidate is smaller than
those of the other two candidates ($D_s$ for these three candidates are
5.5166, 12.51, and 20.283, respectively). Therefore, Fig. 3(a-1)
would be selected as the best choice for the true image.
The winning MV is the candidate MV, which could be the zero
MV or an MV of the neighboring adjacent blocks, that mini-
mizes $D$. The desired reference MB in the reference frame
is obtained using this MV.
B. Refining Reconstruction Using Partial Differential
Equation (PDE)
After finding the reference MB in the reference frame, a
straightforward method to reconstruct the lost MB is to directly
copy the pixels from the corresponding reference MB. How-
ever, the reference MB produced by the winning MV is optimal
only in that its cost, which is a measure of the smoothness,
is smaller than those produced by other candidate MVs. It
does not inherently ensure the perfect matching between the
recovered MB and the surrounding boundaries. Therefore,
visible blocking artifacts may still exist in the restored images.
The discontinuity comes partly from the absence of the motion
compensated residue, also called displaced frame difference
(DFD). In [2], Lam et al. proposed to use the DFD of the
adjacent blocks as substitution for the missing DFD. However,
as the correlation of the DFD among adjacent blocks is not as
high as that of the MV, this method is not quite effective.
Considering the difficulty in recovering the DFD, an alter-
native way is to directly improve the match of the copied refer-
ence MB and the surrounding pixels. In order to achieve this objec-
tive, we abandon the traditional way of using the corresponding
pixel values of the reference MB as the final reconstructed pixel
values for concealment. Instead, we formulate the problem of
recovering the lost MB as an optimization problem.
Before starting the problem formulation, we first define some
notations. As illustrated in Fig. 4, let $S$, a closed subset of $\mathbb{R}^2$,
be the definition domain of current frame. Let $\Omega$ be a closed
subset of $S$, which represents the lost MB, and let $\partial\Omega$ be the
external boundary of $\Omega$ consisting of the correctly received sur-
rounding pixels of the lost MB. Let $f$ be an unknown scalar
function defined over $\Omega$ and $\partial\Omega$. Let $f^*$ be a known scalar func-
tion defined over $S$ minus $\Omega$. It is the set of correctly decoded
pixel values. With this definition, we assume that there is only
Fig. 4. Illustration of notations.
one lost MB in current frame. This assumption can be relaxed
since only a subset of $f^*$, which is defined over $\partial\Omega$, will be used
in later computation. Another assumption we make here is that
the surrounding external boundary of the lost MB is known as
$f^*$. This assumption is reasonable considering that the coded
MBs could be packetized in an interleaved manner. Even if this
condition is not met, i.e., one or more adjacent MBs of current
damaged MB have been lost, the proposed PDE-based algorithm
can still be applied successfully according to our discussion in
Section II-D. Let $\mathbf{g}$ be the gradient vector field of the reference
MB in the reference frame, which is found using the winning
MV obtained in Section II-A.
If the reference frame is correctly received, the boundaries
of the reference MB would be compatible with its surrounding
pixels in the reference frame, i.e., the pixel values change
in a natural manner across the block boundaries. Even if
the reference frame is erroneously received, due to low-pass
filtering (deblocking) and post-processing (STBMA+PDE for the
reference frame), it is reasonable for us to assume that the
boundaries of the reference MB would be more compatible
with its surrounding pixels in the reference frame than in
current frame. Therefore, we would like to push $\nabla f$ towards $\mathbf{g}$
where blocking artifacts are severe in the reconstructed MB.
So, the problem of recovering the lost MB can be formulated as
finding an optimal solution $f$ which minimizes the following
objective function:

$$ \min_{f} \iint_{\Omega} c(x, y)\, |\nabla f - \mathbf{g}|^2 \, dx\, dy, \quad \text{with } f|_{\partial\Omega} = f^*|_{\partial\Omega} \qquad (7) $$
According to (7), the recovered $f$ should be the function whose
gradient, in the $L_2$-norm and in a weighted manner, is closest to
the gradient vector field $\mathbf{g}$ under the given boundary condition. If
the coefficient $c(x, y)$ is set to be constant (e.g., $c(x, y) \equiv 1$), it
would reduce to the isotropic guided interpolation scheme pro-
posed in [16], which minimizes $\iint_{\Omega} |\nabla f - \mathbf{g}|^2$. But this
isotropic method might cause problems (e.g., the bleeding ar-
tifact) while reconstructing the lost MB, as shown in our ex-
periment (the red ellipse region in Fig. 12(c)). Through intro-
ducing this spatially varying coefficient, we could better control
the interpolation process according to the degree of local block-
iness. Equation (7) is also a generalization of the anisotropic diffusion
method [11], [14] considering that the anisotropic diffu-
sion method is a special case in which the vector field $\mathbf{g}$ is set
to be zero, i.e., it minimizes $\iint_{\Omega} c(x, y)\, |\nabla f|^2$. The zero vector
field would produce a blurred interpolation. On the contrary, a
well defined nonzero vector field could better preserve the struc-
tural information while alleviating the annoying blocking effects.
With (7), instead of copying the pixel values of the reference MB,
we only try to preserve its content structure, which is depicted by
the gradient. Besides, we utilize the continuity at the boundary of
the reference MB in the reference frame to improve the consistency
at the boundary of the reconstructed block.
The solution that minimizes (7) must satisfy the Euler-La-
grange equation,2 according to which we have

$$ \nabla \cdot \big[ c(x, y) (\nabla f - \mathbf{g}) \big] = 0 \ \text{ over } \Omega, \quad \text{with } f|_{\partial\Omega} = f^*|_{\partial\Omega} \qquad (8) $$

The gradient descent method can be used to solve (8). The so-
lution is the steady state solution of the following equation:

$$ f^{(n+1)} = f^{(n)} + \lambda\, \nabla \cdot \big[ c(x, y) \big( \nabla f^{(n)} - \mathbf{g} \big) \big] \qquad (9) $$

where $n$ is the iteration index.
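The descent of (9) can be sketched numerically. The following illustrative NumPy version (names are ours) uses forward differences for the gradient and the adjoint backward differences for the divergence, and checks that a single step decreases the discrete energy of (7):

```python
import numpy as np

def energy(f, gx, gy, c):
    """Discrete stand-in for the objective in (7): sum of
    c * |grad f - g|^2, with forward differences (illustrative)."""
    fx = np.roll(f, -1, 1) - f
    fy = np.roll(f, -1, 0) - f
    return float((c * ((fx - gx) ** 2 + (fy - gy) ** 2)).sum())

def descent_step(f, gx, gy, c, mask, lam=0.2):
    """One update of (9): f += lam * div(c * (grad f - g)), applied
    only inside the lost MB (mask True) so the boundary stays fixed."""
    fx = np.roll(f, -1, 1) - f
    fy = np.roll(f, -1, 0) - f
    px, py = c * (fx - gx), c * (fy - gy)
    # Backward differences give the adjoint divergence.
    div = (px - np.roll(px, 1, 1)) + (py - np.roll(py, 1, 0))
    out = f.copy()
    out[mask] += lam * div[mask]
    return out
```

Iterating `descent_step` to a steady state yields the refined MB; with a small step size the energy decreases monotonically toward the minimizer of (7).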
The spatially varying coefficient $c(x, y)$ plays an important
role in the interpolation process. When $c(x, y)$ is large, it tries to
push $\nabla f$ towards the vector field $\mathbf{g}$, while a small $c(x, y)$ allows
$\nabla f$ to deviate from $\mathbf{g}$. According to the previous analysis, we
would like to push $\nabla f$ towards $\mathbf{g}$ at the locations where the
blocking artifacts are severe. Therefore, we try to give $c(x, y)$ a
large value where the degree of the blocking artifacts is obvious
and make $c(x, y)$ small where there is little blockiness.
According to our observation, the absolute difference between
$\nabla f$ and $\mathbf{g}$, i.e., $s = |\nabla f - \mathbf{g}|$, can somewhat reflect the degree of
local blockiness. If $s$ is small, there tends to be little
blockiness at the boundaries. In this case, we should give $c(x, y)$
a small value such that only a small amount of regulation is per-
formed. When $s$ becomes larger, we find that there
might be some kind of discontinuity. Therefore, we should give
$c(x, y)$ a larger value to perform more regulation to reduce the
blocking artifacts. However, when $s$ is even larger, we
find that rather than the blocking artifacts, the discontinuities are
more likely to come from the inherent changes across the edges
of the image, in which case we should make $c(x, y)$ small to
prevent regulation and preserve the original reconstructed value
(e.g., the red ellipse region in Fig. 12(c)).
Following the above descriptions, the spatially varying coef-
ficient can be chosen as $c(x, y) = h(s)$, and
the function $h(s)$ should satisfy the following characteristics:
$h(s)$ should be kept small when $s$ is small; then $h(s)$ rises
with the increase of $s$; however, when $s$ is larger than a threshold $T$,
$h(s)$ should begin to decrease, and the larger the distance from $s$ to $T$, the
smaller $h(s)$ should be; when $s$ greatly exceeds $T$, $h(s)$ should
be set to be extremely small.
There are many possible choices for h(s); in this paper, h(s)
is chosen as follows:

(10)
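The exact expression of (10) is not legible here; purely as an illustration, a function with the stated rise-then-decay shape can be written as below. The functional form, the parameter names `sigma` (the peak location) and `h_max` (the peak value), and the default values are assumptions of this sketch, not the paper's choice:

```python
import numpy as np

def h(s, sigma=4.0, h_max=0.5):
    """Illustrative weighting function with the shape described in the text:
    near zero for small s, rising to a peak of h_max at s = sigma, then
    decaying toward zero as s grows far beyond sigma."""
    s = np.asarray(s, dtype=float)
    u = (s / sigma) ** 2
    return h_max * u * np.exp(1.0 - u)  # peak value h_max attained at s == sigma
```

Any function with this qualitative profile (small at 0, unimodal peak, rapid decay for large s) serves the same role of regulating blocky regions while sparing true image edges.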
²If J is defined by J = ∫∫ F(x, y, f, f_x, f_y) dx dy, then J has a sta-
tionary value if the Euler-Lagrange differential equation,
(∂F/∂f) − (∂/∂x)(∂F/∂f_x) − (∂/∂y)(∂F/∂f_y) = 0, is satisfied. In
our problem, with the cost function in (7), F = c(x, y)|∇f − ∇g|².
Therefore, we have
−(∂/∂x)[c(x, y)((∂f/∂x) − (∂g/∂x))] − (∂/∂y)[c(x, y)((∂f/∂y) − (∂g/∂y))] = 0,
which is equivalent to ∇ · [c(x, y)(∇f − ∇g)] = 0.
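The variational computation in the footnote can be cross-checked symbolically. The following SymPy sketch (the names and structure are mine, not from the paper) verifies that the Euler-Lagrange condition for the energy density F = c(x, y)|∇f − ∇g|² reduces to ∇ · [c(x, y)(∇f − ∇g)] = 0:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)   # reconstructed MB
g = sp.Function('g')(x, y)   # reference MB (guide)
c = sp.Function('c')(x, y)   # spatially varying weight

fx, fy = f.diff(x), f.diff(y)
gx, gy = g.diff(x), g.diff(y)

# Energy density of the cost function (7): F = c |grad f - grad g|^2
F = c * ((fx - gx)**2 + (fy - gy)**2)

# Euler-Lagrange expression; the dF/df term is zero since F has no bare f.
el = -sp.diff(F.diff(fx), x) - sp.diff(F.diff(fy), y)

# Divergence form: el should equal -2 div(c (grad f - grad g)).
div_form = sp.diff(c * (fx - gx), x) + sp.diff(c * (fy - gy), y)
assert sp.simplify(el + 2 * div_form) == 0
```

The factor of 2 is immaterial for the stationarity condition, so the two forms describe the same PDE.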
Fig. 5. Weighting factor function h(s).
The characteristic of the chosen function is illustrated in
Fig. 5. We can see that it is consistent with what we have stated.
Given the vector field ∇g, the reconstructed function f
interpolates the specified boundary condition inwards, while
following, in a weighted manner, the spatial variation of g as
closely as possible. In the following two subsections, we will
first introduce the numerical implementation of (9). Then we
will discuss how to handle the special case in which some of the
boundaries of the lost MB are not available.
C. Numerical Scheme
We implement the discrete version of (9) using a simple
scheme whose form is quite similar to the algorithm
described by Perona and Malik [10]. It is discretized on the
discrete pixel grid of the digital image:

f^{n+1}_{i,j} = f^n_{i,j} + λ[c_N · (∇_N f − ∇_N g) + c_S · (∇_S f − ∇_S g)
                + c_W · (∇_W f − ∇_W g) + c_E · (∇_E f − ∇_E g)]^n_{i,j},    (11)

where 0 ≤ λ ≤ 1/4 in order to ensure the stability of the nu-
merical scheme, as pointed out in [10], the subscripts N, S, W, E
are short for North, South, West, and East, respectively, and n is
the iteration index. The superscript and subscripts on the square
bracket are applied to all the terms enclosed in it. The symbols
∇_N, ∇_S, ∇_W, and ∇_E, which indicate the nearest-neighbor dif-
ferences, are defined as follows:

∇_N f_{i,j} ≜ f_{i−1,j} − f_{i,j},   ∇_S f_{i,j} ≜ f_{i+1,j} − f_{i,j},
∇_W f_{i,j} ≜ f_{i,j−1} − f_{i,j},   ∇_E f_{i,j} ≜ f_{i,j+1} − f_{i,j}.    (12)

The corresponding terms associated with g are defined in the
same way.
The initial condition of (11) is f^0 = g. The nearest-neighbor
differences at the boundaries of g are the pixel value variations
between the internal boundaries and external boundaries of g in
the reference frame, while the nearest-neighbor differences at
the boundaries of f are the pixel value changes between the in-
ternal boundaries of f, which are changing with iteration, and the
surrounding correctly decoded pixel values. Therefore, at time
n = 0, the only difference between f and g are the values at
the boundaries of the MBs, which could trigger the iteration.

TABLE I
AVERAGE PSNR PERFORMANCE OF THE RECONSTRUCTED SEQUENCES USING DIFFERENT METHODS
Since the numerical scheme is implemented in an iterative
form, the weighting factors should be updated at every iteration
as follows:

c^n_{P, i,j} = h(|∇_P g_{i,j} − ∇_P f^n_{i,j}|),   P ∈ {N, S, W, E}.    (13)

The stopping criteria of the iteration process of (11) are de-
fined as

max_{i,j} |f^{n+1}_{i,j} − f^n_{i,j}| < ε   or   n ≥ T,    (14)

where ε and T are two pre-defined thresholds.
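One iteration of the scheme above can be sketched in NumPy as follows. The function names, the NaN-ring encoding of the boundary condition, the particular h(s), and the default thresholds are all illustrative assumptions of this sketch, not the paper's implementation:

```python
import numpy as np

def h(s, sigma=4.0, h_max=0.5):
    # Illustrative weighting function (the paper's exact choice is assumed):
    # small for small s, peaking at s = sigma, decaying for s >> sigma.
    u = (s / sigma) ** 2
    return h_max * u * np.exp(1.0 - u)

def pde_refine(g, border, lam=0.25, eps=0.01, max_iter=100):
    """Sketch of the guided anisotropic diffusion of the numerical scheme.

    g      : reference MB together with a 1-pixel surrounding ring.
    border : same shape; NaN inside, correctly decoded pixels on the ring.
    Returns the refined reconstruction f (interior values are the result).
    """
    f = g.astype(float).copy()
    known = ~np.isnan(border)
    f[known] = border[known]          # boundary condition from the current frame
    g = g.astype(float)

    def nn_diff(a):
        # North, South, West, East nearest-neighbor differences;
        # values are replicated at the array edge so the edge difference is zero.
        n = np.vstack([a[:1], a[:-1]]) - a
        s = np.vstack([a[1:], a[-1:]]) - a
        w = np.hstack([a[:, :1], a[:, :-1]]) - a
        e = np.hstack([a[:, 1:], a[:, -1:]]) - a
        return n, s, w, e

    gn, gs, gw, ge = nn_diff(g)
    for _ in range(max_iter):
        fn, fs, fw, fe = nn_diff(f)
        # Per-direction weights follow the local |grad g - grad f|.
        upd = sum(h(np.abs(dg - df)) * (df - dg)
                  for df, dg in ((fn, gn), (fs, gs), (fw, gw), (fe, ge)))
        new_f = f + lam * upd
        new_f[known] = border[known]  # keep the decoded ring fixed
        if np.max(np.abs(new_f - f)) < eps:   # change-based stopping rule
            f = new_f
            break
        f = new_f
    return f
```

With a boundary ring identical to that of g, the update vanishes and the reference block is returned unchanged, which mirrors the observation that only boundary mismatches trigger the iteration.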
D. Special Case for the PDE-Based Refinement With
Erroneous Boundaries
In the above two subsections, we discussed how to use the
PDE-based algorithm to refine the recovered MB generated by
STBMA based on the assumption that the boundary informa-
tion of the lost MB in all four directions is known. When this
assumption is not true, for example, if some boundaries of the
lost MB are not available, we copy the corresponding bound-
aries in the reference frame using the winning MVs generated
by STBMA. If the boundaries in the reference frame are also
unavailable, the gradients of g and f at these boundaries are all
set to zero. This is reasonable since, if some boundaries of the
lost MB are not available, we do not want to utilize the boundary
information in those directions. By copying the corresponding
boundaries in the reference frame, f would be the same as g
at those boundaries. In such a case, since ∇f − ∇g = 0 there, the
weighting factor at those positions would be zero and the diffu-
sion process would not be applied in those directions.
III. EXPERIMENTAL RESULTS
Although the proposed method is general and can be applied
to any block-based video compression scheme, H.264 is uti-
lized here to evaluate the proposed algorithm. The JM9.0 refer-
ence software is used in the experiment. We compare the perfor-
mance of the proposed algorithm with the inter-frame conceal-
ment feature implemented in the reference software [3], which
is based on the classical BMA [2]. We have tested the algorithms
on six video sequences: Foreman, Table, Tempete, Paris, News,
and Carphone. For the first five sequences, both QCIF and CIF
formats are tested. For the Carphone sequence, we only test the QCIF
format since we do not have the original CIF format sequence.
The test sequences in QCIF (CIF) format are encoded at a 10 (15)
Hz frame rate. I frames are encoded every fifteen frames and
no B frames are used. Slice mode is enabled. No intra mode is
used in P frames. The quantization parameter is set to 24.
In the reference software, intra frames are concealed spatially
using weighted pixel averaging [3]. However, this algorithm is
quite ineffective and would make the recovered MB extremely
blurred. Considering the annoying error propagation problem
in the predictive coding scheme, the badly concealed MBs in I
frames would greatly degrade the following P frames. In order
to better compare the proposed algorithm with BMA, both of
which mainly aim at inter-frame concealment, we assume
that the transmission errors only occur in P frames. In order
to simulate the transmission errors, a number of slices are ran-
domly dropped in P frames according to the error pattern. In all
the following experiments, the six parameters are set to 0.5, 0.1,
0.5, 4, 40, and 0.01, respectively. And the maximum iteration count
for the diffusion process in the second stage is tested at 10, 20, 30,
50, 100, and 500.
In the first experiment, the first 100 frames of the six se-
quences are encoded using slice mode. We assume that one
slice contains one row of MBs. Packet loss rates of 5% and
10% are tested. For each packet loss rate, we simulate 20 dif-
ferent error patterns and evaluate their average performances.
The slices of P frames are first dropped according to the error
pattern. Then, the erroneously received P frames are concealed
using BMA [2], STBMA-only, and STBMA together with PDE
(in all the following descriptions and figures, STBMA means
STBMA-only, while STBMA+PDE stands for first obtaining
the reference MB using STBMA and then applying PDE to gen-
erate the final result). For STBMA+PDE, we have tested the
maximum iteration count at 10, 20, 30, 50, 100, and 500.
Notice that BMA is the method implemented in the H.264 ref-
erence software [3]. Table I shows the PSNR performance of
TABLE II
AVERAGE SYSTEM TIME USING DIFFERENT METHODS

Fig. 6. Foreman (CIF) sequence PSNR performance comparison versus the
frame number when the slice loss rate is 5% (each slice contains one row of
MBs).
the reconstructed sequences using different methods. We can
see that STBMA always performs better than BMA, while with
STBMA+PDE, the PSNR can be further improved. On the average
of all six sequences, STBMA can achieve a 0.77 dB PSNR gain at a 5%
slice loss rate and a 0.88 dB PSNR gain at a 10% slice loss rate,
respectively, compared with BMA. For some sequences, such
as Tempete (QCIF) and Paris (CIF), STBMA+PDE obtains sim-
ilar PSNR performance to STBMA. However, for other se-
quences, such as Foreman (CIF), with STBMA+PDE the PSNR
performance can be further improved by 0.15 dB. We can
also see that, for STBMA+PDE, a small maximum iteration count
is enough for the PDE step to achieve its advantages (e.g.,
improving the subjective visual quality and PSNR performance)
for most of the sequences. In Table II, we examine the time consumed
decoding the whole sequence and concealing the errors
using different methods. We can see that the complexity of the
proposed STBMA is almost the same as that of BMA, while with the PDE
step, the complexity is a little higher, but still acceptable, espe-
cially when the maximum iteration count is set to 10.
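For reference, the PSNR figures quoted throughout this section follow the standard definition based on the mean squared error against the uncorrupted frame; a minimal sketch (the function name is assumed, not from the paper):

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    reconstructed frame, for 8-bit video (peak = 255)."""
    ref = np.asarray(ref, dtype=float)
    rec = np.asarray(rec, dtype=float)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float('inf')          # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```

The per-sequence averages in Table I are obtained by averaging this quantity over the concealed frames and the 20 simulated error patterns.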
The major advantage of the PDE step is to generate better
visual quality. The visual quality of the recovered frames of
the Carphone (QCIF) sequence is illustrated in Fig. 9, where
(a) and (b) are the original and damaged frame, respectively.
Fig. 9(c) shows the result obtained by BMA. Fig. 9(d) shows
Fig. 7. Foreman (CIF) sequence PSNR performance comparison versus the
frame number when the slice loss rate is 5% (Dispersed FMO mode is used
in the encoder and each slice contains one half of the frame).

Fig. 8. Foreman (CIF) sequence PSNR performance comparison versus the
frame number when the slice loss rate is 5% (Dispersed FMO mode is used
in the encoder and each slice contains one half of the frame).
the recovered frame using STBMA, and the result generated
by STBMA+PDE is shown in Fig. 9(e). From Fig. 9(c), (d),
and (e), we can see that, compared with BMA, STBMA can
better reconstruct the lost MBs in the region marked by the red
Fig. 9. Subjective quality comparison for the Carphone (QCIF) sequence at a 5% slice loss rate (each slice contains one row of MBs). (a) Original frame;
(b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (24.84 dB); (d) concealed using STBMA (24.85 dB); and
(e) concealed using STBMA+PDE (24.88 dB).

Fig. 10. Subjective quality comparison for the Foreman (CIF) sequence at a 5% slice loss rate (each slice contains one row of MBs). (a) Original frame; (b)
damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (33.08 dB); (d) concealed using STBMA (34.98 dB); and (e)
concealed using STBMA+PDE (35.10 dB).
ellipse. However, STBMA still introduces some blocking arti-
facts. The PDE step can greatly reduce the blocking artifacts
and generate the best visual quality, which fully demonstrates
the effectiveness of the proposed scheme.
Fig. 6 shows the PSNR performance of the Foreman (CIF)
sequence versus the frame number. We can see that
STBMA performs significantly better than BMA, while
with STBMA+PDE, the PSNR performance can be further
improved. In Fig. 10, we examine the visual quality of the
results obtained using different methods. Fig. 10(a) and (b)
show the original and damaged frames, and (c)-(e) show the
results generated by BMA, STBMA, and STBMA+PDE,
Fig. 11. Subjective quality comparison for the Foreman (CIF) sequence at a 5% slice loss rate (Dispersed FMO mode is used in the encoder and each slice
contains one half of the frame). (a) Original frame; (b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (32.43
dB); (d) concealed using STBMA (32.85 dB); and (e) concealed using STBMA+PDE (33.76 dB).

Fig. 12. Subjective quality comparison between the isotropic version and the anisotropic version of the proposed algorithm: (a) damaged frame with randomly
dropped MBs; (b) concealed using the method proposed in [10]; (c) concealed using the isotropic version of the proposed algorithm (c(x, y) = 1); and (d) concealed
using the anisotropic version of the proposed algorithm (c(x, y) = h(|∇g − ∇f|)).
respectively. We can see that STBMA+PDE can obtain the
best visual quality with the fewest blocking artifacts, especially in
the red ellipse regions. We should notice that the artifacts are
not only introduced by the concealment error of the current
frame, but also by the propagation error of previous frames
due to motion compensation.
We also examine the algorithms when flexible macroblock
ordering (FMO) is adopted. In this paper, we use the Dispersed FMO
mode, in which even and odd MBs are encoded in different
slices. The Foreman (CIF) sequence is used in this experiment. The
slice loss rate is assumed to be 5%. The PSNR performances
are shown in Fig. 7. We can see that the proposed algorithm can
obtain a significant PSNR gain compared with BMA. The visual
quality is exemplified in Fig. 11. It is obvious that both BMA and
STBMA introduce severe blocking artifacts. However, the PDE
step can dramatically reduce the blocking artifacts and achieve
pleasant visual quality.
In the third experiment, we assume that the data partitioning tech-
nique is adopted and motion vectors are correctly received at
the decoder. Therefore, it is possible for us to use the correctly
received MVs to recover the lost MBs. Here, we use the MVs
of the top-left blocks as the MVs of the whole MBs. We im-
plement two methods: one is motion compensation using the
correct MVs and taking the pixel values of the reference MB
as the recovered values for the lost data; the other is the pro-
posed method, in which we first use the correct MVs to gen-
erate the reference MB and then perform the PDE-based algorithm.
As shown in Fig. 8, our proposed algorithm can achieve better
PSNR performance.
In the fourth experiment, we evaluate the effect of the
weighting factor adopted in the proposed algorithm. As we
mentioned before, if we set all the weights to 1 (c(x, y) = 1),
the proposed method becomes the isotropic version and is sim-
ilar to the method proposed in [16]. Fig. 12(c) and (d) illustrate
Fig. 13. Subjective quality comparison between BMA and STBMA with the full search algorithm. (a) Damaged frame with randomly dropped slices; (b) concealed
using BMA with full search; and (c) concealed using the proposed STBMA with full search.
the results generated by the isotropic and anisotropic versions of
the proposed algorithm, respectively. We can see that the isotropic
version causes "bleeding" artifacts at the boundary of the white
hat, as shown in the red ellipse region in Fig. 12(c). This is
partly because the corresponding gradient in the vector field
∇g is small there while the actual gradient there should be
slightly larger. In this case, the difference between ∇g and ∇f
is larger than the threshold. In our anisotropic version, the
weighting factor is kept small, so the initial gradient of f is
preserved. However, in the isotropic version, c(x, y) is as large as at
other positions and the pixel values located at the edge
of the white hat bleed into the reconstructed MB.
We also evaluate the advantage of using the guided gradient
field ∇g. We implement the algorithm proposed in [10], which
is similar to our proposed algorithm but without the guide field
(∇g = 0). As shown in Fig. 12(b), without the guided field, the
reconstructed frame becomes a cartoon-like image. Notice again that
the artifacts not only come from the restoration of the lost MBs
in the current frame, but also from the propagation error of
previous frames due to motion compensation.
In the last experiment, we examine the potential capability of
the proposed STBMA to reconstruct high continuity for edges
or lines crossing a lost and a correctly received slice. Due to
complexity considerations, in all the experiments above, the can-
didate MVs we consider in STBMA only include the zero MV and
the MVs of the adjacent neighboring MBs. In this experiment, we try the
full search algorithm for STBMA with a search range of 64 × 64.
The results are shown in Fig. 13. We can see that with the full
search algorithm, the proposed STBMA algorithm can generate
comparable results to Lie and Gao's method [6, Fig. 5(e)].
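A minimal full-search boundary-matching loop of the kind used in this experiment can be sketched as follows. This sketch reproduces only a classic external-boundary SAD term; STBMA's actual spatio-temporal cost, which additionally includes a spatial smoothness term defined earlier in the paper, is omitted, and all names are illustrative:

```python
import numpy as np

def full_search_bma(cur, ref, top, left, mb=16, rng=32):
    """Simplified full-search boundary matching for a lost MB.

    cur, ref   : current and reference frames (2-D luma arrays).
    (top, left): top-left corner of the lost MB.
    Returns the winning MV as (dy, dx)."""
    h, w = ref.shape
    # External one-pixel boundary of the lost MB in the current frame.
    ext_top = cur[top - 1, left:left + mb]
    ext_bot = cur[top + mb, left:left + mb]
    ext_lft = cur[top:top + mb, left - 1]
    ext_rgt = cur[top:top + mb, left + mb]

    best, best_mv = np.inf, (0, 0)
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            t, l = top + dy, left + dx
            if t < 1 or l < 1 or t + mb >= h or l + mb >= w:
                continue  # candidate block plus its ring must lie inside ref
            # SAD between the candidate's external ring in the reference
            # frame and the lost MB's external ring in the current frame.
            cost = (np.abs(ref[t - 1, l:l + mb] - ext_top).sum()
                    + np.abs(ref[t + mb, l:l + mb] - ext_bot).sum()
                    + np.abs(ref[t:t + mb, l - 1] - ext_lft).sum()
                    + np.abs(ref[t:t + mb, l + mb] - ext_rgt).sum())
            if cost < best:
                best, best_mv = cost, (dy, dx)
    return best_mv
```

With `rng=32` this evaluates the full 64 × 64 search window of the experiment; restricting the candidate set to the zero MV and the neighboring MBs' MVs recovers the low-complexity variant used in the earlier experiments.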
IV. CONCLUSION
In this paper, we have developed a novel two-stage error con-
cealment scheme for compressed video sequences which are
corrupted during transmission. In the first stage, we propose a
novel spatio-temporal boundary matching algorithm (STBMA)
to recover the MVs for the lost MBs. By using the recovered
MVs, we find the reference MB in the reference frame for
each lost MB. Then, in the second stage, we use a PDE-based
algorithm to refine the reconstruction of the lost pixels. It works
by minimizing the weighted difference between the gradient
of the lost MB and that of the reference MB obtained in the
first stage under a given boundary condition. A well-chosen
weighting factor is used to control the regulation level ac-
cording to the local blockiness degree. The simulation results
fully demonstrate the superiority of the proposed algorithm over
the inter-frame concealment feature implemented in the H.264
reference software, which is based on the traditional BMA. The
proposed scheme can effectively reduce the blocking artifacts
and well preserve the inner structure of the recovered MBs. It
can also prevent the undesirable bleeding effect introduced by
the isotropic scheme.
REFERENCES
[1] P. Haskell and D. Messerschmitt, "Resynchronization of motion compensated video affected by ATM cell loss," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), San Francisco, CA, 1992, vol. 3, pp. 545-548.
[2] W. M. Lam, A. R. Reibman, and B. Liu, "Recovery of lost or erroneously received motion vectors," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1993, vol. 3, pp. 417-420.
[3] Y. K. Wang, M. M. Hannuksela, V. Varsa, A. Hourunranta, and M. Gabbouj, "The error concealment feature in the H.26L test model," in Proc. IEEE Int. Conf. Image Processing, 2002, pp. 729-732.
[4] J. H. Zheng and L. P. Chau, "A temporal error concealment algorithm for H.264 using Lagrange interpolation," in Proc. IEEE Int. Symp. Circuits Syst., 2004, pp. 133-136.
[5] Z. W. Gao and W. N. Lie, "Video error concealment by using Kalman-filtering technique," in Proc. IEEE Int. Symp. Circuits Syst., 2004, pp. 69-72.
[6] W. N. Lie and Z. W. Gao, "Video error concealment by integrating greedy suboptimization and Kalman filtering techniques," IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 982-992, 2006.
[7] Y. Chen, X. Sun, F. Wu, Z. Liu, and S. Li, "Spatio-temporal video error concealment using priority-ranked region-matching," in Proc. IEEE Int. Conf. Image Processing, 2005, pp. 1050-1053.
[8] L. Atzori, F. G. B. De Natale, and C. Perra, "A spatio-temporal concealment technique using boundary matching algorithm and mesh-based warping (BMA-MBW)," IEEE Trans. Multimedia, vol. 3, no. 3, pp. 326-338, Sep. 2001.
[9] M. Y. Shen and C. C. J. Kuo, "Review of postprocessing techniques for compression artifact removal," J. Visual Commun. Image Represent., pp. 2-14, 1998.
[10] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, no. 7, pp. 629-639, Jul. 1990.
[11] S. Yang and Y. H. Hu, "Coding artifacts removal using biased anisotropic diffusion," in Proc. IEEE Int. Conf. Image Processing, 1997, pp. 346-349.
[12] Y. Wang, Q. F. Zhu, and L. Shaw, "Maximally smooth image recovery in transform coding," IEEE Trans. Commun., vol. 41, pp. 1544-1551, 1993.
[13] S. Shirani, F. Kossentini, and R. Ward, "Error concealment methods, a comparative study," in Proc. IEEE Canadian Conf. Electr. Comput. Eng., Edmonton, AB, Canada, 1999, vol. 2, pp. 835-840.
[14] A. Gothandaraman, R. T. Whitaker, and J. Gregor, "Total variation for the removal of blocking effects in DCT based encoding," in Proc. IEEE Int. Conf. Image Processing, 2001, pp. 455-458.
[15] F. Alter, S. Durand, and J. Froment, "DCT-based compressed images with weighted total variation," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2004, pp. 221-224.
[16] P. Perez, M. Gangnet, and A. Blake, "Poisson image editing," in Proc. ACM SIGGRAPH, 2003, pp. 313-318.
[17] Y. Chen, O. Au, C.-W. Ho, and J. Zhou, "Spatio-temporal boundary matching algorithm for temporal error concealment," in Proc. IEEE Int. Symp. Circuits Syst., 2006, pp. 686-689.
[18] Y. Hu, Y. Chen, H. Li, and C. W. Chen, "An improved spatio-temporal video error concealment algorithm using partial differential equation," Proc. SPIE Multimedia Syst. Applicat. VIII, pp. 150-160, 2005.
[19] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, "Image inpainting," in Proc. ACM SIGGRAPH, 2000, pp. 417-425.
Yan Chen (S'06) received the Bachelor's degree
from the University of Science and Technology of
China (USTC) in 2004 and the M.Phil. degree from
the Hong Kong University of Science and Tech-
nology (HKUST) in 2007. He is currently pursuing
the Ph.D. degree in the Department of Electrical
and Computer Engineering, University of Maryland,
College Park.
He was an intern in the Internet Media Group,
Microsoft Research Asia (MSRA), from July to
October 2004. His current research interests are in
image/video coding and processing, wireless communication and networking,
and computer vision.
Yang Hu received the Bachelor's degree from the
University of Science and Technology of China in
2004. She is currently pursuing the Ph.D. degree in
the Electronic Engineering and Information Science
Department, University of Science and Technology
of China. Since August 2005, she has been a visiting
student with Microsoft Research Asia.
Her current research interests are in multimedia
signal processing, multimedia information retrieval,
machine learning, and pattern recognition.
Oscar C. Au (S'87-M'90-SM'01) received the
B.A.Sc. degree from the University of Toronto,
Toronto, ON, Canada, in 1986, and the M.A. and
Ph.D. degrees from Princeton University, Princeton,
NJ, in 1988 and 1991, respectively.
After being a Postdoctoral Researcher at Princeton
University for one year, he joined the Department of
Electrical and Electronic Engineering, Hong Kong
University of Science and Technology (HKUST), in
1992. He is now an Associate Professor, Director of
the Multimedia Technology Research Center (MTrec),
and Director of the Computer Engineering (CPEG) Program at HKUST. His
main research contributions are on video and image coding and processing,
watermarking and steganography, and speech and audio processing. Research
topics include fast motion estimation for MPEG-1/2/4, H.261/3/4 and AVS,
optimal and fast suboptimal rate control, mode decision, transcoding, de-
noising, deinterlacing, post-processing, JPEG/JPEG2000 and halftone image
data hiding, etc. He has published over 130 technical journal and conference
papers. His fast motion estimation algorithms were accepted into the ISO/IEC
14496-7 MPEG-4 international video coding standard and the China AVS-M
standard. He has three U.S. patents and is applying for 20+ more on his signal
processing techniques. He has performed forensic investigation and has stood
as an expert witness in the Hong Kong courts many times.
Dr. Au has been an Associate Editor of the IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS, PART 1 and the IEEE TRANSACTIONS ON CIRCUITS
AND SYSTEMS FOR VIDEO TECHNOLOGY. He is the Chairman of the Technical
Committee on Multimedia Systems and Applications and a member of the
Technical Committee on Video Signal Processing and Communications and the
Technical Committee on DSP of the IEEE Circuits and Systems Society. He
served on the Steering Committee of IEEE TRANSACTIONS ON MULTIMEDIA
and the IEEE International Conference on Multimedia and Expo (ICME). He
also served/will serve on the organizing committee of the IEEE International
Symposium on Circuits and Systems (ISCAS) in 1997, the IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) in 2003,
the ISO/IEC MPEG 71st Meeting in 2004, the International Conference on
Image Processing (ICIP) in 2010, and other conferences.
Houqiang Li received the B.S., M.S., and Ph.D.
degrees in 1992, 1997, and 2000, respectively, all
from the Department of Electronic Engineering and
Information Science (EEIS), University of Science
and Technology of China (USTC), Hefei, China.
From 2000 to 2002, he did postdoctoral research in
Signal Detection Lab, USTC. Since 2002, he has been
on the faculty of the Department of EEIS, USTC,
where he is currently an Associate Professor. His
current research interests include image and video
coding, image processing, and computer vision.
Chang Wen Chen (F'04) received the B.S. degree
in electrical engineering from University of Science
and Technology of China in 1983, the M.S.E.E. de-
gree from the University of Southern California, Los
Angeles, in 1986, and the Ph.D. degree in electrical
engineering from the University of Illinois at Urbana-
Champaign in 1992.
He has been Allen S. Henry Distinguished Pro-
fessor in the Department of Electrical and Computer
Engineering, Florida Institute of Technology, Mel-
bourne, since July 2003. Previously, he was on the
Faculty of Electrical and Computer Engineering at the University of Missouri-
Columbia from 1996 to 2003, and at the University of Rochester, Rochester,
NY, from 1992 to 1996. From September 2000 to October 2002, he served as
the Head of the Interactive Media Group at the David Sarnoff Research Labo-
ratories, Princeton, NJ. He has also consulted with Kodak Research Labs, Mi-
crosoft Research, Mitsubishi Electric Research Labs, NASA Goddard Space
Flight Center, and Air Force Rome Laboratories.
Dr. Chen has been the Editor-in-Chief for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY since January 2006. He was
an Associate Editor for IEEE TRANSACTIONS ON MULTIMEDIA from 2002 to
2005 and for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO
TECHNOLOGY from 1997 to 2005. He was also on the Editorial Board of IEEE
Multimedia Magazine from 2003 to 2006 and was an Editor for the Journal
of Visual Communication and Image Representation from 2000 to 2005. He
has been a Guest Editor for the PROCEEDINGS OF THE IEEE (Special Issue on
Distributed Multimedia Communications), a Guest Editor for IEEE JOURNAL
OF SELECTED AREAS IN COMMUNICATIONS (Special Issue on Error-Resilient
Image and Video Transmission), a Guest Editor for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (Special Issue on Wireless
Video), a Guest Editor for the Journal of Wireless Communication and Mobile
Computing (Special Issue on Multimedia over Mobile IP), a Guest Editor for
Signal Processing: Image Communications (Special Issue on Recent Advances
in Wireless Video), and a Guest Editor for the Journal of Visual Communication
and Image Representation (Special Issue on Visual Communication in the
Ubiquitous Era). He has also served in numerous technical program committees
for numerous IEEE and other international conferences. He was the Chair
of the Technical Program Committee for ICME2006 held in Toronto, ON,
Canada. He was elected an IEEE Fellow for his contributions in digital image
and video processing, analysis, and communications and an SPIE Fellow for
his contributions in electronic imaging and visual communications. He has
received research awards from NSF, NASA, Air Force, Army, DARPA, and the
Whitaker Foundation. He also received the Sigma Xi Excellence in Graduate
Research Mentoring Award from the University of Missouri-Columbia in
2003. Two of his Ph.D. students have received Best Paper Awards in visual
communication and medical imaging, respectively.
... Page Figure 1.1 Relationship between the boundaries of the missing block in the current frame and those of the reference block in the reference frame (Chen et al., 2008). Representation of a standard decoding system (dotted lines) and the joint source channel decoder used in method (Lakovic et al., 1999). ...
... Assuming that a good candidate match is found in the reference frame, its pixel values are then duplicated in the current frame to replace the missing block. Figure 1.1 Relationship between the boundaries of the missing block in the current frame and those of the reference block in the reference frame (Chen et al., 2008). ©2008, IEEE ...
... As temporal correlation exists in natural video contents, the motion vector from the missing part of the frame can be predicted from previous frames. Error concealment algorithms can also combine both spatial and temporal concealment to achieve more accurate results (Atzori et al., 2001;Chen et al., 2008). Traditional error concealment mainly use interpolation from neighboring content to reconstruct the video, while the most recent error concealment solutions use machine learning to recover large missing areas (Sankisa et al., 2018;Kim et al., 2019;Wang et al., 2019). ...
Thesis
Video content transmission constitute the main category of data transmitted in the world nowadays. The quality of the transmitted content is ever increasing, thanks to the deployment of networks able to support huge traffic loads at high speeds, along with strategies to reduce the amount of data necessary to carry video information, based on more efficient video encoders. However, the quality of the video stream perceived by the end user can be greatly degraded by transmission errors. In fact, a packet can either be corrupted or lost during the transmission due to channel impairments, which result in missing video information that must be recovered. Several strategies exist to recover such information. Retransmission of the damaged packet can be performed. However, this option is not always valid under real time constraints as in video streaming, or to avoid increasing the global network load. To recover missing information, error correction methods can be applied at the receiver’s side. In this thesis, we propose error correction methods at the receiver’s side based on the properties of the widely used error detection code Cyclic Redundancy Check (CRC). These methods use the syndrome of a corrupted packet computed at the receiver to produce the exhaustive list of error patterns that could have resulted in such syndrome, containing up to a defined number of errors. We propose different approaches to achieve such error correction. First, we present an arithmetic-based approach which performs logical operations (XORs) on the fly and does not need memory storage to operate. The second approach we propose is an optimized table approach, in which the repetitive computations of the first method are precomputed prior to the communication and stored in efficiently constructed tables. It allows this method to be significantly faster at the cost of memory storage. 
The error correction validation is performed through a two-step process, which cross-checks the candidate list with another error detection code, the checksum, and then validates the syntax of the encoded packet to test its decodability. We test these new methods with wireless transmission simulations of H.264 and HEVC compressed video content over Wi-Fi 802.11p and Bluetooth Low energy channels. The latter allows the most significant error correction rates and the reconstruction of a near-optimal video even when the channel’s quality starts to decrease.
... Choosing the best replacement of the lost MB considering both OBMA and spatial smoothness has been presented in [59]. The spatial continuity is preserved by minimizing the average changes of the Laplacian estimator along the tangent direction, which measures the continuity of the isophotes at the boundaries. ...
... With the methods of [55], [58] more processing (particle filtering, SOM, respectively) is applied on the MVs. The method presented in [59] applies Laplacian estimator on the boundary pixels. ...
Article
Full-text available
Despite of the recent progresses in reliable and high bandwidth communication, packet loss is still probable and needs special attention in real-time video streaming applications. Congestion and bit error rate, which sometimes are more than the protection capability of the channel codes, are the sources of packet loss in video communication. One common approach to deal with video packet loss is to use error concealment techniques, which estimate the non-received data as close as possible to the actual data. This article reviews the temporal video error concealment methods that have been developed over the past 30 years. The techniques are categorized into 8 groups, and the methods are covered with enough details. The strengths and weaknesses of the 8 groups are also tabulated, and some suggestions for future work and open areas for research are provided.
... The proposed scheme can help protect video content both against unauthorized access and transmission errors while maintaining the video quality similar to that of the original video. Table 5 is a PNSR-based quantitative comparison of the proposed scheme with: stateof-the-art error correction by STBMA [67]; frame copy concealment by JM (JM-FC) [68]; and other recently proposed approaches [69]. Detailed results (see Table 5 column 6) show that the proposed scheme outperformed other techniques over all test videos. ...
Article
Full-text available
Mobile multimedia communication requires considerable resources, such as bandwidth and efficiency, to support Quality-of-Service (QoS) and user Quality-of-Experience (QoE). To increase the available bandwidth, 5G network designers have incorporated Cognitive Radio (CR), which can adjust communication parameters according to the needs of an application. Transmission errors occur in wireless networks and, without remedial action, result in degraded video quality. Secure transmission is also a challenge for such channels. Therefore, this paper's innovative scheme "VQProtect" focuses on protecting the visual quality of compressed videos by detecting and correcting channel errors while at the same time maintaining end-to-end video confidentiality so that the content remains unwatchable. For this purpose, a two-round security process is applied to selected syntax elements of the compressed H.264/AVC bitstreams. To uphold the visual quality of data affected by channel errors, a computationally efficient Forward Error Correction (FEC) method using Random Linear Block coding (with complexity O(k(n−1))) is implemented to correct the erroneous data bits, effectively eliminating the need for retransmission. Errors affecting an average of 7–10% of the video data bits were simulated with the Gilbert–Elliot model; experimental results demonstrated that 90% of the resulting channel errors were recoverable by correctly inferring the values of the erroneous bits. The proposed solution's effectiveness over selectively encrypted and error-prone video has been validated through a range of Video Quality Assessment (VQA) metrics.
... In [30], however, neural networks are exploited to track the variations of the MVs across consecutive frames and reduce the estimation noise. Chen et al. [31] proposed a two-stage EC approach comprising a spatio-temporal boundary matching algorithm that estimates an MV for the degraded MB, followed by a partial differential equation (PDE) based refinement. In the first stage, the MV of the degraded MB is estimated by exploiting both the temporal and spatial smoothness properties of the video sequence in a weighted manner. ...
Article
Full-text available
In order to enhance the accuracy of motion vector (MV) estimation and reduce error propagation during the estimation, this paper proposes a new adaptive error concealment (EC) approach based on information extracted from the video scene. The motion information of the video scene around the degraded MB is first analyzed to estimate the motion type of the degraded MB. If the neighboring MBs possess uniform motion, the degraded MB imitates the behavior of its neighbors by adopting the MV of the collocated MB. Otherwise, the lost MV is estimated through the second proposed EC technique (i.e., IOBMA). In the IOBMA, unlike conventional boundary-matching-criterion-based EC techniques, each boundary distortion is evaluated with respect to both the luminance and chrominance components of the boundary pixels, and the total boundary distortion for each candidate MV is calculated as the weighted average of the available boundary distortions. Compared with state-of-the-art EC techniques, the simulation results indicate the superiority of the proposed EC approach in terms of both objective and subjective quality assessments.
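A minimal sketch of the weighted boundary-matching idea summarized above, in Python with NumPy. The channel weights, side weights, and helper names (`boundary_distortion`, `best_mv`, `fetch_boundary`) are illustrative assumptions, not the paper's definitions:

```python
import numpy as np

def boundary_distortion(cand, neigh, ch_weights=(0.6, 0.2, 0.2)):
    """Mean absolute difference between a candidate block's outer
    boundary pixels and the adjacent decoded pixels, computed per
    channel (Y, Cb, Cr) and combined with per-channel weights
    (weights are assumed, not taken from the paper)."""
    return sum(w * np.abs(np.asarray(cand[ch], float) -
                          np.asarray(neigh[ch], float)).mean()
               for ch, w in zip(("Y", "Cb", "Cr"), ch_weights))

def best_mv(candidates, fetch_boundary, decoded, side_weights):
    """Pick the candidate MV minimizing the weighted average of the
    boundary distortions over the available sides (top/bottom/...)."""
    def cost(mv):
        total = sum(w * boundary_distortion(fetch_boundary(mv, side),
                                            decoded[side])
                    for side, w in side_weights.items())
        return total / sum(side_weights.values())
    return min(candidates, key=cost)
```

Here `fetch_boundary(mv, side)` would return the per-channel boundary pixels of the motion-compensated candidate block; only the sides with correctly decoded neighbors contribute to the weighted average.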
Article
Full-text available
As the demand for video transmission over communication networks has grown rapidly, data compression and error correction in video processing have improved significantly. When an error occurs in a single frame, the visual quality of the subsequent frames degrades due to error propagation, so error control techniques are required for recovery. Concealment of errors at the receiver (decoder) side exploits the spatial and temporal characteristics of the frame; without requiring extra bandwidth or retransmission delay, it enhances the quality of the reconstructed video. However, the output of the error concealment may be affected if the error located beforehand is misleading, so error detection also plays an important role in reconstructing the video. This paper proposes an error detection and concealment approach for the recovery of lost macroblocks (MB) in video. Spatio-temporal techniques are used for error detection, followed by an MB type decision applied to classify the damaged macroblock. For concealment, a new method, the Modified Spatio-Temporal Boundary Matching Algorithm (MSTBMA), is proposed. The proposed work is compared with various existing methods for spatial and temporal error concealment. The comparison covers various types of error, such as block errors (single, multiple), burst errors, and random errors generated by the software. Performance improves in terms of PSNR and visual quality by considering the type of lost MB.
Chapter
Video transmission over wired or wireless channels such as the Internet is an active area of research because of its fast growth. Packets are more likely to be lost in a wireless medium. In existing video recovery methods, retransmitting packets introduces delay and redundant data adds overhead. Video Error Concealment (VEC) is a method for minimizing the errors introduced into a video by transmission errors or added noise. Error concealment operates in different domains: temporal, spatial, and spatio‐temporal. It can be achieved with algorithms such as the Boundary Matching Algorithm, Frequency Selective Extrapolation, and Patch Matching. The proposed method is a novel method in the spatio‐temporal domain that can significantly improve subjective and objective video quality; hence, a spatio‐temporal algorithm is adopted over the other domains. Among the many algorithms for VEC, optimized algorithms should be used to obtain better video quality. Particle Swarm Optimization (PSO) is one of the best optimized bio‐inspired algorithms and can be used to conceal errors in different video formats. Correlation is used to detect errors in the videos, and each erroneous frame is concealed using the PSO algorithm in MATLAB. The method was tested on different standard videos with different types and varieties of errors: single, multiple, and sequential. In comparison with the erroneous videos, PSNR, SSIM, and entropy improved for the concealed videos, while MSE decreased; the results clearly indicate an improvement in video quality. Errors in video should be recovered because video is used in many applications, such as Internet video streaming, mobile phones, TV, and video conferencing, and in medical areas such as MRI and satellite transmission.
Article
Cyclic redundancy checks (CRC) are widely used in transmission protocols to detect whether errors have altered a transmitted packet. It has been demonstrated in the literature that CRC can also be used to correct transmission errors. In this paper, we propose an improvement of the state-of-the-art CRC-based error correction method. The proposed approach is designed to significantly increase the error correction capabilities of the previous method by handling a greater share of error cases through the management of candidate lists and additional validations. Simulations and results for wireless video communications over 802.11p and Bluetooth Low Energy illustrate the Peak Signal-to-Noise Ratio (PSNR) and visual quality gains achieved with the proposed approach versus the state-of-the-art and traditional approaches. These gains average 1.6 dB and 7.3 dB over Bluetooth Low Energy channels with Eb/N0 ratios of 10 dB and 8 dB, respectively.
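As a toy illustration of the CRC-based correction idea (far simpler than the candidate-list scheme above), one can brute-force a single flipped bit until the CRC-32 check passes. The function name is assumed; practical schemes rank multiple candidates and cross-check them with further validations, such as a checksum and a decodability test of the bitstream:

```python
import zlib

def crc_correct_single_bit(packet, expected_crc):
    """If the CRC-32 check fails, flip each bit of the packet in turn
    and re-test; return the repaired packet, or None if no single
    flip makes the CRC match (i.e. more than one bit was corrupted)."""
    if zlib.crc32(bytes(packet)) == expected_crc:
        return bytes(packet)              # packet already consistent
    for i in range(len(packet) * 8):
        trial = bytearray(packet)
        trial[i // 8] ^= 1 << (i % 8)     # flip bit i
        if zlib.crc32(bytes(trial)) == expected_crc:
            return bytes(trial)
    return None
```

The search is O(n) CRC evaluations for an n-bit packet; extending it to double-bit errors is what makes candidate-list management and extra validation necessary in the full method.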
Conference Paper
Full-text available
In this paper, a novel temporal error concealment algorithm, called the spatio-temporal boundary matching algorithm (STBMA), is proposed to recover information lost during video transmission. Unlike the classical boundary matching algorithm (BMA), which considers only the spatial smoothness property, the proposed algorithm introduces a new distortion function that exploits both the spatial and temporal smoothness properties to recover the lost motion vector (MV) from the candidates. The new distortion function involves two terms: a spatial distortion term and a temporal distortion term. Since both the spatial and temporal smoothness properties are involved, the proposed method can better minimize the distortion of the recovered block and recover a more accurate MV. The proposed algorithm has been tested on the H.264 reference software JM 9.0. The experimental results demonstrate that the proposed algorithm obtains better PSNR performance and visual quality compared with the BMA adopted in H.264.
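The two-term distortion can be sketched as follows. The exact pixel sets, the blending weight `alpha`, and the function name are illustrative assumptions rather than the paper's definitions:

```python
import numpy as np

def stbma_cost(cand_block, top, left, ref_top, ref_left, alpha=0.5):
    """Toy two-term distortion in the spirit of STBMA.
    Spatial term: mismatch between the candidate block's outer row and
    column and the decoded neighbor pixels (top, left) in the current
    frame.  Temporal term: mismatch between those neighbor pixels and
    the pixels bordering the candidate block in the reference frame
    (ref_top, ref_left)."""
    spatial = (np.abs(cand_block[0, :] - top).mean() +
               np.abs(cand_block[:, 0] - left).mean())
    temporal = (np.abs(top - ref_top).mean() +
                np.abs(left - ref_left).mean())
    return alpha * spatial + (1.0 - alpha) * temporal
```

The recovered MV is then the candidate minimizing this cost, e.g. `min(candidate_mvs, key=lambda mv: stbma_cost(*blocks_for(mv)))` with a suitable helper to fetch the blocks.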
Conference Paper
Full-text available
When transmitted over error-prone networks, compressed video sequences may be received with errors. In this paper, we propose a priority-ranked region-matching algorithm to recover the "lost" area of the decoded frames, in which both temporal and spatial correlations of the video sequence are exploited. In the proposed scheme, we first calculate the priorities of all edge pixels of the "lost" area and generate a priority-ranked region group. Then, according to their priorities, the regions in the group search for their best matching regions temporally and spatially. Finally, the "lost" area is recovered progressively from the corresponding pixels in the matching regions. Experimental results show that the proposed scheme achieves higher PSNR as well as better video quality in comparison with the method adopted in H.264.
Conference Paper
Full-text available
This paper presents the error concealment (EC) feature implemented by the authors in the test model of the draft ITU-T video coding standard H.26L. The selected EC algorithms are based on weighted pixel value averaging for INTRA pictures and boundary-matching-based motion vector recovery for INTER pictures. The specific concealment strategy and some special methods, including the handling of B-pictures, multiple reference frames, and entire frame losses, are described. Both subjective and objective results are given based on simulations under Internet conditions. The feature was adopted and is now included in the latest H.26L reference software, TML-9.0.
Article
The application of error concealment in video communication is very important when compressed video sequences are transmitted over error-prone networks and erroneously received. In this paper, we propose a novel error concealment scheme, in which the concealment problem is formulated as minimizing, in a weighted manner, the difference between the gradient of the reconstructed data and a prescribed vector field under given boundary condition. Instead of using the motion compensated block as the final recovered pixel values, we use the gradient of the motion compensated block together with the surrounding correctly decoded pixels of the damaged block to reconstruct the lost data. Both temporal and spatial correlations of the video signals are exploited in the proposed scheme. A well designed weighting factor is used to control the regulation level at a desired direction according to the local blockiness degree at the boundaries of the recovered block. The experimental results show that the proposed algorithm is able to achieve higher PSNR as well as better visual quality in comparison with the error concealment feature implemented in the H.264 reference software. The blocking effects are greatly alleviated while the structural information in the interior of the recovered block is well preserved.
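A rough sketch of this gradient-domain reconstruction, solved with Jacobi iterations. The weighting scheme `w` and all names are assumptions standing in for the paper's blockiness-controlled weighting factor:

```python
import numpy as np

def poisson_reconstruct(u0, gx, gy, mask, w=None, iters=500):
    """Jacobi-iteration sketch of gradient-domain error concealment:
    pixels where `mask` is True (the damaged block) are re-solved so
    their gradient approaches the guidance field (gx, gy) taken from
    the motion-compensated reference block, while pixels outside the
    mask (correctly decoded) act as the Dirichlet boundary condition.
    `w` in [0, 1] mimics a per-pixel regulation level by blending each
    update with the initial copied value; its form is an assumption."""
    u = u0.astype(float).copy()
    if w is None:
        w = np.ones_like(u)
    # discrete divergence of the guidance field
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    for _ in range(iters):
        nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
              np.roll(u, 1, 1) + np.roll(u, -1, 1))
        upd = (nb - div) / 4.0            # Jacobi step for Δu = div g
        u[mask] = (w * upd + (1 - w) * u0)[mask]
    return u
```

With a zero guidance field this degenerates to harmonic inpainting from the boundary; with the reference block's gradients it keeps the reference structure while forcing seam-free boundaries, which is what suppresses the blocking artifacts.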
Article
Techniques for resynchronizing motion-compensation-based coders and strategies for the recovery of lost motion vectors are discussed. Leaky-difference resynchronization yields perceptually pleasing video sequences even at fairly high cell loss rates. Future study is needed to determine optimal data-dependent or network-state-dependent conditional resynchronization strategies. Lost motion vectors can be predicted accurately with either the median of intraframe neighboring vectors or the corresponding past-frame vector. The replacement of lost motion vectors with estimates such as these can significantly improve the quality of video affected by cell loss.
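The two MV replacement strategies named above can be sketched as follows; the fallback order is an assumption:

```python
import numpy as np

def recover_mv(neighbor_mvs, colocated_past_mv=None):
    """Replace a lost motion vector with the component-wise median of
    the available intraframe neighbors' MVs; when no neighbor survived,
    fall back to the co-located MV from the previous frame."""
    if neighbor_mvs:
        return tuple(np.median(np.asarray(neighbor_mvs, float), axis=0))
    return colocated_past_mv
```

The component-wise median is robust to a single outlier neighbor, which is why it tends to beat plain averaging for this task.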
Conference Paper
Biased anisotropic diffusion is applied to the removal of coding artifacts from DCT-based codecs. It is formulated as a cost minimization problem. The weighting factors of the cost function are controlled such that the solution removes the blocking effect and conceals the block losses. It has an advantage over other postprocessing schemes because it handles the discontinuities of the image, smoothes the image selectively, and takes visual masking into account. The features needed for the weighting factors are extracted directly from the DCT coefficients to reduce the computational complexity.
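Plain Perona–Malik diffusion conveys the core mechanism (the paper's biased, DCT-weighted variant is not reproduced here); all parameter values are illustrative:

```python
import numpy as np

def anisotropic_diffusion(img, steps=20, kappa=30.0, lam=0.2):
    """Perona–Malik diffusion: each pixel is relaxed toward its four
    neighbors, but the conduction coefficient shrinks where the local
    gradient is large, so strong edges are preserved while low-contrast
    blocking artifacts are smoothed away."""
    u = img.astype(float).copy()
    c = lambda d: np.exp(-(d / kappa) ** 2)   # conduction coefficient
    for _ in range(steps):
        dn = np.roll(u, -1, 0) - u            # differences to neighbors
        ds = np.roll(u, 1, 0) - u
        de = np.roll(u, -1, 1) - u
        dw = np.roll(u, 1, 1) - u
        u += lam * (c(dn) * dn + c(ds) * ds + c(de) * de + c(dw) * dw)
    return u
```

The biased variant adds a data-fidelity term to the update so the result stays close to the decoded image; here only the smoothing half is shown.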
Article
Low bit rate image/video coding is essential for many visual communication applications. When bit rates become low, most compression algorithms yield visually annoying artifacts that highly degrade the perceptual quality of image and video data. To achieve high bit rate reduction while maintaining the best possible perceptual quality, postprocessing techniques provide one attractive solution. In this paper, we provide a review and analysis of recent developments in postprocessing techniques. Various types of compression artifacts are discussed first. Then, two types of postprocessing algorithms based on image enhancement and restoration principles are reviewed. Finally, current bottlenecks and future research directions in this field are addressed.
Article
Using generic interpolation machinery based on solving Poisson equations, a variety of novel tools are introduced for seamless editing of image regions. The first set of tools permits the seamless importation of both opaque and transparent source image regions into a destination region. The second set is based on similar mathematical ideas and allows the user to modify the appearance of the image seamlessly, within a selected region. These changes can be arranged to affect the texture, the illumination, and the color of objects lying in the region, or to make tileable a rectangular selection.
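A minimal Jacobi-iteration sketch of the seamless importing tool: inside the selected region, solve the Poisson equation Δu = Δsrc with the destination pixels as the Dirichlet boundary. Names and the solver choice are illustrative; practical implementations use direct or multigrid solvers:

```python
import numpy as np

def seamless_clone(dst, src, mask, iters=500):
    """Inside `mask`, iterate toward the solution of Δu = Δsrc with
    boundary values taken from `dst`, so the pasted region keeps the
    source's gradients but blends into the destination at the seam."""
    u = dst.astype(float).copy()
    # Laplacian of the source region = guidance field divergence
    lap = (np.roll(src, 1, 0) + np.roll(src, -1, 0) +
           np.roll(src, 1, 1) + np.roll(src, -1, 1) - 4.0 * src)
    for _ in range(iters):
        nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
              np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[mask] = ((nb - lap) / 4.0)[mask]   # Jacobi step
    return u
```

Because only gradients of the source survive, a source region brighter than its destination is automatically re-leveled to match the surrounding pixels, which is exactly the "seamless" effect.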
Conference Paper
This paper addresses the technique of error concealment for recovering video quality at decoders under transmission errors. We propose Kalman filtering as a post-processing technique for the traditional boundary matching algorithm (BMA), which estimates the motion vector (MV) of a corrupted macroblock using the boundary pixels of the top- and bottom-adjacent MBs as the reference. Because boundary pixels carry limited information, MVs estimated using BMA are mostly inaccurate. Experiments show that, with proper mathematical modeling, the Kalman filter is able to filter out the inherent noise so that the recovered MVs lead to a quality improvement of 0.4 dB∼0.72 dB for our test sequences.
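A toy scalar Kalman filter over one MV component illustrates the post-processing idea; the process and measurement noise values are illustrative assumptions, not the paper's model:

```python
def kalman_smooth(measurements, q=0.01, r=1.0):
    """Scalar Kalman filter: each BMA output z is treated as a noisy
    measurement of a slowly drifting true motion value.  q models how
    fast the true motion drifts between blocks/frames, r the BMA
    estimation noise."""
    x, p = measurements[0], 1.0   # initial state estimate and covariance
    out = [x]
    for z in measurements[1:]:
        p += q                    # predict: uncertainty grows
        k = p / (p + r)           # Kalman gain
        x += k * (z - x)          # correct with the BMA estimate z
        p *= (1.0 - k)            # uncertainty shrinks after correction
        out.append(x)
    return out
```

With r much larger than q, the gain settles at a small value, so isolated BMA outliers barely move the filtered MV, which is the claimed source of the PSNR gain.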
Conference Paper
In this paper, we propose an efficient temporal error concealment algorithm for the new coding standard H.264, which makes use of the Lagrange interpolation formula. In H.264, a 16×16 inter macroblock can be divided into different block shapes for motion estimation, and each block has its own motion vector. For natural video, the motion vectors within a small area are correlated. Since a motion vector in H.264 covers a smaller area than in previous coding standards, the correlation between neighboring motion vectors increases. We can use the Lagrange interpolation formula to construct a polynomial that describes the motion tendency of the motion vectors next to the lost motion vector, and use this polynomial to recover the lost motion vector. The simulation results show that our algorithm can efficiently improve the visual quality of corrupted videos.
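The interpolation step can be sketched per MV component; the choice of block positions as interpolation nodes is illustrative:

```python
def lagrange_mv(points, x):
    """Evaluate at position x the Lagrange polynomial passing through
    the (position, mv_component) pairs taken from blocks neighboring
    the lost one; the result is the recovered MV component."""
    val = 0.0
    for i, (xi, yi) in enumerate(points):
        term = float(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)   # Lagrange basis factor
        val += term
    return val
```

If the neighboring MVs follow a smooth trend (e.g. a linear pan), the polynomial reproduces it exactly, so the lost MV falls on the same trend line.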