IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 10, NO. 1, JANUARY 2008
Video Error Concealment Using Spatio-Temporal
Boundary Matching and Partial Differential Equation
Yan Chen, Student Member, IEEE, Yang Hu, Oscar C. Au, Senior Member, IEEE, Houqiang Li, and
Chang Wen Chen, Fellow, IEEE
Abstract—Error concealment techniques are very important for
video communication since compressed video sequences may be
corrupted or lost when transmitted over error-prone networks. In
this paper, we propose a novel two-stage error concealment scheme
for erroneously received video sequences. In the first stage, we
propose a novel spatio-temporal boundary matching algorithm
(STBMA) to reconstruct the lost motion vectors (MV). A well-defined cost function is introduced which exploits both the spatial and temporal smoothness properties of video signals. By minimizing the
cost function, the MV of each lost macroblock (MB) is recovered
and the corresponding reference MB in the reference frame is
obtained using this MV. In the second stage, instead of directly
copying the reference MB as the final recovered pixel values, we
use a novel partial differential equation (PDE) based algorithm to
refine the reconstruction. We minimize, in a weighted manner, the
difference between the gradient field of the reconstructed MB in the current frame and that of the reference MB in the reference frame under a given boundary condition. A weighting factor is used to
control the regulation level according to the local blockiness degree.
With this algorithm, the annoying blocking artifacts are effectively
reduced while the structures of the reference MB are well preserved.
Compared with the error concealment feature implemented in
the H.264 reference software, our algorithm is able to achieve
significantly higher PSNR as well as better visual quality.
Index Terms—Error concealment, H.264, motion compensation,
partial differential equation.
I. INTRODUCTION
WITH the explosive growth of the Internet and wireless networks, video services over these networks are becoming more and more popular. However, these band-limited and error-prone channels are unreliable for the transmission of video signals, especially compressed video. Although the latest video coding standards such as
Manuscript received September 24, 2006; revised August 8, 2007. The work
of Y. Chen and O. Au was supported in part by the Innovation and Technology
Commission of the Hong Kong Special Administrative Region, China under
Project GHP/033/05. The work of Y. Hu and H. Li was supported by NSFC Gen-
eral Program under Contract 60572067, NSFC General Program under Contract
60672161, and 863 Program under Contract 2006AA01Z317. The associate ed-
itor coordinating the review of this manuscript and approving it for publication
was Dr. Wenjun (Kevin) Zeng.
Y. Chen and O. C. Au are with the Department of Electronic and Com-
puter Engineering, Hong Kong University of Science and Technology, Kowloon,
Hong Kong, China (e-mail: eecyan@ust.hk; eeau@ust.hk).
Y. Hu and H. Li are with the Department of Electronic Engineering and Infor-
mation Science, University of Science and Technology of China, Hefei 230026,
China (e-mail: yanghu@ustc.edu; lihq@ustc.edu).
C. W. Chen is with the Department of Electrical and Computer Engineering,
Florida Institute of Technology, Melbourne, FL 32901 USA (e-mail: cchen@fit.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2007.911223
H.261/263/264 and MPEG 1/2/4 can achieve good compres-
sion performance, they also make the compressed video signals
extremely vulnerable to transmission errors. One packet loss
or even one bit error can make the whole slice undecodable,
which would severely degrade the visual quality of the received
video sequences. A wide range of techniques have been devel-
oped to tackle this problem. Compared with other mechanisms
such as forward error correction (FEC) and automatic
retransmission request (ARQ), error concealment has the ad-
vantages of neither consuming extra bandwidth as in FEC nor
introducing retransmission delay as in ARQ.
Many existing error concealment techniques have made use
of the inherent correlation among spatially and/or temporally
adjacent data to alleviate the influence of the decoding errors.
Spatial approaches exploit the correlation between neighboring
pixels in the same frame. They interpolate the lost coefficients
from the spatially adjacent data. Temporal approaches, on the
other hand, restore the missing area by exploiting temporal corre-
lation between neighboring frames. An important issue with this
approach is to recover the motion information of the lost blocks.
As a result, a large amount of research has focused on recovery
of motion vectors (MV). In [1], Haskell and Messerschmitt pre-
sented some simple methods for lost MV recovery. They took
zero MV, the MV of the collocated block in the reference frame,
and the average or the median of the MVs from the spatially
adjacent blocks as the candidate MVs for the lost blocks. The
well known Boundary Matching Algorithm (BMA) proposed
in [2] recovered the lost MV from the candidate MVs which
minimized the total variation between the internal boundary and
the external boundary of the reconstructed block. A variation of
this approach has been adopted in the H.26L (the early version
of H.264) test model and was described in detail in [3]. Some
more sophisticated approaches have also been proposed to better
estimate the lost MVs. For example, Zheng et al. [4] proposed to recover the lost MVs by using the Lagrange interpolation formula, while Lie and Gao proposed to find the lost MVs by jointly
optimizing the boundary distortion of the whole slice through
dynamic programming. In order to reduce the complexity, they
adopted a suboptimal alternative enhanced with an adaptive
Kalman-filtering algorithm [5], [6]. More recently, hybrid algo-
rithms have been studied that explored both spatial and temporal
correlations to obtain better recovery of the lost data. In [7], Chen et al. proposed a priority-driven region matching algorithm to exploit the spatial and temporal information. The lost area was recovered region by region, and a priority term is defined to determine the restoration order. Atzori et al. proposed a concealment
scheme [8] which first replaced the lost block using BMA, and
then applied a mesh-based warping procedure to better match
the block content with the correctly received surrounding areas.
1520-9210/$25.00 © 2008 IEEE
The aforementioned MV recovery algorithms try to recover
the lost MVs from candidates by enforcing the spatial and/or
temporal smoothness property of the image/video signals. How-
ever, they fail to avoid introducing visible blocking artifacts
in the recovered area, especially under the circumstances of
sudden scene changes as well as fast and complex movement.
Moreover, since transport prioritization has been increasingly
adopted in layered coding, which would transmit the MVs and
other important data with more protection, the MVs may be correctly received even when the motion compensated residues are lost. For example, in the data partitioned slice of the emerging
H.264 standard, the coded data is placed in three separate Data
Partitions (A, B, and C), each of which contains a subset of the
coded slice. The MVs are contained in Partition A which could
be given higher priority during transmission. In this case, in-
stead of the recovery of lost MVs, the critical problem becomes
the recovery of the lost motion compensated residue or the re-
duction of the annoying blocking artifacts.
As far as blocking artifacts, i.e., visible discontinuities at block boundaries, are concerned, one may readily turn to the post-processing techniques that have been developed to remove the blocking effect due to low bit rate video encoding. This type of artifact is visually quite similar to the blocking effects caused
by imperfect lost data reconstruction. Several approaches have
been proposed to alleviate such artifacts, most of which are based
on low pass filtering, AC prediction, projection onto convex
sets (POCS) [9] or more recently, diffusion. As the Gaussian (or
low pass) filter failed to preserve lines and edges, Perona and
Malik [10] proposed to use anisotropic diffusion as an alternative
scheme. The anisotropic diffusion scheme was implemented
via a partial differential equation (PDE) and can successfully
preserve the structural information. Yang and Hu [11] applied a biased anisotropic diffusion scheme to remove the blocking effect. Although they claimed to unify artifact removal and lost block concealment in one framework and to process them at the same time, their concealment method was exactly the same as the maximally smooth recovery method proposed in [12], which gives blurred recovered blocks, as pointed out in [13]. More recently, Gothandaraman et al. [14] proposed to use the method of total variation as an alternative to biased anisotropic diffusion, and Alter et al. [15] later presented a deblocking algorithm with weighted total variation. In
all these schemes, the deblocking problem has been formulated
as an energy minimization problem in which the gradient of
the recovered block, either in the weighted L2-norm (as in
anisotropic diffusion) or in the weighted L1-norm (as in total
variation), would be minimized. Due to the minimization of the
gradient of the recovered block, these methods would produce an
unsatisfactory, blurred interpolation. In a recent work dealing with image editing tasks, Perez et al. proposed a guided interpolation mechanism [16]. Instead of minimizing the gradient
of the unknown function, they introduced a guidance field and
minimized the difference between the gradient of the unknown
function and the guidance field. This mechanism successfully
overcame the blurring problem while ensuring the compliance
of the filled-in image and the surrounding background.
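For reference, the guided interpolation of [16] can be stated compactly: given a guidance vector field $\mathbf{v}$ over the unknown region $\Omega$, where $\partial\Omega$ is its boundary and $f^*$ the known surrounding image, it solves

```latex
\min_{f} \iint_{\Omega} |\nabla f - \mathbf{v}|^{2} \, dx\, dy
\quad \text{with } f|_{\partial\Omega} = f^{*}|_{\partial\Omega},
```

whose minimizer satisfies the Poisson equation $\triangle f = \operatorname{div}\,\mathbf{v}$ over $\Omega$ with the same Dirichlet boundary condition.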
In this paper, we propose a novel two-stage error conceal-
ment scheme for video signals which are compressed in slice
mode with some slices lost during transmission over error-prone channels.¹ In the first stage, we propose a novel MV recovery
algorithm, spatio-temporal boundary matching algorithm
(STBMA), to recover the lost MV for each macroblock (MB)
in the lost slices. It works by minimizing a distortion function
which exploits both spatial and temporal smoothness properties
of the video signals. With the recovered MV, we could find the
reference MB in the reference frame for each lost MB. Inspired
by the work in [16], in the second stage, instead of replacing
the lost MB with the corresponding reference MB as most
previous error concealment schemes have done, we propose a
novel PDE-based algorithm to refine the reconstruction. The
proposed PDE-based algorithm could effectively reduce the
blocking artifacts, and meanwhile well preserve the structure
of the reference MB. It works by minimizing, in a weighted
manner, the difference between the gradient field of the reconstructed MB in the current frame and that of the reference MB in the reference frame under a given boundary condition. The
weighting factor produces an anisotropic regulation scheme
which determines the level of regulation according to the de-
gree of local blockiness. Both spatial and temporal correlations
are well-exploited in the proposed scheme. The experimental
results show that the proposed two-stage error concealment
scheme is able to achieve not only higher PSNR but also better
visual quality when compared with the error concealment
feature implemented in the H.264 reference software.
The rest of this paper is organized as follows. We describe the proposed algorithm in detail in Section II. Then, we present the
experimental results in Section III to verify the performance of
the proposed scheme. We conclude this paper in Section IV with
a summary of our algorithm.
II. PROPOSED ERROR CONCEALMENT SCHEME
In this section, we describe the proposed spatio-temporal
error concealment scheme in detail. We first introduce the
spatio-temporal boundary matching algorithm (STBMA) for
MV recovery. Then we present the PDE-based algorithm for
lost block reconstruction.
Due to the correlation among adjacent video signals in both the spatial and temporal domains, a reasonable criterion for choosing a good candidate MV is to examine whether the MV can preserve the spatial and temporal continuities of the signals. Motivated by this intuition, we introduce a novel boundary matching distortion function in which both spatial and temporal smoothness properties are well exploited. The MV of each lost MB is recovered by minimizing the distortion function, and the corresponding MB indicated by the recovered MV in the reference frame is used as the reference MB for the lost MB.
Most previous works recover the lost MB by simply copying
the corresponding reference MB from the reference frame.
However, in this case, the boundaries of the reconstructed
MB are usually not compatible with the spatially surrounding
pixels. Therefore, instead of directly copying the reference
MB, we first compute the gradient field of the reference MB
in the reference frame and then refine the reconstruction by
minimizing the difference between the gradient field of the
reconstructed MB and that of the reference MB.
¹This paper is an extension of our previous work [17] and [18].
Like other existing schemes, we also assume that the erroneous MBs have been detected and that an MB-based status map of a frame is available to specify the positions of the "lost" MBs. According to the status map, all correctly received MBs are decoded first and then the lost MBs are concealed using the proposed algorithm. In the following, we only consider scalar image functions, since the concealment problem can be solved in the same way for each color component separately, i.e., the Y, U, and V components of video signals.
A. Motion Vector Recovery and Motion Compensation
1) Motion Compensation Using Correct MVs: In H.264, if
the data partitioned slice is adopted and the partition with im-
portant data is transmitted with higher priority, the motion vec-
tors may be correctly decoded although the motion compensated
residues are lost. In this case, the reference MB can be easily located using the correctly decoded MVs according to the following equation:

$$\hat{f}_t(x, y) = f_{t-1}(x + v_x,\, y + v_y) \qquad (1)$$

where $\hat{f}_t(x, y)$ stands for the reference value for the pixel at location $(x, y)$ in frame $t$, while $f_{t-1}$ is the reference frame, and $(v_x, v_y)$ is the correctly decoded motion vector.
2) Spatio-Temporal Boundary Matching Algorithm (STBMA): If the MVs are lost together with the motion compensated residues, MV recovery algorithms should be employed. In the H.264 reference software, the classic boundary matching algorithm (BMA) is utilized to recover the lost MV from the candidate MV set; the recovered MV minimizes the side match distortion $D_{sm}$ between the internal and external boundaries of the reconstructed MB [3]. Here, as shown in Fig. 1, internal boundaries stand for the boundary pixels of the MB, while external boundaries stand for the surrounding pixels in the corresponding spatially neighboring MBs. $D_{sm}$ is defined as the sum of absolute differences between the internal boundaries of the candidate block in the reference frame and the external boundaries of the lost block in the current frame:

$$D_{sm}(\mathbf{v}) = a_N \sum_{i=0}^{M-1} \big| f_{t-1}(x_0 + i + v_x,\, y_0 + v_y) - f_t(x_0 + i,\, y_0 - 1) \big| + a_S \sum_{i=0}^{M-1} \big| f_{t-1}(x_0 + i + v_x,\, y_0 + M - 1 + v_y) - f_t(x_0 + i,\, y_0 + M) \big| + a_W \sum_{j=0}^{M-1} \big| f_{t-1}(x_0 + v_x,\, y_0 + j + v_y) - f_t(x_0 - 1,\, y_0 + j) \big| + a_E \sum_{j=0}^{M-1} \big| f_{t-1}(x_0 + M - 1 + v_x,\, y_0 + j + v_y) - f_t(x_0 + M,\, y_0 + j) \big| \qquad (2)$$

where $f_t$ stands for the current frame and $f_{t-1}$ is the corresponding reference frame, the subscripts $N$, $S$, $W$, and $E$ are
Fig. 1. Illustration of the boundary matching relationship.
short for North, South, West, and East, respectively, as shown in Fig. 1, $M$ is the size of the MB (e.g., $M = 16$ in H.264), $(x_0, y_0)$ is the location of the top-left pixel of the current lost block, and $\mathbf{v} = (v_x, v_y)$ is the candidate MV, which could be the zero MV or one of the MVs of the neighboring adjacent blocks. $a_N = 1$ if the north neighboring MB in the current frame is available; otherwise $a_N = 0$. The definitions of $a_S$, $a_W$, and $a_E$ are analogous. The winning reconstructed MV is the one which minimizes $D_{sm}$.
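To make the boundary matching concrete, the side match distortion above can be sketched in a few lines of NumPy. This is an illustrative implementation under our own conventions (function name, (row, column) indexing, and availability flags derived from the frame border), not code from the paper:

```python
import numpy as np

def bma_side_match(cur, ref, x0, y0, mv, M=16):
    """Side match distortion for one candidate MV (sketch).

    cur, ref: 2-D luma arrays (current and reference frame);
    (x0, y0): column/row of the top-left pixel of the lost MB;
    mv: candidate motion vector (dx, dy); M: MB size.
    Availability flags are derived from the frame border only.
    """
    dx, dy = mv
    d = 0.0
    # North: internal top row of the candidate block vs. external row above the hole
    if y0 - 1 >= 0:
        d += np.abs(ref[y0 + dy, x0 + dx:x0 + dx + M].astype(float)
                    - cur[y0 - 1, x0:x0 + M]).sum()
    # South: internal bottom row vs. external row below the hole
    if y0 + M < cur.shape[0]:
        d += np.abs(ref[y0 + M - 1 + dy, x0 + dx:x0 + dx + M].astype(float)
                    - cur[y0 + M, x0:x0 + M]).sum()
    # West: internal left column vs. external column to the left
    if x0 - 1 >= 0:
        d += np.abs(ref[y0 + dy:y0 + dy + M, x0 + dx].astype(float)
                    - cur[y0:y0 + M, x0 - 1]).sum()
    # East: internal right column vs. external column to the right
    if x0 + M < cur.shape[1]:
        d += np.abs(ref[y0 + dy:y0 + dy + M, x0 + M - 1 + dx].astype(float)
                    - cur[y0:y0 + M, x0 + M]).sum()
    return d
```

The candidate with the smallest returned value would be selected as the recovered MV.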
From (2), we can see that BMA utilizes the smoothness property between adjacent pixels to recover the lost MV. However, since only the spatial smoothness property is considered, it may not be able to select the best one from the candidate MVs. In this paper, we present a more general side match distortion function $D_{st}$ which considers both the spatial and temporal smoothness properties of the video signals. $D_{st}$ is defined as a weighted average of two terms: the temporal side match distortion $D_t$ and the spatial side match distortion $D_s$:

$$D_{st}(\mathbf{v}) = \alpha\, D_t(\mathbf{v}) + (1 - \alpha)\, D_s(\mathbf{v}) \qquad (3)$$

where the weighting factor $\alpha$ is a real number between 0 and 1; $D_t$ and $D_s$ are defined as follows.
The temporal term $D_t$ is utilized to measure how well the candidate MV can keep temporal continuity. We observe that the neighbors of the current MB are similar to the neighbors of the reference MB in the reference frame. Therefore, we define $D_t$ as the average difference between the external boundaries of the candidate reference block in the reference frame and those of the lost block in the current frame:

$$D_t(\mathbf{v}) = \frac{1}{(a_N + a_S + a_W + a_E) M} \Big[ a_N \sum_{i=0}^{M-1} \big| f_{t-1}(x_0 + i + v_x,\, y_0 - 1 + v_y) - f_t(x_0 + i,\, y_0 - 1) \big| + a_S \sum_{i=0}^{M-1} \big| f_{t-1}(x_0 + i + v_x,\, y_0 + M + v_y) - f_t(x_0 + i,\, y_0 + M) \big| + a_W \sum_{j=0}^{M-1} \big| f_{t-1}(x_0 - 1 + v_x,\, y_0 + j + v_y) - f_t(x_0 - 1,\, y_0 + j) \big| + a_E \sum_{j=0}^{M-1} \big| f_{t-1}(x_0 + M + v_x,\, y_0 + j + v_y) - f_t(x_0 + M,\, y_0 + j) \big| \Big] \qquad (4)$$

With this definition, a good candidate MV should give a small $D_t$.
According to the spatial smoothness property of video signals, the structures in the lost MBs should be compatible with those of the available spatially neighboring MBs. Therefore, recovering the lost MB using a good candidate MV in some sense means introducing few structural mismatches at the boundaries. Here, $D_s$ is utilized to choose such a good MV from the candidate MVs. We define $D_s$ as the average change of the Laplacian estimator along the tangent direction, which measures the continuity of the isophotes at the boundaries, as shown in (5). With such a definition, a good candidate MV should give a small $D_s$. A similar term is utilized to generate the updating information for iterative diffusion in the task of image inpainting [19]. Here, we use it as part of the cost function to select the best MV from the candidate MV set:

$$D_s(\mathbf{v}) = \frac{1}{N_B} \sum_{(x, y) \in B} \left| \frac{\nabla(\triangle f)}{|\nabla(\triangle f)|} \cdot \frac{\nabla^{\perp} f}{|\nabla^{\perp} f|} \right| \cdot |\nabla f| \qquad (5)$$

where $B$ is the set of internal boundary pixels of the lost MB, $N_B$ is its size, and the symbols $M$, $(x_0, y_0)$, and $\mathbf{v}$ have the same meanings as those defined in (2) (see Fig. 1); $\nabla$ is the gradient operator, $\nabla^{\perp}$ is the normal operator whose direction is orthogonal to the gradient direction, and $\triangle$ is the Laplacian operator.

In a typical (and our) implementation, these operators can be calculated with finite differences as follows:

$$\nabla f(x, y) = \left( \frac{f(x+1, y) - f(x-1, y)}{2},\ \frac{f(x, y+1) - f(x, y-1)}{2} \right), \quad \nabla^{\perp} f = \left( -\frac{\partial f}{\partial y},\ \frac{\partial f}{\partial x} \right), \quad \triangle f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y) \qquad (6)$$
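The boundary term of $D_s$ can be sketched as follows. This is an illustrative NumPy implementation under our own naming, not the authors' code; `np.gradient` supplies the central differences and a small `eps` guards the normalizations:

```python
import numpy as np

def grad(f):
    """Central-difference gradient; returns (d/dx, d/dy) with x = column axis."""
    gy, gx = np.gradient(f.astype(float))
    return gx, gy

def laplacian(f):
    """4-neighbor discrete Laplacian (zero on the outermost ring)."""
    f = f.astype(float)
    lap = np.zeros_like(f)
    lap[1:-1, 1:-1] = (f[2:, 1:-1] + f[:-2, 1:-1]
                       + f[1:-1, 2:] + f[1:-1, :-2]
                       - 4.0 * f[1:-1, 1:-1])
    return lap

def spatial_distortion(f, boundary, eps=1e-8):
    """Average of |normalized grad-of-Laplacian . normalized tangent| * |grad f|
    over the given (row, col) boundary pixels (a sketch of the D_s term)."""
    gx, gy = grad(f)
    lx, ly = grad(laplacian(f))
    total = 0.0
    for (i, j) in boundary:
        gmag = np.hypot(gx[i, j], gy[i, j])
        lmag = np.hypot(lx[i, j], ly[i, j])
        # tangent (isophote) direction is perpendicular to the gradient
        tx, ty = -gy[i, j], gx[i, j]
        inner = (lx[i, j] * tx + ly[i, j] * ty) / ((lmag + eps) * (gmag + eps))
        total += abs(inner) * gmag
    return total / max(len(boundary), 1)
```

Here `f` is the current frame with the candidate reference block already pasted into the hole, as the text below describes.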
Since the current block is totally lost, when computing $D_s$ we first use the candidate reference block to replace the current lost block. In (5), $\nabla(\triangle f)/|\nabla(\triangle f)|$ stands for the normalized gradient of the Laplacian estimator, and $\nabla^{\perp} f/|\nabla^{\perp} f|$ is the normalized vector along the tangent direction. If the structures across the boundaries are perfectly matched, the two terms should be orthogonal to each other and the inner product should be zero. However, if there are some mismatches, the absolute value of the inner product of the two terms tends to be large, which would make $D_s$ large. Besides, we multiply the inner product by the gradient magnitude $|\nabla f|$ for every pixel in (5). There are two reasons for doing this. Firstly, through multiplying by $|\nabla f|$, the range of $D_s$ (notice that the absolute value of the inner product of two normalized vectors is at most 1) would tend to be compatible with that of $D_t$. Secondly, for the pixels at the internal boundaries shown in Fig. 1, $|\nabla f|$ stands for the brightness change across the boundaries, which reflects the blockiness degree to some extent. According to our observation, a severe blockiness degree tends to have a large $|\nabla f|$ while a slight blockiness degree usually has a small $|\nabla f|$. Therefore, if $|\nabla f|$ is small, even if the absolute value of the inner product is large, it is still possible that the reference block is a good candidate. On the contrary, if $|\nabla f|$ is large, even if the absolute value of the inner product is small, there is still a chance that the reference block is a bad candidate. So, it is reasonable and necessary to consider the term $|\nabla f|$.
In Figs. 2 and 3, we show two examples to demonstrate the characteristics of $D_s$: one is a synthetic image (Fig. 2) and the other is a sub-image cut from the "foreman" sequence (Fig. 3). Due to space limitation, it is difficult to illustrate the whole MB, and we are only interested in $D_s$ on the boundary of the MB. In Figs. 2 and 3, we use a small part of the lost MB and its neighboring MB to explain the effect of $D_s$. We assume that the upper 4 × 8 pixels are from the correctly received MB and the bottom 4 × 8 pixels are from the lost MB. For each example, there are three candidate reconstructions, as shown in (a-1), (b-1), and (c-1). Obviously, (a-1) is the best choice, with perfect structure matching, while (b-1) and (c-1) have different extents of mismatching at the boundary. We will now show that $D_s$ can automatically select (a-1) as the best
Fig. 2. Example of $D_s$. (Synthetic image with the upper 4 × 8 pixels from the correctly received MB and the bottom 4 × 8 pixels from the lost MB; the thick black solid line is the boundary): (a-1)(b-1)(c-1) three candidate reconstructions with perfect structure matching, close structure mismatching, and far structure mismatching, respectively; (a-2)(b-2)(c-2) the vector field $(\nabla^{\perp} f / |\nabla^{\perp} f|) \cdot |\nabla f|$ of the three candidates; (a-3)(b-3)(c-3) the vector field $\nabla(\triangle f) / |\nabla(\triangle f)|$ of the three candidates; (a-4)(b-4)(c-4) the vector fields of the second and third rows displayed together ($D_s$ of (a-1)(b-1)(c-1) are 0, 28.466, and 20.436, respectively).
candidate for both examples. In Figs. 2 and 3, (a-2)(b-2)(c-2) illustrate the vector fields $(\nabla^{\perp} f/|\nabla^{\perp} f|) \cdot |\nabla f|$ of the corresponding candidates, and (a-3)(b-3)(c-3) represent the vector fields $\nabla(\triangle f)/|\nabla(\triangle f)|$. We put the two vector fields together in (a-4)(b-4)(c-4) to see their inner product. As shown in
Fig. 3. Example of $D_s$. (True image with the upper 4 × 8 pixels from the correctly received MB and the bottom 4 × 8 pixels from the lost MB.): (a-1)(b-1)(c-1) three candidate reconstructions with perfect structure matching, close structure mismatching, and far structure mismatching, respectively; (a-2)(b-2)(c-2) the vector field $(\nabla^{\perp} f / |\nabla^{\perp} f|) \cdot |\nabla f|$ of the three candidates; (a-3)(b-3)(c-3) the vector field $\nabla(\triangle f) / |\nabla(\triangle f)|$ of the three candidates; (a-4)(b-4)(c-4) displaying the vector fields in the second and third rows together ($D_s$ of (a-1)(b-1)(c-1) are 5.5166, 12.51, and 20.283, respectively).
(5), we only care about the inner product at the pixels on the internal boundary of the lost MB. Therefore, we only focus on the vectors in the black rectangles in (a-4), (b-4), and (c-4). For the first candidate of the synthetic image, as shown in Fig. 2(a-4), since $\nabla(\triangle f)/|\nabla(\triangle f)|$ is either orthogonal to $(\nabla^{\perp} f/|\nabla^{\perp} f|) \cdot |\nabla f|$ or equals zero, the inner product is zero, which leads to a zero $D_s$. However, for the second and third candidates of the synthetic image, as shown in Figs. 2(b-4) and (c-4), $\nabla(\triangle f)/|\nabla(\triangle f)|$ is not orthogonal to $(\nabla^{\perp} f/|\nabla^{\perp} f|) \cdot |\nabla f|$ at some points on the boundary, which results in a nonzero inner product; their $D_s$ are 28.466 and 20.436, respectively. As we mentioned above, a better candidate should produce a smaller $D_s$. Therefore, for the synthetic image, the first candidate would be selected, as expected. Similarly, for the three candidates of the true image shown in Figs. 3(a-1), (b-1), and (c-1), we can see that the inner product of the first candidate is smaller than those of the other two candidates ($D_s$ for these three candidates are 5.5166, 12.51, and 20.283, respectively). Therefore, Fig. 3(a-1) would be selected as the best choice for the true image.
The winning MV is the candidate MV, which could be the zero MV or an MV of the neighboring adjacent blocks, that minimizes $D_{st}$. The desired reference MB in the reference frame is obtained using this MV.
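The candidate search that combines the two distortions as a weighted average can be sketched in a few lines; `recover_mv`, its distortion callables, and the default `alpha` are our illustrative choices, not names from the paper:

```python
def recover_mv(candidates, d_t, d_s, alpha=0.5):
    """Pick the candidate MV minimizing alpha * D_t + (1 - alpha) * D_s (sketch).

    candidates: iterable of MV tuples (zero MV plus neighboring-block MVs);
    d_t, d_s: callables mapping an MV to its temporal / spatial distortion.
    """
    return min(candidates, key=lambda mv: alpha * d_t(mv) + (1 - alpha) * d_s(mv))
```

In practice `d_t` and `d_s` would evaluate the boundary distortions of the candidate reference block for each MV in the set.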
B. Refining Reconstruction Using Partial Differential
Equation (PDE)
After finding the reference MB in the reference frame, a
straightforward method to reconstruct the lost MB is to directly
copy the pixels from the corresponding reference MB. How-
ever, the reference MB produced by the winning MV is optimal
only in that its cost, which is a measure of the smoothness,
is smaller than those produced by other candidate MVs. It
does not inherently ensure the perfect matching between the
recovered MB and the surrounding boundaries. Therefore,
visible blocking artifacts may still exist in the restored images.
The discontinuity comes partly from the absence of the motion
compensated residue, also called displaced frame difference
(DFD). In [2], Lam et al. proposed to use the DFD of the adjacent blocks as a substitute for the missing DFD. However, as the correlation of the DFD among adjacent blocks is not as high as that of the MVs, this method is not very effective.
Considering the difficulty in recovering the DFD, an alter-
native way is to directly improve the match of the copied refer-
ence MB and the surrounding pixels. In order to achieve this objective, we abandon the traditional way of using the corresponding
pixel values of the reference MB as the final reconstructed pixel
values for concealment. Instead, we formulate the problem of
recovering the lost MB as an optimization problem.
Before starting the problem formulation, we first define some notations. As illustrated in Fig. 4, let $S$, a closed subset of $\mathbb{R}^2$, be the definition domain of the current frame. Let $\Omega$ be a closed subset of $S$, which represents the lost MB, and let $\partial\Omega$ be the external boundary of $\Omega$, consisting of the correctly received surrounding pixels of the lost MB. Let $f$ be an unknown scalar function defined over $\Omega$ and $\partial\Omega$. Let $f^*$ be a known scalar function defined over $S$ minus $\Omega$; it is the set of correctly decoded pixel values. With this definition, we assume that there is only one lost MB in the current frame. This assumption can be relaxed, since only a subset of $f^*$, which is defined over $\partial\Omega$, will be used in later computation. Another assumption we make here is that the surrounding external boundaries of the lost MB are known, i.e., $f^*$ is available on $\partial\Omega$. This assumption is reasonable considering that the coded MBs could be packetized in an interleaved manner. Even if this condition is not met, i.e., one or more adjacent MBs of the current damaged MB have been lost, the proposed PDE-based algorithm can still be applied successfully according to our discussion in Section II-D. Let $g$ denote the reference MB in the reference frame, found using the winning MV obtained in Section II-A, and let $\nabla g$ be its gradient vector field.
Fig. 4. Illustration of notations.
If the reference frame is correctly received, the boundaries of the reference MB will be compatible with its surrounding pixels in the reference frame, i.e., the pixel values change in a natural manner across the block boundaries. Even if the reference frame is erroneously received, due to low-pass filtering (deblocking) and post-processing (STBMA+PDE for the reference frame), it is reasonable for us to assume that the boundaries of the reference MB are more compatible with its surrounding pixels in the reference frame than in the current frame. Therefore, we would like to push $\nabla f$ towards $\nabla g$ where the blocking artifacts are severe in the reconstructed MB. So, the problem of recovering the lost MB can be formulated as finding an optimal solution $f$ which minimizes the following objective function:

$$\min_{f} \iint_{\Omega} c(x, y)\, |\nabla f - \nabla g|^2 \, dx\, dy \quad \text{with } f|_{\partial\Omega} = f^*|_{\partial\Omega} \qquad (7)$$
According to (7), the recovered $f$ should be the function whose gradient, in the $L_2$-norm and in a weighted manner, is closest to the gradient vector field $\nabla g$ under the given boundary condition. If the coefficient $c(x, y)$ is set to be constant (e.g., $c(x, y) \equiv 1$), the scheme reduces to the isotropic guided interpolation scheme proposed in [16], which minimizes $\iint_{\Omega} |\nabla f - \nabla g|^2\, dx\, dy$. But this isotropic method might cause problems (e.g., the bleeding artifact) while reconstructing the lost MB, as shown in our experiment (the red ellipse region in Fig. 12(c)). Through introducing this spatially varying coefficient, we can better control the interpolation process according to the degree of local blockiness. Equation (7) is also a generalization of the anisotropic diffusion method [11], [14], considering that the idea of the anisotropic diffusion method is a special case in which the guidance field $\nabla g$ is set to zero, i.e., it minimizes $\iint_{\Omega} c(x, y)\, |\nabla f|^2\, dx\, dy$. The zero vector
field would produce a blurred interpolation. On the contrary, a well-defined nonzero vector field can better preserve the structural information while alleviating the annoying blocking effects. With (7), instead of copying the pixel values of $g$, we only try to preserve the content structure of $g$, which is depicted by the gradient. Besides, we utilize the continuity at the boundary of $g$ in the reference frame to improve the consistency at the boundary of the reconstructed block.
The solution that minimizes (7) must satisfy the Euler-Lagrange equation², according to which we have

$$\nabla \cdot \left[ c(x, y) \left( \nabla f - \nabla g \right) \right] = 0 \ \text{ over } \Omega, \quad \text{with } f|_{\partial\Omega} = f^*|_{\partial\Omega} \qquad (8)$$

The gradient descent method can be used to solve (8). The solution is the steady state solution of the following equation:

$$f^{n+1} = f^{n} + \lambda\, \nabla \cdot \left[ c(x, y) \left( \nabla f^{n} - \nabla g \right) \right] \qquad (9)$$

where $n$ is the iteration index and $\lambda$ is the step size.
The spatially varying coefficient $c(x, y)$ plays an important role in the interpolation process. When $c(x, y)$ is large, it tries to push $\nabla f$ towards the vector field $\nabla g$, while a small $c(x, y)$ allows $\nabla f$ to deviate from $\nabla g$. According to the previous analysis, we would like to push $\nabla f$ towards $\nabla g$ at the locations where the blocking artifacts are severe. Therefore, we try to give $c(x, y)$ a large value where the degree of the blocking artifacts is obvious and make $c(x, y)$ small where there is little blockiness.
According to our observation, the absolute difference between $f$ and $g$, i.e., $|f - g|$, can somewhat reflect the degree of local blockiness. If $|f - g|$ is small, there tends to be little blockiness at the boundaries. In this case, we should set $c(x, y)$ to a small value such that only a small amount of regulation is performed. When $|f - g|$ becomes larger, we find that there might be some kind of discontinuity. Therefore, we should set $c(x, y)$ to a larger value to perform more regulation to reduce the blocking artifacts. However, when $|f - g|$ is even larger, we find that rather than from blocking artifacts, the discontinuities are more likely to come from the inherent changes across the edges of the image, in which case we should make $c(x, y)$ small to prevent regulation and preserve the original reconstructed value (e.g., the red ellipse region in Fig. 12(c)).
Following the above descriptions, the spatially varying coefficient can be chosen as $c(x, y) = h(s)$ with $s = |f(x, y) - g(x, y)|$, where the function $h(s)$ should satisfy the following characteristics: $h(s)$ should be kept small when $s$ is small; then $h$ rises with the increase of $s$; however, when $s$ is larger than a threshold $T$, $h$ should begin to decrease, and the larger the distance from $s$ to $T$, the smaller $h$ should be; when $s$ greatly exceeds $T$, $h$ should be set to be extremely small.
There are many possible choices for $h(s)$; in this paper, $h(s)$ is chosen as follows:
(10)
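As a hedged illustration only (the paper's own choice of $h$ is the one given in (10) above), one function with the stated rise-then-decay shape is $h(s) = (s/k)\,e^{-(s/k)^2}$; both this particular form and the knee parameter `k` are our assumptions, not the authors':

```python
import math

def h(s, k=10.0):
    """Illustrative weighting function: rises from 0, peaks near s = k / sqrt(2),
    then decays toward 0 for large s. Assumed form, not the paper's Eq. (10)."""
    r = s / k
    return r * math.exp(-r * r)
```

Any function with this qualitative shape (small for small and very large mismatches, large for moderate ones) would realize the behavior described in the text.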
²If $J$ is defined by $J = \iint F(x, y, f, f_x, f_y)\, dx\, dy$, then $J$ has a stationary value if the Euler-Lagrange differential equation $\frac{\partial F}{\partial f} - \frac{\partial}{\partial x}\frac{\partial F}{\partial f_x} - \frac{\partial}{\partial y}\frac{\partial F}{\partial f_y} = 0$ is satisfied. In our problem, with the cost function in (7), $F = c(x, y)\,|\nabla f - \nabla g|^2$. Therefore, we have $-\frac{\partial}{\partial x}\left[ c(x, y) \left( \frac{\partial f}{\partial x} - \frac{\partial g}{\partial x} \right) \right] - \frac{\partial}{\partial y}\left[ c(x, y) \left( \frac{\partial f}{\partial y} - \frac{\partial g}{\partial y} \right) \right] = 0$, which is equivalent to $\nabla \cdot [c(x, y)(\nabla f - \nabla g)] = 0$.
Fig. 5. Weighting factor function $h(s)$.
The characteristic of the chosen function is illustrated in Fig. 5. We can see that it is consistent with what we have stated. Given the vector field $\nabla g$, the reconstructed function $f$ interpolates the specified boundary condition inwards, while following, in a weighted manner, the spatial variation of $g$ as closely as possible. In the following two subsections, we will
closely as possible. In the following two subsections, we will
first introduce the numerical implementation of (9). Then we
will discuss how to handle the special case that some of the
boundaries of the lost MB are not available.
C. Numerical Scheme
We implement the discrete version of (9) using a simple scheme, the form of which is quite similar to the algorithm described by Perona and Malik [10]. It is discretized on the discrete pixel grid of the digital image:

$$f_{i,j}^{n+1} = f_{i,j}^{n} + \lambda \left[ c_N \cdot (\nabla_N f - \nabla_N g) + c_S \cdot (\nabla_S f - \nabla_S g) + c_W \cdot (\nabla_W f - \nabla_W g) + c_E \cdot (\nabla_E f - \nabla_E g) \right]_{i,j}^{n} \qquad (11)$$

where $0 \le \lambda \le 1/4$ in order to ensure the stability of the numerical scheme, as pointed out in [10], the subscripts $N$, $S$, $W$, and $E$ are short for North, South, West, and East, respectively, and $n$ is the iteration index. The superscript and subscripts on the square bracket are applied to all the terms enclosed in it. The symbols $\nabla_N f$, $\nabla_S f$, $\nabla_W f$, and $\nabla_E f$, which indicate the nearest-neighbor differences, are defined as follows:

$$\nabla_N f_{i,j} = f_{i-1,j} - f_{i,j}, \quad \nabla_S f_{i,j} = f_{i+1,j} - f_{i,j}, \quad \nabla_W f_{i,j} = f_{i,j-1} - f_{i,j}, \quad \nabla_E f_{i,j} = f_{i,j+1} - f_{i,j} \qquad (12)$$

The corresponding terms associated with $g$ are defined in the same way.
The initial condition of (11) is f⁰ = g, the reference MB obtained
in the first stage. The nearest-neighbor differences at the
boundaries of g are the pixel value variations between the internal
boundaries and external boundaries of g in the reference frame,
while the nearest-neighbor differences at the boundaries of f are
the pixel value changes between the internal boundaries of f,
which change with iteration, and the surrounding correctly decoded
pixel values. Therefore, at the initial iteration, the only difference
between f and g is the values at the boundaries of the MBs, which
triggers the iteration.
TABLE I
AVERAGE PSNR PERFORMANCE OF THE RECONSTRUCTED SEQUENCES USING DIFFERENT METHODS
Since the numerical scheme is implemented in an iterative
form, the weighting factors should be updated at every iteration
as follows:
(13)
The stopping criteria of the iteration process of (11) are defined
as
(14)
where the two bounds are pre-defined thresholds.
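Because (10)–(14) are not reproduced in closed form here, the following NumPy sketch only illustrates the kind of iteration the text describes: a Perona–Malik-style update that diffuses the difference between f and the guidance field g, with the weights recomputed from |∇g − ∇f| at every iteration (as in (13)), the known boundary pixels clamped, and a simple change-based stopping test in the spirit of (14). The weighting function, step size, and thresholds are assumptions.

```python
import numpy as np

def h(s, sigma=4.0):
    # Hypothetical weighting function; the actual form in (10) may differ.
    return s * np.exp(-(s / sigma) ** 2)

def nn_diffs(a):
    # Nearest-neighbor differences toward North, South, West, and East,
    # computed with replicated borders so the array shape is preserved.
    p = np.pad(a, 1, mode="edge")
    return (p[:-2, 1:-1] - a,   # North
            p[2:, 1:-1] - a,    # South
            p[1:-1, :-2] - a,   # West
            p[1:-1, 2:] - a)    # East

def refine(g, boundary, lam=0.1, max_iter=100, eps=1e-4):
    """Guided-diffusion sketch: start from the reference MB g, clamp the
    known boundary pixels, and iterate a weighted update toward the
    stationary condition div[c (grad f - grad g)] = 0."""
    g = g.astype(float)
    f = g.copy()
    known = ~np.isnan(boundary)          # where decoded pixels are available
    f[known] = boundary[known]
    dg = nn_diffs(g)
    for _ in range(max_iter):
        df = nn_diffs(f)
        # Weights recomputed from |grad g - grad f| at every iteration.
        update = sum(h(np.abs(dgd - dfd)) * (dfd - dgd)
                     for dgd, dfd in zip(dg, df))
        f_new = f + lam * update
        f_new[known] = boundary[known]   # keep the boundary condition fixed
        if np.max(np.abs(f_new - f)) < eps:  # change-based stopping test
            return f_new
        f = f_new
    return f
```

When the clamped boundary already agrees with g, |∇g − ∇f| stays zero everywhere and the MB is returned unchanged; a mismatched boundary triggers the diffusion into the interior.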
D. Special Case for the PDE-Based Refinement With
Erroneous Boundaries
In the above two subsections, we discussed how to use the
PDE-based algorithm to refine the recovered MB generated by
STBMA based on the assumption that the boundary informa-
tion of the lost MB in all four directions is known. When this
assumption is not true, for example, if some boundaries of the
lost MB are not available, we copy the corresponding bound-
aries in the reference frame using the winning MVs generated
by STBMA. If the boundaries in the reference frame are also
unavailable, the gradients of g and f at these boundaries are all
set to zero. This is reasonable because if some boundaries of the
lost MB are not available, we do not want to utilize the boundary
information in those directions. By copying the corresponding
boundaries in the reference frame, f would be the same as g
at those boundaries. In such a case, since ∇f = ∇g, the
weighting factor at those positions would be zero and the diffu-
sion process would not be applied in those directions.
III. EXPERIMENTAL RESULTS
Although the proposed method is general and can be applied
to any block-based video compression scheme, H.264 is uti-
lized here to evaluate the proposed algorithm. The JM9.0 refer-
ence software is used in the experiment. We compare the perfor-
mance of the proposed algorithm with the inter-frame conceal-
ment feature implemented in the reference software [3], which
is based on the classical BMA [2]. We have tested the algorithms
on six video sequences: Foreman, Table, Tempete, Paris, News,
and Carphone. For the first five sequences, both QCIF and CIF
formats are tested. For the Carphone sequence, we only test the QCIF
format since we do not have the original CIF sequence.
The test sequences in QCIF (CIF) format are encoded at a 10 (15)
Hz frame rate. I frames are encoded every fifteen frames and
no B frames are used. Slice mode is enabled. No intra mode is
used in P frames. The quantization parameter is set to 24.
In the reference software, intra frames are concealed spatially
using weighted pixel averaging [3]. However, this algorithm is
quite ineffective and would make the recovered MBs extremely
blurry. Considering the annoying error propagation problem
in the prediction coding scheme, the badly concealed MBs in I
frames would greatly degrade the following P frames. In order
to better compare the proposed algorithm with BMA, both of
which mainly aim at the inter frame concealment, we assume
that the transmission errors only occur in P frames. In order
to simulate the transmission errors, a number of slices are ran-
domly dropped in P frames according to the error pattern. In all
the following experiments, the parameters are set to 0.5, 0.1,
0.5, 4, 40, and 0.01, respectively, and maximum iteration counts
of 10, 20, 30, 50, 100, and 500 for the diffusion process in the
second stage are tested.
In the first experiment, the first 100 frames of the six
sequences are encoded using slice mode. We assume that one
slice contains one row of MBs. Packet loss rates of 5% and
10% are tested. For each packet loss rate, we simulate 20 dif-
ferent error patterns and evaluate the average performance.
The slices of P frames are first dropped according to the error
pattern. Then, the erroneously received P frames are concealed
using BMA [2], STBMA-only and STBMA together with PDE
(in all the following descriptions and figures, STBMA means
STBMA-only, while STBMA+PDE stands for first obtaining
the reference MB using STBMA and then applying PDE to gen-
erate the final result). For STBMA+PDE, we have tested the
maximum iteration time at 10, 20, 30, 50, 100, and 500.
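The loss-simulation protocol described above can be sketched as follows. This is an illustrative reconstruction, not the authors' actual test harness: the slice count (9 MB rows per QCIF frame times 99 P frames) and the PSNR helper are assumptions made for the sake of a runnable example.

```python
import math
import random

def make_error_pattern(num_slices, loss_rate, seed):
    # Randomly mark slices of the P frames as dropped.
    rng = random.Random(seed)
    return {i for i in range(num_slices) if rng.random() < loss_rate}

def psnr(ref, rec):
    # PSNR for 8-bit samples; ref and rec are flat pixel lists.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    return 10.0 * math.log10(255.0 ** 2 / mse)

# 20 independent error patterns at a 5% slice loss rate
# (9 MB rows per QCIF frame x 99 P frames is an assumed slice count).
patterns = [make_error_pattern(num_slices=9 * 99, loss_rate=0.05, seed=s)
            for s in range(20)]
# The reported numbers are PSNRs averaged over the 20 patterns.
```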
Notice that BMA is the method implemented in the H.264 reference software [3].
TABLE II
AVERAGE SYSTEM TIME USING DIFFERENT METHODS
Fig. 6. “Foreman (CIF)” sequence PSNR performance comparison versus the frame number when the slice loss rate is 5% (each slice contains one row of MBs).
Table I shows the PSNR performance of the reconstructed sequences using different methods. We can
see that STBMA always performs better than BMA, while with
STBMA+PDE, the PSNR can be further improved. Averaged over
all six sequences, STBMA achieves a 0.77 dB PSNR gain at a 5%
slice loss rate and a 0.88 dB gain at a 10% slice loss rate
compared with BMA. For some sequences, such as Tempete (QCIF)
and Paris (CIF), STBMA+PDE obtains PSNR performance similar
to STBMA. However, for other sequences, such as Foreman (CIF),
with STBMA+PDE, the PSNR
performance can be further improved by 0.15 dB. We can
also see that, for STBMA+PDE, a small maximum number of
iterations is enough for the PDE step to achieve its advantages
(e.g., improving the subjective visual quality and PSNR performance)
for most of the sequences. In Table II, we examine the time
consumed in decoding the whole sequence and concealing the errors
using different methods. We can see that the complexity of the
proposed STBMA is almost the same as that of BMA, while with the
PDE step, the complexity is a little higher, but still acceptable,
especially when the maximum number of iterations is set to 10.
The major advantage of the PDE step is to generate better
visual quality. The visual quality of the recovered frames of
the “Carphone (QCIF)” sequence is illustrated in Fig. 9, where
(a) and (b) are the original and damaged frame, respectively.
Fig. 9(c) shows the result obtained by BMA. Fig. 9(d) shows
Fig. 7. “Foreman (CIF)” sequence PSNR performance comparison versus the frame number when the slice loss rate is 5% (“Dispersed” FMO mode is used in the encoder and each slice contains one half of the frame).
Fig. 8. “Foreman (CIF)” sequence PSNR performance comparison versus the frame number when the slice loss rate is 5% (“Dispersed” FMO mode is used in the encoder and each slice contains one half of the frame).
the recovered frame using STBMA, and the result generated
by STBMA+PDE is shown in Fig. 9(e). From Fig. 9(c), (d),
and (e), we can see that, compared with BMA, STBMA can
better reconstruct the lost MBs in the region marked by the red
ellipse. However, STBMA still introduces some blocking artifacts.
The PDE step can greatly reduce the blocking artifacts
and generate the best visual quality, which fully demonstrates
the effectiveness of the proposed scheme.
Fig. 9. Subjective quality comparison for the “Carphone (QCIF)” sequence at 5% slice loss rate (each slice contains one row of MBs). (a) Original frame; (b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (24.84 dB); (d) concealed using STBMA (24.85 dB); and (e) concealed using STBMA+PDE (24.88 dB).
Fig. 10. Subjective quality comparison for the “Foreman (CIF)” sequence at 5% slice loss rate (each slice contains one row of MBs). (a) Original frame; (b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (33.08 dB); (d) concealed using STBMA (34.98 dB); and (e) concealed using STBMA+PDE (35.10 dB).
Fig. 6 shows the PSNR performance of the “Foreman (CIF)”
sequence versus the frame number. We can see that
STBMA performs significantly better than BMA, while
with STBMA+PDE, the PSNR performance can be further
improved. In Fig. 10, we examine the visual quality of the
results obtained using different methods. Fig. 10(a) and (b)
show the original and damaged frames, and (c)–(e) show the
results generated by BMA, STBMA, and STBMA+PDE,
respectively. We can see that STBMA+PDE obtains the
best visual quality with the fewest blocking artifacts, especially in
the red ellipse regions. We should notice that the artifacts are
introduced not only by the concealment error of the current
frame, but also by the propagation error of previous frames
due to motion compensation.
Fig. 11. Subjective quality comparison for the “Foreman (CIF)” sequence at 5% slice loss rate (“Dispersed” FMO mode is used in the encoder and each slice contains one half of the frame). (a) Original frame; (b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (32.43 dB); (d) concealed using STBMA (32.85 dB); and (e) concealed using STBMA+PDE (33.76 dB).
Fig. 12. Subjective quality comparison between the isotropic and anisotropic versions of the proposed algorithm: (a) damaged frame with randomly dropped MBs; (b) concealed using the method proposed in [10]; (c) concealed using the isotropic version of the proposed algorithm (c(x, y) = 1); and (d) concealed using the anisotropic version of the proposed algorithm (c(x, y) = h(|∇g − ∇f|)).
We also examine the algorithms when flexible macroblock
order (FMO) is adopted. In this paper, we use the “Dispersed” FMO
mode, in which even and odd MBs are encoded in different
slices. The “Foreman (CIF)” sequence is used in this experiment. The
slice loss rate is assumed to be 5%. The PSNR performances
are shown in Fig. 7. We can see that the proposed algorithm can
obtain a significant PSNR gain compared with BMA. The visual
quality is exemplified in Fig. 11. It is obvious that both BMA and
STBMA introduce severe blocking artifacts. However, the PDE
step can dramatically reduce the blocking artifacts and achieve
pleasant visual quality.
In the third experiment, we assume that the data partitioning
technique is adopted and motion vectors are correctly received at
the decoder. Therefore, it is possible for us to use the correctly
received MVs to recover the lost MBs. Here, we use the MVs
of the top-left blocks as the MVs of the whole MBs. We im-
plement two methods: one is motion compensation using the
correct MVs and taking the pixel values of the reference MB
as the recovered values for the lost data; the other is the pro-
posed method, in which we first use the correct MVs to gen-
erate the reference MB, then perform the PDE-based algorithm.
As shown in Fig. 8, our proposed algorithm can achieve better
PSNR performance.
In the fourth experiment, we evaluate the effect of the
weighting factor adopted in the proposed algorithm. As
mentioned before, if we set all the weights to 1, i.e., c(x, y) = 1,
the proposed method becomes the isotropic version and is
similar to the method proposed in [16]. Fig. 12(c) and (d) illustrate
the results generated by the isotropic and anisotropic versions of
the proposed algorithm, respectively. We can see that the isotropic
version causes “bleeding” artifacts at the boundary of the white
hat, as shown in the red ellipse region in Fig. 12(c). This is
partly because the corresponding gradient in the guidance vector
field ∇g is small there while the actual gradient should be
slightly larger; in this case, the difference between ∇g and ∇f
is larger than the threshold. In our anisotropic version, the
weighting factor is kept small, so the initial gradient of f is
preserved. However, in the isotropic version, the weight is as large
as at other positions, and the pixel values located at the edge
of the white hat bleed into the reconstructed MB.
Fig. 13. Subjective quality comparison between BMA and STBMA with the full search algorithm. (a) Damaged frame with randomly dropped slices; (b) concealed using BMA with full search; and (c) concealed using the proposed STBMA with full search.
We also evaluate the advantage of using the guided gradient
field ∇g. We implement the algorithm proposed in [10], which
is similar to our proposed algorithm but without the guidance
field (i.e., with ∇g = 0). As shown in Fig. 12(b), without the
guided field, the reconstructed frame becomes a cartoon-like
image. Notice again that the artifacts not only come from the
restoration of the lost MBs in the current frame, but also from
the propagation error of previous frames due to motion
compensation.
In the last experiment, we examine the capability of
the proposed STBMA to reconstruct, with high continuity, edges
or lines crossing a lost slice and a correctly received slice. Due to
complexity considerations, in all the experiments above, the candidate
MVs considered in STBMA include only the zero MV and the
MVs of the neighboring MBs. In this experiment, we try
a full search algorithm for STBMA with a search range of 64 × 64.
The results are shown in Fig. 13. We can see that with the full
search algorithm, the proposed STBMA can generate
results comparable to Lie and Gao's method [6, Fig. 5(e)].
IV. CONCLUSION
In this paper, we have developed a novel two-stage error con-
cealment scheme for compressed video sequences which are
corrupted during transmission. In the first stage, we propose a
novel spatio-temporal boundary matching algorithm (STBMA)
to recover the MVs for the lost MBs. By using the recovered
MVs, we find the reference MB in the reference frame for
each lost MB. Then, in the second stage, we use a PDE-based
algorithm to refine the reconstruction of the lost pixels. It works
by minimizing the weighted difference between the gradient
of the lost MB and that of the reference MB obtained in the
first stage under given boundary condition. A well-chosen
weighting factor is used to control the regulation level ac-
cording to the local blockiness degree. The simulation results
fully demonstrate the superiority of the proposed algorithm over
the inter-frame concealment feature implemented in the H.264
reference software which is based on the traditional BMA. The
proposed scheme can effectively reduce blocking artifacts
and well preserve the inner structure of the recovered MBs. It
can also prevent the undesirable bleeding effect introduced by
the isotropic scheme.
REFERENCES
[1] P. Haskell and D. Messerschmitt, “Resynchronization of motion compensated video affected by ATM cell loss,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP), San Francisco, CA, 1992, vol. 3, pp. 545–548.
[2] W. M. Lam, A. R. Reibman, and B. Liu, “Recovery of lost or erroneously received motion vectors,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process., 1993, vol. 3, pp. 417–420.
[3] Y. K. Wang, M. M. Hannuksela, V. Varsa, A. Hourunranta, and M. Gabbouj, “The error concealment feature in the H.26L test model,” in Proc. IEEE Int. Conf. Image Process., 2002, pp. 729–732.
[4] J. H. Zheng and L. P. Chau, “A temporal error concealment algorithm for H.264 using Lagrange interpolation,” in Proc. IEEE Int. Symp. Circuits Syst., 2004, pp. 133–136.
[5] Z. W. Gao and W. N. Lie, “Video error concealment by using Kalman-filtering technique,” in Proc. IEEE Int. Symp. Circuits Syst., 2004, pp. 69–72.
[6] W. N. Lie and Z. W. Gao, “Video error concealment by integrating greedy suboptimization and Kalman filtering techniques,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 982–992, 2006.
[7] Y. Chen, X. Sun, F. Wu, Z. Liu, and S. Li, “Spatio-temporal video error concealment using priority-ranked region-matching,” in Proc. IEEE Int. Conf. Image Process., 2005, pp. 1050–1053.
[8] L. Atzori, F. G. B. De Natale, and C. Perra, “A spatio-temporal concealment technique using boundary matching algorithm and mesh-based warping (BMA-MBW),” IEEE Trans. Multimedia, vol. 3, no. 3, pp. 326–338, Sep. 2001.
[9] M. Y. Shen and C. C. J. Kuo, “Review of postprocessing techniques for compression artifact removal,” J. Visual Commun. Image Represent., pp. 2–14, 1998.
[10] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Trans. Pattern Anal. Machine Intell., vol. 12, no. 7, pp. 629–639, Jul. 1990.
[11] S. Yang and Y. H. Hu, “Coding artifacts removal using biased anisotropic diffusion,” in Proc. IEEE Int. Conf. Image Process., 1997, pp. 346–349.
[12] Y. Wang, Q. F. Zhu, and L. Shaw, “Maximally smooth image recovery in transform coding,” IEEE Trans. Commun., vol. 41, pp. 1544–1551, 1993.
[13] S. Shirani, F. Kossentini, and R. Ward, “Error concealment methods, a comparative study,” in Proc. 1999 IEEE Canadian Conf. Electr. Comput. Eng., Edmonton, AB, Canada, 1999, vol. 2, pp. 835–840.
[14] A. Gothandaraman, R. T. Whitaker, and J. Gregor, “Total variation for the removal of blocking effects in DCT based encoding,” in Proc. IEEE Int. Conf. Image Process., 2001, pp. 455–458.
[15] F. Alter, S. Durand, and J. Froment, “DCT-based compressed images with weighted total variation,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process., 2004, pp. 221–224.
[16] P. Perez, M. Gangnet, and A. Blake, “Poisson image editing,” in Proc. ACM SIGGRAPH, 2003, pp. 313–318.
[17] Y. Chen, O. Au, C.-W. Ho, and J. Zhou, “Spatio-temporal boundary matching algorithm for temporal error concealment,” in Proc. IEEE Int. Symp. Circuits Syst., 2006, pp. 686–689.
[18] Y. Hu, Y. Chen, H. Li, and C. W. Chen, “An improved spatio-temporal video error concealment algorithm using partial differential equation,” Proc. SPIE Multimedia Syst. Applicat. VIII, pp. 150–160, 2005.
[19] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image inpainting,” in Proc. ACM SIGGRAPH, 2000, pp. 417–425.
Yan Chen (S’06) received the Bachelor’s degree
from the University of Science and Technology of
China (USTC) in 2004 and the M.Phil. degree from
the Hong Kong University of Science and Tech-
nology (HKUST) in 2007. He is currently pursuing
the Ph.D. degree in the Department of Electrical
and Computer Engineering, University of Maryland,
College Park.
He was an intern in the Internet Media Group,
Microsoft Research Asia (MSRA), from July to
October 2004. His current research interests are in
image/video coding and processing, wireless communication and networking,
and computer vision.
Yang Hu received the Bachelor’s degree from the
University of Science and Technology of China in
2004. She is currently pursuing the Ph.D. degree in
the Electronic Engineering and Information Science
Department, University of Science and Technology
of China. Since August 2005, she has been a visiting
student with Microsoft Research Asia.
Her current research interests are in multimedia
signal processing, multimedia information retrieval,
machine learning, and pattern recognition.
Oscar C. Au (S’87–M’90–SM’01) received the
B.A.Sc. degree from the University of Toronto,
Toronto, ON, Canada, in 1986, and the M.A. and
Ph.D. degrees from Princeton University, Princeton,
NJ, in 1988 and 1991, respectively.
After being a Postdoctoral Researcher at Princeton
University for one year, he joined the Department of
Electrical and Electronic Engineering, Hong Kong
University of Science and Technology (HKUST), in
1992. He is now an Associate Professor, Director of
Multimedia Technology Research Center (MTrec),
and Director of the Computer Engineering (CPEG) Program at HKUST. His
main research contributions are on video and image coding and processing,
watermarking and steganography, speech and audio processing. Research
topics include fast motion estimation for MPEG-1/2/4, H.261/3/4 and AVS,
optimal and fast suboptimal rate control, mode decision, transcoding, de-
noising, deinterlacing, post-processing, JPEG/JPEG2000 and halftone image
data hiding, etc. He has published over 130 technical journal and conference
papers. His fast motion estimation algorithms were accepted into the ISO/IEC
14496-7 MPEG-4 international video coding standard and the China AVS-M
standard. He has three U.S. patents and is applying for 20+ more on his signal
processing techniques. He has performed forensic investigation and has stood
as an expert witness in the Hong Kong courts many times.
Dr. Au has been an Associate Editor of the IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS, PART I, and the IEEE TRANSACTIONS ON CIRCUITS
AND SYSTEMS FOR VIDEO TECHNOLOGY. He is the Chairman of the Technical
Committee on Multimedia Systems and Applications and a member of the
Technical Committee on Video Signal Processing and Communications and the
Technical Committee on DSP of the IEEE Circuits and Systems Society. He
served on the Steering Committee of IEEE TRANSACTIONS ON MULTIMEDIA
and the IEEE International Conference on Multimedia and Expo (ICME). He
also served/will serve on the organizing committee of the IEEE International
Symposium on Circuits and Systems (ISCAS) in 1997, the IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) in 2003,
the ISO/IEC MPEG 71st Meeting in 2004, the International Conference on
Image Processing (ICIP) in 2010, and other conferences.
Houqiang Li received the B.S., M.S., and Ph.D.
degrees in 1992, 1997, and 2000, respectively, all
from the Department of Electronic Engineering and
Information Science (EEIS), University of Science
and Technology of China (USTC), Hefei, China.
From 2000 to 2002, he did postdoctoral research in
Signal Detection Lab, USTC. Since 2002, he has been
on the faculty of the Department of EEIS, USTC,
where he is currently an Associate Professor. His
current research interests include image and video
coding, image processing, and computer vision.
Chang Wen Chen (F’04) received the B.S. degree
in electrical engineering from University of Science
and Technology of China in 1983, the M.S.E.E. de-
gree from the University of Southern California, Los
Angeles, in 1986, and the Ph.D. degree in electrical
engineering from the University of Illinois at Urbana-
Champaign in 1992.
He has been Allen S. Henry Distinguished Pro-
fessor in the Department of Electrical and Computer
Engineering, Florida Institute of Technology, Mel-
bourne, since July 2003. Previously, he was on the
Faculty of Electrical and Computer Engineering at the University of Missouri-
Columbia from 1996 to 2003, and at the University of Rochester, Rochester,
NY, from 1992 to 1996. From September 2000 to October 2002, he served as
the Head of the Interactive Media Group at the David Sarnoff Research Labo-
ratories, Princeton, NJ. He has also consulted with Kodak Research Labs, Mi-
crosoft Research, Mitsubishi Electric Research Labs, NASA Goddard Space
Flight Center, and Air Force Rome Laboratories.
Dr. Chen has been the Editor-in-Chief for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY since January 2006. He was
an Associate Editor for IEEE TRANSACTIONS ON MULTIMEDIA from 2002 to
2005 and for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO
TECHNOLOGY from 1997 to 2005. He was also on the Editorial Board of IEEE
Multimedia Magazine from 2003 to 2006 and was an Editor for the Journal
of Visual Communication and Image Representation from 2000 to 2005. He
has been a Guest Editor for the PROCEEDINGS OF THE IEEE (Special Issue on
Distributed Multimedia Communications), a Guest Editor for IEEE JOURNAL
OF SELECTED AREAS IN COMMUNICATIONS (Special Issue on Error-Resilient
Image and Video Transmission), a Guest Editor for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (Special Issue on Wireless
Video), a Guest Editor for the Journal of Wireless Communication and Mobile
Computing (Special Issue on Multimedia over Mobile IP), a Guest Editor for
Signal Processing: Image Communications (Special Issue on Recent Advances
in Wireless Video), and a Guest Editor for the Journal of Visual Communication
and Image Representation (Special Issue on Visual Communication in the
Ubiquitous Era). He has also served in numerous technical program committees
for numerous IEEE and other international conferences. He was the Chair
of the Technical Program Committee for ICME2006 held in Toronto, ON,
Canada. He was elected an IEEE Fellow for his contributions in digital image
and video processing, analysis, and communications and an SPIE Fellow for
his contributions in electronic imaging and visual communications. He has
received research awards from NSF, NASA, Air Force, Army, DARPA, and the
Whitaker Foundation. He also received the Sigma Xi Excellence in Graduate
Research Mentoring Award from the University of Missouri-Columbia in
2003. Two of his Ph.D. students have received Best Paper Awards in visual
communication and medical imaging, respectively.