IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 10, NO. 1, JANUARY 2008
Video Error Concealment Using Spatio-Temporal
Boundary Matching and Partial Differential Equation
Yan Chen, Student Member, IEEE, Yang Hu, Oscar C. Au, Senior Member, IEEE, Houqiang Li, and
Chang Wen Chen, Fellow, IEEE
Abstract—Error concealment techniques are very important for
video communication since compressed video sequences may be
corrupted or lost when transmitted over error-prone networks. In
this paper, we propose a novel two-stage error concealment scheme
for erroneously received video sequences. In the first stage, we
propose a novel spatio-temporal boundary matching algorithm
(STBMA) to reconstruct the lost motion vectors (MV). A well
defined cost function is introduced which exploits both spatial and
temporal smoothness properties of video signals. By minimizing the
cost function, the MV of each lost macroblock (MB) is recovered
and the corresponding reference MB in the reference frame is
obtained using this MV. In the second stage, instead of directly
copying the reference MB as the final recovered pixel values, we
use a novel partial differential equation (PDE) based algorithm to
refine the reconstruction. We minimize, in a weighted manner, the
difference between the gradient field of the reconstructed MB in
current frame and that of the reference MB in the reference frame
under given boundary condition. A weighting factor is used to
control the regulation level according to the local blockiness degree.
With this algorithm, the annoying blocking artifacts are effectively
reduced while the structures of the reference MB are well preserved.
Compared with the error concealment feature implemented in
the H.264 reference software, our algorithm is able to achieve
significantly higher PSNR as well as better visual quality.
Index Terms—Error concealment, H.264, motion compensation,
partial differential equation.
I. INTRODUCTION
WITH the explosive growth of the Internet and the wire-
less network, video services over these networks are
becoming more and more popular. However, these band-lim-
ited and error-prone channels are unreliable for transmission
of video signals, especially for compressed video transmis-
sion. Although the latest video coding standards such as
Manuscript received September 24, 2006; revised August 8, 2007. The work
of Y. Chen and O. Au was supported in part by the Innovation and Technology
Commission of the Hong Kong Special Administrative Region, China under
Project GHP/033/05. The work of Y. Hu and H. Li was supported by NSFC Gen-
eral Program under Contract 60572067, NSFC General Program under Contract
60672161, and 863 Program under Contract 2006AA01Z317. The associate ed-
itor coordinating the review of this manuscript and approving it for publication
was Dr. Wenjun (Kevin) Zeng.
Y. Chen and O. C. Au are with the Department of Electronic and Com-
puter Engineering, Hong Kong University of Science and Technology, Kowloon,
Hong Kong, China (e-mail: eecyan@ust.hk; eeau@ust.hk).
Y. Hu and H. Li are with the Department of Electronic Engineering and Infor-
mation Science, University of Science and Technology of China, Hefei 230026,
China (e-mail: yanghu@ustc.edu; lihq@ustc.edu).
C. W. Chen is with the Department of Electrical and Computer Engineering,
Florida Institute of Technology, Melbourne, FL 32901 USA (e-mail:
cchen@fit.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2007.911223
H.261/263/264 and MPEG 1/2/4 can achieve good compres-
sion performance, they also make the compressed video signals
extremely vulnerable to transmission errors. One packet loss
or even one bit error can make the whole slice undecodeable,
which would severely degrade the visual quality of the received
video sequences. A wide range of techniques have been devel-
oped to tackle this problem. Compared with other mechanisms
such as forward error correction (FEC) scheme and automatic
retransmission request (ARQ), error concealment has the ad-
vantages of neither consuming extra bandwidth as in FEC nor
introducing retransmission delay as in ARQ.
Many existing error concealment techniques have made use
of the inherent correlation among spatially and/or temporally
adjacent data to alleviate the influence of the decoding errors.
Spatial approaches exploit the correlation between neighboring
pixels in the same frame. They interpolate the lost coefficients
from the spatially adjacent data. Temporal approaches, on the
other hand, restore the missing area by exploiting temporal corre-
lation between neighboring frames. An important issue with this
approach is to recover the motion information of the lost blocks.
As a result, a large amount of research has focused on recovery
of motion vectors (MV). In [1], Haskell and Messerschmitt pre-
sented some simple methods for lost MV recovery. They took
zero MV, the MV of the collocated block in the reference frame,
and the average or the median of the MVs from the spatially
adjacent blocks as the candidate MVs for the lost blocks. The
well known Boundary Matching Algorithm (BMA) proposed
in [2] recovered the lost MV from the candidate MVs which
minimized the total variation between the internal boundary and
the external boundary of the reconstructed block. A variation of
this approach has been adopted in the H.26L (the early version
of H.264) test model and was described in detail in [3]. Some
more sophisticated approaches have also been proposed to better
estimate the lost MVs. For example, Zheng et al. [4] proposed
to recover the lost MVs by using Lagrange interpolation for-
mula while Lie and Gao proposed to find the lost MVs by jointly
optimizing the boundary distortion of the whole slice through
dynamic programming. In order to reduce the complexity, they
adopted a suboptimal alternative enhanced with an adaptive
Kalman-filtering algorithm [5], [6]. More recently, hybrid algo-
rithms have been studied that explored both spatial and temporal
correlations to obtain better recovery of the lost data. In [7], Chen
et al. proposed a priority-driven region matching algorithm to
exploit the spatial and temporal information. The lost area was
recovered region-by-region and a priority term was defined to deter-
mine the restoration order. Atzori et al. proposed a concealment
scheme [8] which first replaced the lost block using BMA, and
then applied a mesh-based warping procedure to better match
the block content with the correctly received surrounding areas.
The aforementioned MV recovery algorithms try to recover
the lost MVs from candidates by enforcing the spatial and/or
temporal smoothness property of the image/video signals. How-
ever, they fail to avoid introducing visible blocking artifacts
in the recovered area, especially under the circumstances of
sudden scene changes as well as fast and complex movement.
Moreover, since transport prioritization has been increasingly
adopted in layered coding, which would transmit the MVs and
other important data with more protection, the MVs may be cor-
rectly received even when the motion compensated residue is
lost. For example, in the data partitioned slice of the emerging
H.264 standard, the coded data is placed in three separate Data
Partitions (A, B, and C), each of which contains a subset of the
coded slice. The MVs are contained in Partition A which could
be given higher priority during transmission. In this case, in-
stead of the recovery of lost MVs, the critical problem becomes
the recovery of the lost motion compensated residue or the re-
duction of the annoying blocking artifacts.
As far as blocking artifacts, i.e., visible discontinuities at
block boundaries, are concerned, one may readily turn to the
post-processing techniques that have been developed to remove
blocking effect due to low bit rate video encoding. This type of
artifacts is visually quite similar to the blocking effects caused
by imperfect lost data reconstruction. Several approaches have
been proposed to alleviate such artifacts, most of which are based
on low pass filtering, AC prediction, projection onto convex
sets (POCS) [9] or, more recently, diffusion. As the Gaussian (or
low pass) filter failed to preserve lines and edges, Perona and
Malik [10] proposed to use anisotropic diffusion as an alternative
scheme. The anisotropic diffusion scheme was implemented
via a partial differential equation (PDE) and can successfully
preserve the structural information. Yang and Hu [11] applied
a biased anisotropic diffusion scheme to remove the blocking
effect. Although they claimed to unify artifact removal and
lost block concealment in one framework and would process
them at the same time, their concealment method was exactly
the same as the maximally smooth recovery method proposed
in [12], which would give a blurred recovered block, as has been
pointed out in [13]. More recently, Gothandaraman et al. [14]
proposed to use the method of total variation as an alternative
to biased anisotropic diffusion, and later Alter et al. [15] presented
a deblocking algorithm with weighted total variation. In
all these schemes, the deblocking problem has been formulated
as an energy minimization problem in which the gradient of
the recovered block, either in the weighted L2-norm (as in
anisotropic diffusion) or in the weighted L1-norm (as in total
variation), would be minimized. Due to the minimization of the
gradient of the recovered block, these methods would produce an
unsatisfactory, blurred interpolation. In a recent work that dealt
with the image editing tasks, Perez et al. proposed a guided in-
terpolation mechanism [16]. Instead of minimizing the gradient
of the unknown function, they introduced a guidance field and
minimized the difference between the gradient of the unknown
function and the guidance field. This mechanism successfully
overcame the blurring problem while ensuring the compliance
of the filled-in image with the surrounding background.
In this paper, we propose a novel two-stage error conceal-
ment scheme for video signals which are compressed in slice
mode with some slices lost during transmission on error-prone
channels.1 In the first stage, we propose a novel MV recovery
algorithm, spatio-temporal boundary matching algorithm
(STBMA), to recover the lost MV for each macroblock (MB)
in the lost slices. It works by minimizing a distortion function
which exploits both spatial and temporal smoothness properties
of the video signals. With the recovered MV, we could find the
reference MB in the reference frame for each lost MB. Inspired
by the work in [16], in the second stage, instead of replacing
the lost MB with the corresponding reference MB as most
previous error concealment schemes have done, we propose a
novel PDE-based algorithm to refine the reconstruction. The
proposed PDE-based algorithm could effectively reduce the
blocking artifacts, and meanwhile well preserve the structure
of the reference MB. It works by minimizing, in a weighted
manner, the difference between the gradient field of the recon-
structed MB in current frame and that of the reference MB
in the reference frame under given boundary condition. The
weighting factor produces an anisotropic regulation scheme
which determines the level of regulation according to the de-
gree of local blockiness. Both spatial and temporal correlations
are well-exploited in the proposed scheme. The experimental
results show that the proposed two-stage error concealment
scheme is able to achieve not only higher PSNR but also better
visual quality when compared with the error concealment
feature implemented in the H.264 reference software.
The rest of this paper is organized as follows. We describe the
proposed algorithm in detail in Section II. Then, we present the
experimental results in Section III to verify the performance of
the proposed scheme. We conclude this paper in Section IV with
a summary of our algorithm.
II. PROPOSED ERROR CONCEALMENT SCHEME
In this section, we describe the proposed spatio-temporal
error concealment scheme in detail. We first introduce the
spatio-temporal boundary matching algorithm (STBMA) for
MV recovery. Then we present the PDE-based algorithm for
lost block reconstruction.
Due to the correlation among adjacent video signals in both
spatial and temporal domains, a reasonable criterion for choosing a
good candidate MV is to examine whether the MV can preserve
spatial and temporal continuities of the signals. Motivated by
this intuition, we introduce a novel boundary matching distor-
tion function, in which both spatial and temporal smoothness
properties are well exploited. The MV of each lost MB is recov-
ered through minimizing the distortion function. The corre-
sponding MB indicated by the recovered MV in the reference
frame is used as the reference MB for the lost MB.
Most previous works recover the lost MB by simply copying
the corresponding reference MB from the reference frame.
However, in this case, the boundaries of the reconstructed
MB are usually not compatible with the spatially surrounding
pixels. Therefore, instead of directly copying the reference
MB, we first compute the gradient field of the reference MB
in the reference frame and then refine the reconstruction by
minimizing the difference between the gradient field of the
reconstructed MB and that of the reference MB.
1This paper is an extension of our previous work [17] and [18].
Like other existing schemes, we also assume that the erroneous
MBs have been detected and a MB-based status map of a frame is
available to specify the position of the lost MBs. According to
the status map, all correctly received MBs are decoded first and
then the lost MBs are concealed using the proposed algorithm.
In the following, we only consider scalar image functions, since
the concealment problem can be solved in the same way for each
color component separately, i.e., the Y, U, and V components of video
signals.
A. Motion Vector Recovery and Motion Compensation
1) Motion Compensation Using Correct MVs: In H.264, if
the data partitioned slice is adopted and the partition with im-
portant data is transmitted with higher priority, the motion vec-
tors may be correctly decoded although the motion compensated
residues are lost. In this case, the reference MB can be easily
located using the correctly decoded MVs according to the fol-
lowing equation:
$$ \hat{f}_t(x, y) = f_{t-1}(x + mv_x,\, y + mv_y) \qquad (1) $$
where $\hat{f}_t(x, y)$ stands for the reference value for the pixel at
location $(x, y)$ in frame $t$, while $f_{t-1}$ is the reference
frame and $(mv_x, mv_y)$ is the correctly decoded motion vector.
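As an illustration of (1), motion compensation with a correctly decoded MV reduces to a block fetch from the reference frame. Below is a minimal NumPy sketch; the function and variable names are ours, and it assumes an integer-pel MV and a single color component:

```python
import numpy as np

def motion_compensate(ref_frame, x0, y0, mv, M=16):
    """Copy the M x M reference MB pointed to by a decoded motion vector.

    ref_frame : 2-D array holding the reference frame f_{t-1}.
    (x0, y0)  : top-left corner of the lost MB in the current frame.
    mv        : (mv_x, mv_y) integer-pel motion vector.
    """
    mvx, mvy = mv
    # Rows index y, columns index x; fetch the displaced M x M block.
    return ref_frame[y0 + mvy : y0 + mvy + M, x0 + mvx : x0 + mvx + M].copy()
```

The copied block serves as the starting point that the second-stage PDE refinement later adjusts.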
2) Spatio-Temporal Boundary Matching Algorithm (STBMA):
If the MVs are lost together with the motion compensated residues,
MV recovery algorithms should be employed. In the H.264 reference
software, the classic boundary matching algorithm (BMA) is uti-
lized to recover the lost MV from the candidate MV set; the winning MV min-
imizes the side match distortion $D_{side}$ between the internal and ex-
ternal boundaries of the reconstructed MB [3]. Here, as shown in
Fig. 1, internal boundaries stand for the boundary pixels of the
MB while external boundaries stand for the surrounding pixels
in the corresponding spatially neighboring MBs. $D_{side}$ is defined
as the sum of absolute differences between the internal bound-
aries of the candidate block in the reference frame and the ex-
ternal boundaries of the lost block in current frame:

$$
D_{side} = a_N \sum_{i=0}^{M-1} \big| f_{t-1}(x_0+i+mv_x,\, y_0+mv_y) - f_t(x_0+i,\, y_0-1) \big|
 + a_S \sum_{i=0}^{M-1} \big| f_{t-1}(x_0+i+mv_x,\, y_0+M-1+mv_y) - f_t(x_0+i,\, y_0+M) \big|
 + a_W \sum_{j=0}^{M-1} \big| f_{t-1}(x_0+mv_x,\, y_0+j+mv_y) - f_t(x_0-1,\, y_0+j) \big|
 + a_E \sum_{j=0}^{M-1} \big| f_{t-1}(x_0+M-1+mv_x,\, y_0+j+mv_y) - f_t(x_0+M,\, y_0+j) \big| \qquad (2)
$$

Fig. 1. Illustration of the boundary matching relationship.

where $f_t$ stands for current frame and $f_{t-1}$ is the
corresponding reference frame, the subscripts $N$, $S$, $W$, and $E$ are
short for North, South, West, and East, respectively, as shown
in Fig. 1, $M$ is the size of the MB (e.g., $M = 16$ in H.264), $(x_0, y_0)$ is
the location of the top-left pixel in current lost block, and $(mv_x, mv_y)$
is the candidate MV, which could be the zero MV or an MV of the
neighboring adjacent blocks. $a_N = 1$ if the north neighboring
MB in current frame is available, otherwise $a_N = 0$; $a_S$, $a_W$, and
$a_E$ are defined analogously. The winning reconstructed MV
is the one which minimizes $D_{side}$.
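The BMA side-match search can be sketched as follows. This is an illustrative NumPy version with names of our choosing; a side is simply skipped when its neighbor falls outside the frame, standing in for the availability flags:

```python
import numpy as np

def bma_distortion(cur, ref, x0, y0, mv, M=16):
    """Side-match distortion: sum of |internal boundary of the candidate
    block in the reference frame - external boundary of the lost MB in
    the current frame| over the available sides (integer-pel only)."""
    mvx, mvy = mv
    cand = ref[y0 + mvy : y0 + mvy + M, x0 + mvx : x0 + mvx + M]
    d = 0.0
    if y0 - 1 >= 0:                        # north side available
        d += np.abs(cand[0, :] - cur[y0 - 1, x0 : x0 + M]).sum()
    if y0 + M < cur.shape[0]:              # south
        d += np.abs(cand[-1, :] - cur[y0 + M, x0 : x0 + M]).sum()
    if x0 - 1 >= 0:                        # west
        d += np.abs(cand[:, 0] - cur[y0 : y0 + M, x0 - 1]).sum()
    if x0 + M < cur.shape[1]:              # east
        d += np.abs(cand[:, -1] - cur[y0 : y0 + M, x0 + M]).sum()
    return d

def recover_mv(cur, ref, x0, y0, candidates, M=16):
    """Pick the candidate MV that minimizes the side-match distortion."""
    return min(candidates, key=lambda mv: bma_distortion(cur, ref, x0, y0, mv, M))
```

The STBMA of this paper keeps the same candidate search but replaces the cost above with the weighted spatio-temporal distortion introduced next.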
From (2), we can see that BMA utilizes the smoothness prop-
erty between adjacent pixels to recover the lost MV. However,
since only the spatial smoothness property is considered, it may
not be able to select the best one from the candidate MVs. In
this paper, we present a more general side match distortion func-
tion $D$ which considers both spatial and temporal smoothness
properties of the video signals. $D$ is defined as a weighted
average of two terms: temporal side match distortion $D_t$
and spatial side match distortion $D_s$:

$$ D = \alpha \times D_t + (1 - \alpha) \times D_s \qquad (3) $$

where the weighting factor $\alpha$ is a real number between 0 and 1;
$D_t$ and $D_s$ are defined as follows.
The temporal term $D_t$ is utilized to measure how well
the candidate MV can keep temporal continuity. We observe
that the neighbors of current MB are similar to the neighbors of
the reference MB in the reference frame. Therefore, we define
$D_t$ as the average difference between the external bound-
aries of the candidate reference block in the reference frame and
those of the lost block in current frame:

$$ D_t = \frac{1}{N_{out}} \sum_{(x,y) \in B_{out}} \big| f_{t-1}(x + mv_x,\, y + mv_y) - f_t(x, y) \big| \qquad (4) $$

where $B_{out}$ is the set of available external boundary pixels of the
lost MB and $N_{out}$ is its size. With this definition, a good candidate
MV should give a small $D_t$.
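The temporal term can be sketched as follows. This is a hedged sketch with illustrative names, assuming an integer-pel MV and a one-pixel external ring whose sides are clipped at the frame border:

```python
import numpy as np

def temporal_distortion(cur, ref, x0, y0, mv, M=16):
    """D_t sketch: mean absolute difference between the external
    boundary ring of the candidate reference block and that of the
    lost MB, over whichever of the four sides exist in the frame."""
    mvx, mvy = mv
    diffs = []
    if y0 - 1 >= 0:        # north ring row
        diffs.append(np.abs(ref[y0 - 1 + mvy, x0 + mvx : x0 + mvx + M]
                            - cur[y0 - 1, x0 : x0 + M]))
    if y0 + M < cur.shape[0]:   # south ring row
        diffs.append(np.abs(ref[y0 + M + mvy, x0 + mvx : x0 + mvx + M]
                            - cur[y0 + M, x0 : x0 + M]))
    if x0 - 1 >= 0:        # west ring column
        diffs.append(np.abs(ref[y0 + mvy : y0 + mvy + M, x0 - 1 + mvx]
                            - cur[y0 : y0 + M, x0 - 1]))
    if x0 + M < cur.shape[1]:   # east ring column
        diffs.append(np.abs(ref[y0 + mvy : y0 + mvy + M, x0 + M + mvx]
                            - cur[y0 : y0 + M, x0 + M]))
    return float(np.concatenate(diffs).mean())
```

A perfect temporal match (identical frames, zero MV) yields zero distortion, matching the intent of (4).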
According to the spatial smoothness property of video sig-
nals, the structures in the lost MBs should be compatible with those
of the available spatially neighboring MBs. Therefore, recov-
ering the lost MB using a good candidate MV in some sense
means introducing few structural mismatches at the boundaries.
Here, $D_s$ is utilized to choose such a good MV from the can-
didate MVs. We define $D_s$ as the average change of the
Laplacian estimator along the tangent direction, which measures
the continuity of the isophotes at the boundaries, as shown in (5).
With such a definition, a good candidate MV should give a small
$D_s$. A similar term is utilized to generate the updating in-
formation for iterative diffusion in the task of image inpainting
[19]. Here, we use it as part of the cost function to select the best
MV from the candidate MV set:

$$ D_s = \frac{1}{N_{in}} \sum_{(x,y) \in B_{in}} \left| \frac{\nabla(\triangle f)}{|\nabla(\triangle f)|} \cdot \frac{\nabla^{\perp} f}{|\nabla^{\perp} f|} \right| \cdot |\nabla f| \qquad (5) $$

where the symbols $M$, $(x_0, y_0)$, and $(mv_x, mv_y)$
have the same meanings as those defined in
(1) (see Fig. 1), $B_{in}$ is the set of internal boundary pixels of the
lost MB and $N_{in}$ is its size, $\nabla$ is the gradient
operator, $\nabla^{\perp}$ is the normal oper-
ator whose direction is orthogonal to the gradient direction, and
$\triangle$ is the Laplacian operator.
In a typical (and our) implementation, these relevant operators
can be calculated as follows:

$$ \nabla f(x,y) = \tfrac{1}{2} \big( f(x+1,y) - f(x-1,y),\; f(x,y+1) - f(x,y-1) \big) $$
$$ \nabla^{\perp} f(x,y) = \tfrac{1}{2} \big( -(f(x,y+1) - f(x,y-1)),\; f(x+1,y) - f(x-1,y) \big) $$
$$ \triangle f(x,y) = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y) \qquad (6) $$
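One plausible discretization of these operators, with central differences for the gradient and the standard 4-neighbor Laplacian assumed (names are ours):

```python
import numpy as np

def operators(f):
    """Gradient, its 90-degree rotation (tangent to isophotes), and the
    4-neighbor Laplacian of a 2-D image, via central differences.
    Returns (grad, perp, lap); grad and perp have shape (2, H, W)."""
    fy, fx = np.gradient(f)            # np.gradient: axis 0 (rows) first
    grad = np.stack((fx, fy))          # (d/dx, d/dy)
    perp = np.stack((-fy, fx))         # rotated 90 degrees from grad
    lap = (np.roll(f, 1, 0) + np.roll(f, -1, 0)
           + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)
    return grad, perp, lap
```

Normalizing `grad(lap)` and `perp` per pixel and averaging their absolute inner product, weighted by the gradient magnitude, over the internal boundary pixels would give the $D_s$ of (5); the wrap-around at image borders introduced by `np.roll` would be masked out in a full implementation.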
Since current block is totally lost, when computing $D_s$,
we first use the candidate reference block to replace current lost
block. In (5), $\nabla(\triangle f)/|\nabla(\triangle f)|$ stands for the normalized
gradient of the Laplacian estimator, and $\nabla^{\perp} f/|\nabla^{\perp} f|$
is the normalized vector along the tangent direction. If the
structures across the boundaries are perfectly matched, the two
terms should be orthogonal to each other and the inner product
should be zero. However, if there are some mismatches, the
absolute value of the inner product of the two terms tends
to be large, which would make $D_s$ large. Besides, we
multiply the inner product by the gradient magnitude $|\nabla f|$ for
every pixel in (5). There are two reasons for doing this: firstly,
through multiplying by $|\nabla f|$, the range of $D_s$ (notice that
$D_s \le 1$ without multiplying by $|\nabla f|$) would tend to
be compatible with that of $D_t$. Secondly, for the pixels at
the internal boundaries as shown in Fig. 1, $|\nabla f|$ stands for
the brightness change across the boundaries, which reflects the
blockiness degree to some extent. According to our observation,
severe blockiness tends to have large $|\nabla f|$ while slight
blockiness usually has small $|\nabla f|$. Therefore, if $|\nabla f|$ is
small, even if the absolute value of the inner product is large, it
is still possible that the reference block is a good candidate. On
the contrary, if $|\nabla f|$ is large, even if the absolute value of the
inner product is small, there is still a chance that the reference
block is a bad candidate. So, it is reasonable and necessary to
consider the term $|\nabla f|$.
In Figs. 2 and 3, we show two examples to demonstrate the
characteristics of $D_s$: one is a synthetic image (Fig. 2) and
the other is a sub-image cut from the foreman sequence (Fig. 3).
Due to space limitation, it is difficult to illustrate the whole
MB, and we are only interested in $D_s$ on the boundary of
the MB. In Figs. 2 and 3, we use a small part of the lost MB and
its neighbor MB to explain the effect of $D_s$. We assume
that the upper 4 × 8 pixels are from the correctly received MB
and the bottom 4 × 8 pixels are from the lost MB. For each
example, there are three candidate reconstructions, as shown
in (a-1), (b-1), and (c-1). Obviously (a-1) is the best choice
with perfect structure matching while (b-1) and (c-1) have
different extents of mismatching at the boundary. We will now
show that $D_s$ can automatically select (a-1) as the best
Fig. 2. Example of $D_s$. (Synthetic image with the upper 4 × 8 pixels from the correctly received MB and the bottom 4 × 8 pixels from the lost MB; the thick black solid line is the boundary): (a-1)(b-1)(c-1) three candidate reconstructions with perfect structure matching, close structure mismatching, and far structure mismatching, respectively; (a-2)(b-2)(c-2) the vector field $(\nabla^{\perp} f)/(|\nabla^{\perp} f|) \times |\nabla f|$ of the three candidates; (a-3)(b-3)(c-3) the vector field $(\nabla(\triangle f))/(|\nabla(\triangle f)|)$ of the three candidates; (a-4)(b-4)(c-4) displaying the vector fields in the second and third rows together ($D_s$ of (a-1)(b-1)(c-1) are 0, 28.466, and 20.436, respectively).
candidate for both examples. In Figs. 2 and 3, (a-2)(b-2)(c-2)
illustrate the vector fields $(\nabla^{\perp} f)/(|\nabla^{\perp} f|) \times |\nabla f|$ of the
corresponding candidates, and (a-3)(b-3)(c-3) represent the vector
fields $(\nabla(\triangle f))/(|\nabla(\triangle f)|)$. We put the two vector fields together
in (a-4)(b-4)(c-4) to see their inner product. As shown in
Fig. 3. Example of $D_s$. (True image with the upper 4 × 8 pixels from the correctly received MB and the bottom 4 × 8 pixels from the lost MB.): (a-1)(b-1)(c-1) three candidate reconstructions with perfect structure matching, close structure mismatching, and far structure mismatching, respectively; (a-2)(b-2)(c-2) the vector field $(\nabla^{\perp} f)/(|\nabla^{\perp} f|) \times |\nabla f|$ of the three candidates; (a-3)(b-3)(c-3) the vector field $(\nabla(\triangle f))/(|\nabla(\triangle f)|)$ of the three candidates; (a-4)(b-4)(c-4) displaying the vector fields in the second and third rows together ($D_s$ of (a-1)(b-1)(c-1) are 5.5166, 12.51, and 20.283, respectively).
(5), we only care about the inner product for the pixels at the
internal boundary of the lost MB. Therefore, we only focus on
the vectors in the black rectangles in (a-4), (b-4), and (c-4).
For the first candidate of the synthetic image, as shown in
Fig. 2(a-4), since $\nabla(\triangle f)/|\nabla(\triangle f)|$ is either orthogonal
to $\nabla^{\perp} f/|\nabla^{\perp} f|$ or equals zero, the inner product is
zero, which leads to a zero $D_s$. However, for the second
and third candidates of the synthetic image, as shown in
Fig. 2(b-4) and (c-4), $\nabla(\triangle f)/|\nabla(\triangle f)|$ is not orthog-
onal to $\nabla^{\perp} f/|\nabla^{\perp} f|$ at some points on the boundary,
which results in a nonzero inner product. And their $D_s$
are 28.466 and 20.436, respectively. As we mentioned above, a
better candidate should produce a smaller $D_s$. Therefore, for
the synthetic image, the first candidate would be selected as ex-
pected. Similar to the synthetic image, for the three candidates
of the true image shown in Fig. 3(a-1), (b-1), and (c-1), we can
see that the inner product of the first candidate is smaller than
those of the other two candidates ($D_s$ for these three candidates are
5.5166, 12.51, and 20.283, respectively). Therefore, Fig. 3(a-1)
would be selected as the best choice for the true image.
The winning MV is the candidate MV, which could be the zero
MV or an MV of the neighboring adjacent blocks, that mini-
mizes $D$. The desired reference MB in the reference frame
is obtained using this MV.
B. Refining Reconstruction Using Partial Differential
Equation (PDE)
After finding the reference MB in the reference frame, a
straightforward method to reconstruct the lost MB is to directly
copy the pixels from the corresponding reference MB. How-
ever, the reference MB produced by the winning MV is optimal
only in that its cost, which is a measure of the smoothness,
is smaller than those produced by other candidate MVs. It
does not inherently ensure the perfect matching between the
recovered MB and the surrounding boundaries. Therefore,
visible blocking artifacts may still exist in the restored images.
The discontinuity comes partly from the absence of the motion
compensated residue, also called displaced frame difference
(DFD). In [2], Lam et al. proposed to use the DFD of the
adjacent blocks as substitution for the missing DFD. However,
as the correlation of the DFD among adjacent blocks is not as
high as that of the MV, this method is not quite effective.
Considering the difficulty in recovering the DFD, an alter-
native way is to directly improve the match of the copied refer-
ence MB and the surrounding pixels. In order to achieve this objec-
tive, we abandon the traditional way of using the corresponding
pixel values of the reference MB as the final reconstructed pixel
values for concealment. Instead, we formulate the problem of
recovering the lost MB as an optimization problem.
Before starting the problem formulation, we first define some
notations. As illustrated in Fig. 4, let $S$, a closed subset of $\mathbb{R}^2$,
be the definition domain of current frame. Let $\Omega$ be a closed
subset of $S$, which represents the lost MB, and let $\partial\Omega$ be the
external boundary of $\Omega$ consisting of the correctly received sur-
rounding pixels of the lost MB. Let $f$ be an unknown scalar
function defined over $\Omega$ and $\partial\Omega$. Let $f^*$ be a known scalar func-
tion defined over $S$ minus $\Omega$. It is the set of correctly decoded
pixel values. With this definition, we assume that there is only
Fig. 4. Illustration of notations.
one lost MB in current frame. This assumption can be relaxed
since only a subset of $f^*$, which is defined over $\partial\Omega$, will be used
in later computation. Another assumption we make here is that
the surrounding external boundary of the lost MB is known as
$f^*$. This assumption is reasonable considering that the coded
MBs could be packetized in an interleaved manner. Even if this
condition is not met, i.e., one or more adjacent MBs of current
damaged MB have been lost, the proposed PDE-based algorithm
can still be applied successfully according to our discussion in
Section II-D. Let $\mathbf{g}$ be the gradient vector field of the reference
MB in the reference frame, which is found using the winning
MV obtained in Section II-A.
If the reference frame is correctly received, the boundaries
of the reference MB would be compatible with its surrounding
pixels in the reference frame, i.e., the pixel values change
in a natural manner across the block boundaries. Even if
the reference frame is erroneously received, due to low-pass
filtering (deblocking) and post-processing (STBMA+PDE for the
reference frame), it is reasonable for us to assume that the
boundaries of the reference MB would be more compatible
with its surrounding pixels in the reference frame than in
current frame. Therefore, we would like to push $\nabla f$ towards $\mathbf{g}$
where blocking artifacts are severe in the reconstructed MB.
So, the problem of recovering the lost MB can be formulated as
finding an optimal solution $f$ which minimizes the following
objective function:

$$ \min_{f} \iint_{\Omega} c(x, y)\, |\nabla f - \mathbf{g}|^2 \, dx\, dy, \quad \text{with } f|_{\partial\Omega} = f^*|_{\partial\Omega} \qquad (7) $$
According to (7), the recovered $f$ should be the function whose
gradient, in the $L_2$-norm and in a weighted manner, is closest to
the gradient vector field $\mathbf{g}$ under the given boundary condition. If
the coefficient $c(x, y)$ is set to be constant (e.g., $c(x, y) \equiv 1$), it
would reduce to the isotropic guided interpolation scheme pro-
posed in [16], which minimizes $\iint_{\Omega} |\nabla f - \mathbf{g}|^2$. But this
isotropic method might cause problems (e.g., the bleeding ar-
tifact) while reconstructing the lost MB, as shown in our ex-
periment (the red ellipse region in Fig. 12(c)). Through intro-
ducing this spatially varying coefficient, we could better control
the interpolation process according to the degree of local block-
iness. Equation (7) is also a generalization of the anisotropic diffusion
method [11], [14] considering that the anisotropic diffu-
sion method is a special case in which the vector field $\mathbf{g}$ is set
to be zero, i.e., it minimizes $\iint_{\Omega} c(x, y)\, |\nabla f|^2$. The zero vector
field would produce a blurred interpolation. On the contrary, a
well defined nonzero vector field could better preserve the struc-
tural information while alleviating the annoying blocking effects.
With (7), instead of copying the pixel values of the reference MB,
we only try to preserve its content structure, which is depicted by
the gradient. Besides, we utilize the continuity at the boundary of
the reference MB in the reference frame to improve the consistency
at the boundary of the reconstructed block.
The solution that minimizes (7) must satisfy the Euler-La-
grange equation,2 according to which we have

$$ \nabla \cdot \big[ c(x, y) (\nabla f - \mathbf{g}) \big] = 0 \ \text{ over } \Omega, \quad \text{with } f|_{\partial\Omega} = f^*|_{\partial\Omega} \qquad (8) $$

The gradient descent method can be used to solve (8). The so-
lution is the steady state solution of the following equation:

$$ f^{(n+1)} = f^{(n)} + \lambda\, \nabla \cdot \big[ c(x, y) \big( \nabla f^{(n)} - \mathbf{g} \big) \big] \qquad (9) $$

where $n$ is the iteration index.
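The descent of (9) can be sketched numerically. The following illustrative NumPy version (names are ours) uses forward differences for the gradient and the adjoint backward differences for the divergence, and checks that a single step decreases the discrete energy of (7):

```python
import numpy as np

def energy(f, gx, gy, c):
    """Discrete stand-in for the objective in (7): sum of
    c * |grad f - g|^2, with forward differences (illustrative)."""
    fx = np.roll(f, -1, 1) - f
    fy = np.roll(f, -1, 0) - f
    return float((c * ((fx - gx) ** 2 + (fy - gy) ** 2)).sum())

def descent_step(f, gx, gy, c, mask, lam=0.2):
    """One update of (9): f += lam * div(c * (grad f - g)), applied
    only inside the lost MB (mask True) so the boundary stays fixed."""
    fx = np.roll(f, -1, 1) - f
    fy = np.roll(f, -1, 0) - f
    px, py = c * (fx - gx), c * (fy - gy)
    # Backward differences give the adjoint divergence.
    div = (px - np.roll(px, 1, 1)) + (py - np.roll(py, 1, 0))
    out = f.copy()
    out[mask] += lam * div[mask]
    return out
```

Iterating `descent_step` to a steady state yields the refined MB; with a small step size the energy decreases monotonically toward the minimizer of (7).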
The spatially varying coefficient $c(x, y)$ plays an important
role in the interpolation process. When $c(x, y)$ is large, it tries to
push $\nabla f$ towards the vector field $\mathbf{g}$, while a small $c(x, y)$ allows
$\nabla f$ to deviate from $\mathbf{g}$. According to the previous analysis, we
would like to push $\nabla f$ towards $\mathbf{g}$ at the locations where the
blocking artifacts are severe. Therefore, we try to give $c(x, y)$ a
large value where the degree of the blocking artifacts is obvious
and make $c(x, y)$ small where there is little blockiness.
According to our observation, the absolute difference between
$\nabla f$ and $\mathbf{g}$, i.e., $s = |\nabla f - \mathbf{g}|$, can somewhat reflect the degree of
local blockiness. If $s$ is small, there tends to be little
blockiness at the boundaries. In this case, we should give $c(x, y)$
a small value such that only a small amount of regulation is per-
formed. When $s$ becomes larger, we find that there
might be some kind of discontinuity. Therefore, we should give
$c(x, y)$ a larger value to perform more regulation to reduce the
blocking artifacts. However, when $s$ is even larger, we
find that rather than the blocking artifacts, the discontinuities are
more likely to come from the inherent changes across the edges
of the image, in which case we should make $c(x, y)$ small to
prevent regulation and preserve the original reconstructed value
(e.g., the red ellipse region in Fig. 12(c)).
Following the above descriptions, the spatially varying coef-
ficient can be chosen as $c(x, y) = h(s)$, and
the function $h(s)$ should satisfy the following characteristics:
$h(s)$ should be kept small when $s$ is small; then $h(s)$ rises
with the increase of $s$; however, when $s$ is larger than a threshold $T$,
$h(s)$ should begin to decrease, and the larger the distance from $s$ to $T$, the
smaller $h(s)$ should be; when $s$ greatly exceeds $T$, $h(s)$ should
be set to be extremely small.
There are many possible choices for h(s); in this paper, h(s)
is chosen as follows:

(10)
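The exact expression of (10) is not legible here; purely as an illustration, a function with the stated rise-then-decay shape can be written as below. The functional form, the parameter names `sigma` (the peak location) and `h_max` (the peak value), and the default values are assumptions of this sketch, not the paper's choice:

```python
import numpy as np

def h(s, sigma=4.0, h_max=0.5):
    """Illustrative weighting function with the shape described in the text:
    near zero for small s, rising to a peak of h_max at s = sigma, then
    decaying toward zero as s grows far beyond sigma."""
    s = np.asarray(s, dtype=float)
    u = (s / sigma) ** 2
    return h_max * u * np.exp(1.0 - u)  # peak value h_max attained at s == sigma
```

Any function with this qualitative profile (small at 0, unimodal peak, rapid decay for large s) serves the same role of regulating blocky regions while sparing true image edges.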
²If J is defined by J = ∫∫ F(x, y, f, f_x, f_y) dx dy, then J has a sta-
tionary value if the Euler-Lagrange differential equation,
(∂F/∂f) − (∂/∂x)(∂F/∂f_x) − (∂/∂y)(∂F/∂f_y) = 0, is satisfied. In
our problem, with the cost function in (7), F = c(x, y)|∇f − ∇g|².
Therefore, we have
−(∂/∂x)[c(x, y)((∂f/∂x) − (∂g/∂x))] − (∂/∂y)[c(x, y)((∂f/∂y) − (∂g/∂y))] = 0,
which is equivalent to ∇ · [c(x, y)(∇f − ∇g)] = 0.
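The variational computation in the footnote can be cross-checked symbolically. The following SymPy sketch (the names and structure are mine, not from the paper) verifies that the Euler-Lagrange condition for the energy density F = c(x, y)|∇f − ∇g|² reduces to ∇ · [c(x, y)(∇f − ∇g)] = 0:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)   # reconstructed MB
g = sp.Function('g')(x, y)   # reference MB (guide)
c = sp.Function('c')(x, y)   # spatially varying weight

fx, fy = f.diff(x), f.diff(y)
gx, gy = g.diff(x), g.diff(y)

# Energy density of the cost function (7): F = c |grad f - grad g|^2
F = c * ((fx - gx)**2 + (fy - gy)**2)

# Euler-Lagrange expression; the dF/df term is zero since F has no bare f.
el = -sp.diff(F.diff(fx), x) - sp.diff(F.diff(fy), y)

# Divergence form: el should equal -2 div(c (grad f - grad g)).
div_form = sp.diff(c * (fx - gx), x) + sp.diff(c * (fy - gy), y)
assert sp.simplify(el + 2 * div_form) == 0
```

The factor of 2 is immaterial for the stationarity condition, so the two forms describe the same PDE.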
Fig. 5. Weighting factor function h(s).
The characteristic of the chosen function is illustrated in
Fig. 5. We can see that it is consistent with what we have stated.
Given the vector field ∇g, the reconstructed function f
interpolates the specified boundary condition inwards, while
following, in a weighted manner, the spatial variation of g as
closely as possible. In the following two subsections, we will
first introduce the numerical implementation of (9). Then we
will discuss how to handle the special case in which some of the
boundaries of the lost MB are not available.
C. Numerical Scheme
We implement the discrete version of (9) using a simple
scheme whose form is quite similar to the algorithm
described by Perona and Malik [10]. It is discretized on the
discrete pixel grid of the digital image:

f^{n+1}_{i,j} = f^n_{i,j} + λ[c_N · (∇_N f − ∇_N g) + c_S · (∇_S f − ∇_S g)
                + c_W · (∇_W f − ∇_W g) + c_E · (∇_E f − ∇_E g)]^n_{i,j},    (11)

where 0 ≤ λ ≤ 1/4 in order to ensure the stability of the nu-
merical scheme, as pointed out in [10], the subscripts N, S, W, E
are short for North, South, West, and East, respectively, and n is
the iteration index. The superscript and subscripts on the square
bracket are applied to all the terms enclosed in it. The symbols
∇_N, ∇_S, ∇_W, and ∇_E, which indicate the nearest-neighbor dif-
ferences, are defined as follows:

∇_N f_{i,j} ≜ f_{i−1,j} − f_{i,j},   ∇_S f_{i,j} ≜ f_{i+1,j} − f_{i,j},
∇_W f_{i,j} ≜ f_{i,j−1} − f_{i,j},   ∇_E f_{i,j} ≜ f_{i,j+1} − f_{i,j}.    (12)

The corresponding terms associated with g are defined in the
same way.
The initial condition of (11) is f^0 = g. The nearest-neighbor
differences at the boundaries of g are the pixel value variations
between the internal boundaries and external boundaries of g in
the reference frame, while the nearest-neighbor differences at
the boundaries of f are the pixel value changes between the in-
ternal boundaries of f, which are changing with iteration, and the
surrounding correctly decoded pixel values. Therefore, at time
n = 0, the only difference between f and g are the values at
the boundaries of the MBs, which could trigger the iteration.

TABLE I
AVERAGE PSNR PERFORMANCE OF THE RECONSTRUCTED SEQUENCES USING DIFFERENT METHODS
Since the numerical scheme is implemented in an iterative
form, the weighting factors should be updated at every iteration
as follows:

c^n_{P, i,j} = h(|∇_P g_{i,j} − ∇_P f^n_{i,j}|),   P ∈ {N, S, W, E}.    (13)

The stopping criteria of the iteration process of (11) are de-
fined as

max_{i,j} |f^{n+1}_{i,j} − f^n_{i,j}| < ε   or   n ≥ T,    (14)

where ε and T are two pre-defined thresholds.
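One iteration of the scheme above can be sketched in NumPy as follows. The function names, the NaN-ring encoding of the boundary condition, the particular h(s), and the default thresholds are all illustrative assumptions of this sketch, not the paper's implementation:

```python
import numpy as np

def h(s, sigma=4.0, h_max=0.5):
    # Illustrative weighting function (the paper's exact choice is assumed):
    # small for small s, peaking at s = sigma, decaying for s >> sigma.
    u = (s / sigma) ** 2
    return h_max * u * np.exp(1.0 - u)

def pde_refine(g, border, lam=0.25, eps=0.01, max_iter=100):
    """Sketch of the guided anisotropic diffusion of the numerical scheme.

    g      : reference MB together with a 1-pixel surrounding ring.
    border : same shape; NaN inside, correctly decoded pixels on the ring.
    Returns the refined reconstruction f (interior values are the result).
    """
    f = g.astype(float).copy()
    known = ~np.isnan(border)
    f[known] = border[known]          # boundary condition from the current frame
    g = g.astype(float)

    def nn_diff(a):
        # North, South, West, East nearest-neighbor differences;
        # values are replicated at the array edge so the edge difference is zero.
        n = np.vstack([a[:1], a[:-1]]) - a
        s = np.vstack([a[1:], a[-1:]]) - a
        w = np.hstack([a[:, :1], a[:, :-1]]) - a
        e = np.hstack([a[:, 1:], a[:, -1:]]) - a
        return n, s, w, e

    gn, gs, gw, ge = nn_diff(g)
    for _ in range(max_iter):
        fn, fs, fw, fe = nn_diff(f)
        # Per-direction weights follow the local |grad g - grad f|.
        upd = sum(h(np.abs(dg - df)) * (df - dg)
                  for df, dg in ((fn, gn), (fs, gs), (fw, gw), (fe, ge)))
        new_f = f + lam * upd
        new_f[known] = border[known]  # keep the decoded ring fixed
        if np.max(np.abs(new_f - f)) < eps:   # change-based stopping rule
            f = new_f
            break
        f = new_f
    return f
```

With a boundary ring identical to that of g, the update vanishes and the reference block is returned unchanged, which mirrors the observation that only boundary mismatches trigger the iteration.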
D. Special Case for the PDE-Based Refinement With
Erroneous Boundaries
In the above two subsections, we discussed how to use the
PDE-based algorithm to refine the recovered MB generated by
STBMA based on the assumption that the boundary informa-
tion of the lost MB in all four directions is known. When this
assumption is not true, for example, if some boundaries of the
lost MB are not available, we copy the corresponding bound-
aries in the reference frame using the winning MVs generated
by STBMA. If the boundaries in the reference frame are also
unavailable, the gradients of g and f at these boundaries are all
set to zero. This is reasonable since, if some boundaries of the
lost MB are not available, we do not want to utilize the boundary
information in those directions. By copying the corresponding
boundaries in the reference frame, f would be the same as g
at those boundaries. In such a case, since ∇f − ∇g = 0 there, the
weighting factor at those positions would be zero and the diffu-
sion process would not be applied in those directions.
III. EXPERIMENTAL RESULTS
Although the proposed method is general and can be applied
to any block-based video compression scheme, H.264 is uti-
lized here to evaluate the proposed algorithm. The JM9.0 refer-
ence software is used in the experiment. We compare the perfor-
mance of the proposed algorithm with the inter-frame conceal-
ment feature implemented in the reference software [3], which
is based on the classical BMA [2]. We have tested the algorithms
on six video sequences: Foreman, Table, Tempete, Paris, News,
and Carphone. For the first five sequences, both QCIF and CIF
formats are tested. For the Carphone sequence, we only test the QCIF
format since we do not have the original CIF format sequence.
The test sequences in QCIF (CIF) format are encoded at a 10 (15)
Hz frame rate. I frames are encoded every fifteen frames and
no B frames are used. Slice mode is enabled. No intra mode is
used in P frames. The quantization parameter is set to 24.
In the reference software, intra frames are concealed spatially
using weighted pixel averaging [3]. However, this algorithm is
quite ineffective and would make the recovered MB extremely
blurred. Considering the annoying error propagation problem
in the predictive coding scheme, the badly concealed MBs in I
frames would greatly degrade the following P frames. In order
to better compare the proposed algorithm with BMA, both of
which mainly aim at inter-frame concealment, we assume
that the transmission errors only occur in P frames. In order
to simulate the transmission errors, a number of slices are ran-
domly dropped in P frames according to the error pattern. In all
the following experiments, the six parameters are set to 0.5, 0.1,
0.5, 4, 40, and 0.01, respectively. And the maximum iteration count
for the diffusion process in the second stage is tested at 10, 20, 30,
50, 100, and 500.
In the first experiment, the first 100 frames of the six se-
quences are encoded using slice mode. We assume that one
slice contains one row of MBs. Packet loss rates of 5% and
10% are tested. For each packet loss rate, we simulate 20 dif-
ferent error patterns and evaluate their average performances.
The slices of P frames are first dropped according to the error
pattern. Then, the erroneously received P frames are concealed
using BMA [2], STBMA-only, and STBMA together with PDE
(in all the following descriptions and figures, STBMA means
STBMA-only, while STBMA+PDE stands for first obtaining
the reference MB using STBMA and then applying PDE to gen-
erate the final result). For STBMA+PDE, we have tested the
maximum iteration count at 10, 20, 30, 50, 100, and 500.
Notice that BMA is the method implemented in the H.264 ref-
erence software [3]. Table I shows the PSNR performance of
TABLE II
AVERAGE SYSTEM TIME USING DIFFERENT METHODS

Fig. 6. Foreman (CIF) sequence PSNR performance comparison versus the
frame number when the slice loss rate is 5% (each slice contains one row of
MBs).
the reconstructed sequences using different methods. We can
see that STBMA always performs better than BMA, while with
STBMA+PDE, the PSNR can be further improved. On the average
of all six sequences, STBMA can achieve a 0.77 dB PSNR gain at a 5%
slice loss rate and a 0.88 dB PSNR gain at a 10% slice loss rate,
respectively, compared with BMA. For some sequences, such
as Tempete (QCIF) and Paris (CIF), STBMA+PDE obtains sim-
ilar PSNR performance to STBMA. However, for other se-
quences, such as Foreman (CIF), with STBMA+PDE the PSNR
performance can be further improved by 0.15 dB. We can
also see that, for STBMA+PDE, a small maximum iteration count
is enough for the PDE step to achieve its advantages (e.g.,
improving the subjective visual quality and PSNR performance)
for most of the sequences. In Table II, we examine the time consumed
decoding the whole sequence and concealing the errors
using different methods. We can see that the complexity of the
proposed STBMA is almost the same as that of BMA, while with the PDE
step, the complexity is a little higher, but still acceptable, espe-
cially when the maximum iteration count is set to 10.
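For reference, the PSNR figures quoted throughout this section follow the standard definition based on the mean squared error against the uncorrupted frame; a minimal sketch (the function name is assumed, not from the paper):

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    reconstructed frame, for 8-bit video (peak = 255)."""
    ref = np.asarray(ref, dtype=float)
    rec = np.asarray(rec, dtype=float)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float('inf')          # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```

The per-sequence averages in Table I are obtained by averaging this quantity over the concealed frames and the 20 simulated error patterns.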
The major advantage of the PDE step is to generate better
visual quality. The visual quality of the recovered frames of
the Carphone (QCIF) sequence is illustrated in Fig. 9, where
(a) and (b) are the original and damaged frame, respectively.
Fig. 9(c) shows the result obtained by BMA. Fig. 9(d) shows
Fig. 7. Foreman (CIF) sequence PSNR performance comparison versus the
frame number when the slice loss rate is 5% (Dispersed FMO mode is used
in the encoder and each slice contains one half of the frame).

Fig. 8. Foreman (CIF) sequence PSNR performance comparison versus the
frame number when the slice loss rate is 5% (Dispersed FMO mode is used
in the encoder and each slice contains one half of the frame).
the recovered frame using STBMA, and the result generated
by STBMA+PDE is shown in Fig. 9(e). From Fig. 9(c), (d),
and (e), we can see that, compared with BMA, STBMA can
better reconstruct the lost MBs in the region marked by the red
Fig. 9. Subjective quality comparison for the Carphone (QCIF) sequence at a 5% slice loss rate (each slice contains one row of MBs). (a) Original frame;
(b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (24.84 dB); (d) concealed using STBMA (24.85 dB); and
(e) concealed using STBMA+PDE (24.88 dB).

Fig. 10. Subjective quality comparison for the Foreman (CIF) sequence at a 5% slice loss rate (each slice contains one row of MBs). (a) Original frame; (b)
damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (33.08 dB); (d) concealed using STBMA (34.98 dB); and (e)
concealed using STBMA+PDE (35.10 dB).
ellipse. However, STBMA still introduces some blocking arti-
facts. The PDE step can greatly reduce the blocking artifacts
and generate the best visual quality, which fully demonstrates
the effectiveness of the proposed scheme.
Fig. 6 shows the PSNR performance of the Foreman (CIF)
sequence versus the frame number. We can see that
STBMA performs significantly better than BMA, while
with STBMA+PDE, the PSNR performance can be further
improved. In Fig. 10, we examine the visual quality of the
results obtained using different methods. Fig. 10(a) and (b)
show the original and damaged frames, and (c)-(e) show the
results generated by BMA, STBMA, and STBMA+PDE,
Fig. 11. Subjective quality comparison for the Foreman (CIF) sequence at a 5% slice loss rate (Dispersed FMO mode is used in the encoder and each slice
contains one half of the frame). (a) Original frame; (b) damaged frame with randomly dropped slices; (c) concealed using BMA, which is adopted in H.264 (32.43
dB); (d) concealed using STBMA (32.85 dB); and (e) concealed using STBMA+PDE (33.76 dB).

Fig. 12. Subjective quality comparison between the isotropic version and the anisotropic version of the proposed algorithm: (a) damaged frame with randomly
dropped MBs; (b) concealed using the method proposed in [10]; (c) concealed using the isotropic version of the proposed algorithm (c(x, y) = 1); and (d) concealed
using the anisotropic version of the proposed algorithm (c(x, y) = h(|∇g − ∇f|)).
respectively. We can see that STBMA+PDE can obtain the
best visual quality with the fewest blocking artifacts, especially in
the red ellipse regions. We should notice that the artifacts are
not only introduced by the concealment error of the current
frame, but also by the propagation error of previous frames
due to motion compensation.
We also examine the algorithms when flexible macroblock
ordering (FMO) is adopted. In this paper, we use the Dispersed FMO
mode, in which even and odd MBs are encoded in different
slices. The Foreman (CIF) sequence is used in this experiment. The
slice loss rate is assumed to be 5%. The PSNR performances
are shown in Fig. 7. We can see that the proposed algorithm can
obtain a significant PSNR gain compared with BMA. The visual
quality is exemplified in Fig. 11. It is obvious that both BMA and
STBMA introduce severe blocking artifacts. However, the PDE
step can dramatically reduce the blocking artifacts and achieve
pleasant visual quality.
In the third experiment, we assume that the data partitioning tech-
nique is adopted and motion vectors are correctly received at
the decoder. Therefore, it is possible for us to use the correctly
received MVs to recover the lost MBs. Here, we use the MVs
of the top-left blocks as the MVs of the whole MBs. We im-
plement two methods: one is motion compensation using the
correct MVs and taking the pixel values of the reference MB
as the recovered values for the lost data; the other is the pro-
posed method, in which we first use the correct MVs to gen-
erate the reference MB and then perform the PDE-based algorithm.
As shown in Fig. 8, our proposed algorithm can achieve better
PSNR performance.
In the fourth experiment, we evaluate the effect of the
weighting factor adopted in the proposed algorithm. As we
mentioned before, if we set all the weights to 1 (c(x, y) = 1),
the proposed method becomes the isotropic version and is sim-
ilar to the method proposed in [16]. Fig. 12(c) and (d) illustrate
Fig. 13. Subjective quality comparison between BMA and STBMA with the full search algorithm. (a) Damaged frame with randomly dropped slices; (b) concealed
using BMA with full search; and (c) concealed using the proposed STBMA with full search.
the results generated by the isotropic and anisotropic versions of
the proposed algorithm, respectively. We can see that the isotropic
version causes "bleeding" artifacts at the boundary of the white
hat, as shown in the red ellipse region in Fig. 12(c). This is
partly because the corresponding gradient in the vector field
∇g is small there while the actual gradient there should be
slightly larger. In this case, the difference between ∇g and ∇f
is larger than the threshold. In our anisotropic version, the
weighting factor is kept small, so the initial gradient of f is
preserved. However, in the isotropic version, c(x, y) is as large as at
other positions and the pixel values located at the edge
of the white hat bleed into the reconstructed MB.
We also evaluate the advantage of using the guided gradient
field ∇g. We implement the algorithm proposed in [10], which
is similar to our proposed algorithm but without the guide field
(∇g = 0). As shown in Fig. 12(b), without the guided field, the
reconstructed frame becomes a cartoon-like image. Notice again that
the artifacts not only come from the restoration of the lost MBs
in the current frame, but also from the propagation error of
previous frames due to motion compensation.
In the last experiment, we examine the potential capability of
the proposed STBMA to reconstruct high continuity for edges
or lines crossing a lost and a correctly received slice. Due to
complexity considerations, in all the experiments above, the can-
didate MVs we consider in STBMA only include the zero MV and
the MVs of the adjacent neighboring MBs. In this experiment, we try the
full search algorithm for STBMA with a search range of 64 × 64.
The results are shown in Fig. 13. We can see that with the full
search algorithm, the proposed STBMA algorithm can generate
comparable results to Lie and Gao's method [6, Fig. 5(e)].
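A minimal full-search boundary-matching loop of the kind used in this experiment can be sketched as follows. This sketch reproduces only a classic external-boundary SAD term; STBMA's actual spatio-temporal cost, which additionally includes a spatial smoothness term defined earlier in the paper, is omitted, and all names are illustrative:

```python
import numpy as np

def full_search_bma(cur, ref, top, left, mb=16, rng=32):
    """Simplified full-search boundary matching for a lost MB.

    cur, ref   : current and reference frames (2-D luma arrays).
    (top, left): top-left corner of the lost MB.
    Returns the winning MV as (dy, dx)."""
    h, w = ref.shape
    # External one-pixel boundary of the lost MB in the current frame.
    ext_top = cur[top - 1, left:left + mb]
    ext_bot = cur[top + mb, left:left + mb]
    ext_lft = cur[top:top + mb, left - 1]
    ext_rgt = cur[top:top + mb, left + mb]

    best, best_mv = np.inf, (0, 0)
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            t, l = top + dy, left + dx
            if t < 1 or l < 1 or t + mb >= h or l + mb >= w:
                continue  # candidate block plus its ring must lie inside ref
            # SAD between the candidate's external ring in the reference
            # frame and the lost MB's external ring in the current frame.
            cost = (np.abs(ref[t - 1, l:l + mb] - ext_top).sum()
                    + np.abs(ref[t + mb, l:l + mb] - ext_bot).sum()
                    + np.abs(ref[t:t + mb, l - 1] - ext_lft).sum()
                    + np.abs(ref[t:t + mb, l + mb] - ext_rgt).sum())
            if cost < best:
                best, best_mv = cost, (dy, dx)
    return best_mv
```

With `rng=32` this evaluates the full 64 × 64 search window of the experiment; restricting the candidate set to the zero MV and the neighboring MBs' MVs recovers the low-complexity variant used in the earlier experiments.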
IV. CONCLUSION
In this paper, we have developed a novel two-stage error con-
cealment scheme for compressed video sequences which are
corrupted during transmission. In the first stage, we propose a
novel spatio-temporal boundary matching algorithm (STBMA)
to recover the MVs for the lost MBs. By using the recovered
MVs, we find the reference MB in the reference frame for
each lost MB. Then, in the second stage, we use a PDE-based
algorithm to refine the reconstruction of the lost pixels. It works
by minimizing the weighted difference between the gradient
of the lost MB and that of the reference MB obtained in the
first stage under a given boundary condition. A well-chosen
weighting factor is used to control the regulation level ac-
cording to the local blockiness degree. The simulation results
fully demonstrate the superiority of the proposed algorithm over
the inter-frame concealment feature implemented in the H.264
reference software, which is based on the traditional BMA. The
proposed scheme can effectively reduce the blocking artifacts
and well preserve the inner structure of the recovered MBs. It
can also prevent the undesirable bleeding effect introduced by
the isotropic scheme.
REFERENCES
[1] P. Haskell and D. Messerschmitt, "Resynchronization of motion compensated video affected by ATM cell loss," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), San Francisco, CA, 1992, vol. 3, pp. 545-548.
[2] W. M. Lam, A. R. Reibman, and B. Liu, "Recovery of lost or erroneously received motion vectors," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 1993, vol. 3, pp. 417-420.
[3] Y. K. Wang, M. M. Hannuksela, V. Varsa, A. Hourunranta, and M. Gabbouj, "The error concealment feature in the H.26L test model," in Proc. IEEE Int. Conf. Image Processing, 2002, pp. 729-732.
[4] J. H. Zheng and L. P. Chau, "A temporal error concealment algorithm for H.264 using Lagrange interpolation," in Proc. IEEE Int. Symp. Circuits Syst., 2004, pp. 133-136.
[5] Z. W. Gao and W. N. Lie, "Video error concealment by using Kalman-filtering technique," in Proc. IEEE Int. Symp. Circuits Syst., 2004, pp. 69-72.
[6] W. N. Lie and Z. W. Gao, "Video error concealment by integrating greedy suboptimization and Kalman filtering techniques," IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 982-992, 2006.
[7] Y. Chen, X. Sun, F. Wu, Z. Liu, and S. Li, "Spatio-temporal video error concealment using priority-ranked region-matching," in Proc. IEEE Int. Conf. Image Processing, 2005, pp. 1050-1053.
[8] L. Atzori, F. G. B. De Natale, and C. Perra, "A spatio-temporal concealment technique using boundary matching algorithm and mesh-based warping (BMA-MBW)," IEEE Trans. Multimedia, vol. 3, no. 3, pp. 326-338, Sep. 2001.
[9] M. Y. Shen and C. C. J. Kuo, "Review of postprocessing techniques for compression artifact removal," J. Visual Commun. Image Represent., pp. 2-14, 1998.
[10] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, no. 7, pp. 629-639, Jul. 1990.
[11] S. Yang and Y. H. Hu, "Coding artifacts removal using biased anisotropic diffusion," in Proc. IEEE Int. Conf. Image Processing, 1997, pp. 346-349.
[12] Y. Wang, Q. F. Zhu, and L. Shaw, "Maximally smooth image recovery in transform coding," IEEE Trans. Commun., vol. 41, pp. 1544-1551, 1993.
[13] S. Shirani, F. Kossentini, and R. Ward, "Error concealment methods, a comparative study," in Proc. IEEE Canadian Conf. Electr. Comput. Eng., Edmonton, AB, Canada, 1999, vol. 2, pp. 835-840.
[14] A. Gothandaraman, R. T. Whitaker, and J. Gregor, "Total variation for the removal of blocking effects in DCT based encoding," in Proc. IEEE Int. Conf. Image Processing, 2001, pp. 455-458.
[15] F. Alter, S. Durand, and J. Froment, "DCT-based compressed images with weighted total variation," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, 2004, pp. 221-224.
[16] P. Perez, M. Gangnet, and A. Blake, "Poisson image editing," in Proc. ACM SIGGRAPH, 2003, pp. 313-318.
[17] Y. Chen, O. Au, C.-W. Ho, and J. Zhou, "Spatio-temporal boundary matching algorithm for temporal error concealment," in Proc. IEEE Int. Symp. Circuits Syst., 2006, pp. 686-689.
[18] Y. Hu, Y. Chen, H. Li, and C. W. Chen, "An improved spatio-temporal video error concealment algorithm using partial differential equation," Proc. SPIE Multimedia Syst. Applicat. VIII, pp. 150-160, 2005.
[19] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, "Image inpainting," in Proc. ACM SIGGRAPH, 2000, pp. 417-425.
Yan Chen (S'06) received the Bachelor's degree
from the University of Science and Technology of
China (USTC) in 2004 and the M.Phil. degree from
the Hong Kong University of Science and Tech-
nology (HKUST) in 2007. He is currently pursuing
the Ph.D. degree in the Department of Electrical
and Computer Engineering, University of Maryland,
College Park.
He was an intern in the Internet Media Group,
Microsoft Research Asia (MSRA), from July to
October 2004. His current research interests are in
image/video coding and processing, wireless communication and networking,
and computer vision.
Yang Hu received the Bachelor's degree from the
University of Science and Technology of China in
2004. She is currently pursuing the Ph.D. degree in
the Electronic Engineering and Information Science
Department, University of Science and Technology
of China. Since August 2005, she has been a visiting
student with Microsoft Research Asia.
Her current research interests are in multimedia
signal processing, multimedia information retrieval,
machine learning, and pattern recognition.
Oscar C. Au (S'87-M'90-SM'01) received the
B.A.Sc. degree from the University of Toronto,
Toronto, ON, Canada, in 1986, and the M.A. and
Ph.D. degrees from Princeton University, Princeton,
NJ, in 1988 and 1991, respectively.
After being a Postdoctoral Researcher at Princeton
University for one year, he joined the Department of
Electrical and Electronic Engineering, Hong Kong
University of Science and Technology (HKUST), in
1992. He is now an Associate Professor, Director of
the Multimedia Technology Research Center (MTrec),
and Director of the Computer Engineering (CPEG) Program at HKUST. His
main research contributions are on video and image coding and processing,
watermarking and steganography, and speech and audio processing. Research
topics include fast motion estimation for MPEG-1/2/4, H.261/3/4 and AVS,
optimal and fast suboptimal rate control, mode decision, transcoding, de-
noising, deinterlacing, post-processing, JPEG/JPEG2000 and halftone image
data hiding, etc. He has published over 130 technical journal and conference
papers. His fast motion estimation algorithms were accepted into the ISO/IEC
14496-7 MPEG-4 international video coding standard and the China AVS-M
standard. He has three U.S. patents and is applying for 20+ more on his signal
processing techniques. He has performed forensic investigation and has stood
as an expert witness in the Hong Kong courts many times.
Dr. Au has been an Associate Editor of the IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS, PART 1 and the IEEE TRANSACTIONS ON CIRCUITS
AND SYSTEMS FOR VIDEO TECHNOLOGY. He is the Chairman of the Technical
Committee on Multimedia Systems and Applications and a member of the
Technical Committee on Video Signal Processing and Communications and the
Technical Committee on DSP of the IEEE Circuits and Systems Society. He
served on the Steering Committee of IEEE TRANSACTIONS ON MULTIMEDIA
and the IEEE International Conference on Multimedia and Expo (ICME). He
also served/will serve on the organizing committee of the IEEE International
Symposium on Circuits and Systems (ISCAS) in 1997, the IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) in 2003,
the ISO/IEC MPEG 71st Meeting in 2004, the International Conference on
Image Processing (ICIP) in 2010, and other conferences.
Houqiang Li received the B.S., M.S., and Ph.D.
degrees in 1992, 1997, and 2000, respectively, all
from the Department of Electronic Engineering and
Information Science (EEIS), University of Science
and Technology of China (USTC), Hefei, China.
From 2000 to 2002, he did postdoctoral research in
Signal Detection Lab, USTC. Since 2002, he has been
on the faculty of the Department of EEIS, USTC,
where he is currently an Associate Professor. His
current research interests include image and video
coding, image processing, and computer vision.
Chang Wen Chen (F'04) received the B.S. degree
in electrical engineering from University of Science
and Technology of China in 1983, the M.S.E.E. de-
gree from the University of Southern California, Los
Angeles, in 1986, and the Ph.D. degree in electrical
engineering from the University of Illinois at Urbana-
Champaign in 1992.
He has been Allen S. Henry Distinguished Pro-
fessor in the Department of Electrical and Computer
Engineering, Florida Institute of Technology, Mel-
bourne, since July 2003. Previously, he was on the
Faculty of Electrical and Computer Engineering at the University of Missouri-
Columbia from 1996 to 2003, and at the University of Rochester, Rochester,
NY, from 1992 to 1996. From September 2000 to October 2002, he served as
the Head of the Interactive Media Group at the David Sarnoff Research Labo-
ratories, Princeton, NJ. He has also consulted with Kodak Research Labs, Mi-
crosoft Research, Mitsubishi Electric Research Labs, NASA Goddard Space
Flight Center, and Air Force Rome Laboratories.
Dr. Chen has been the Editor-in-Chief for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY since January 2006. He was
an Associate Editor for IEEE TRANSACTIONS ON MULTIMEDIA from 2002 to
2005 and for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO
TECHNOLOGY from 1997 to 2005. He was also on the Editorial Board of IEEE
Multimedia Magazine from 2003 to 2006 and was an Editor for the Journal
of Visual Communication and Image Representation from 2000 to 2005. He
has been a Guest Editor for the PROCEEDINGS OF THE IEEE (Special Issue on
Distributed Multimedia Communications), a Guest Editor for IEEE JOURNAL
OF SELECTED AREAS IN COMMUNICATIONS (Special Issue on Error-Resilient
Image and Video Transmission), a Guest Editor for IEEE TRANSACTIONS ON
CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (Special Issue on Wireless
Video), a Guest Editor for the Journal of Wireless Communication and Mobile
Computing (Special Issue on Multimedia over Mobile IP), a Guest Editor for
Signal Processing: Image Communications (Special Issue on Recent Advances
in Wireless Video), and a Guest Editor for the Journal of Visual Communication
and Image Representation (Special Issue on Visual Communication in the
Ubiquitous Era). He has also served in numerous technical program committees
for numerous IEEE and other international conferences. He was the Chair
of the Technical Program Committee for ICME2006 held in Toronto, ON,
Canada. He was elected an IEEE Fellow for his contributions in digital image
and video processing, analysis, and communications and an SPIE Fellow for
his contributions in electronic imaging and visual communications. He has
received research awards from NSF, NASA, Air Force, Army, DARPA, and the
Whitaker Foundation. He also received the Sigma Xi Excellence in Graduate
Research Mentoring Award from the University of Missouri-Columbia in
2003. Two of his Ph.D. students have received Best Paper Awards in visual
communication and medical imaging, respectively.
... Page Figure 1.1 Relationship between the boundaries of the missing block in the current frame and those of the reference block in the reference frame (Chen et al., 2008). Representation of a standard decoding system (dotted lines) and the joint source channel decoder used in method (Lakovic et al., 1999). ...
... Assuming that a good candidate match is found in the reference frame, its pixel values are then duplicated in the current frame to replace the missing block. Figure 1.1 Relationship between the boundaries of the missing block in the current frame and those of the reference block in the reference frame (Chen et al., 2008). ©2008, IEEE ...
... As temporal correlation exists in natural video contents, the motion vector from the missing part of the frame can be predicted from previous frames. Error concealment algorithms can also combine both spatial and temporal concealment to achieve more accurate results (Atzori et al., 2001;Chen et al., 2008). Traditional error concealment mainly use interpolation from neighboring content to reconstruct the video, while the most recent error concealment solutions use machine learning to recover large missing areas (Sankisa et al., 2018;Kim et al., 2019;Wang et al., 2019). ...
Thesis
Video content transmission constitute the main category of data transmitted in the world nowadays. The quality of the transmitted content is ever increasing, thanks to the deployment of networks able to support huge traffic loads at high speeds, along with strategies to reduce the amount of data necessary to carry video information, based on more efficient video encoders. However, the quality of the video stream perceived by the end user can be greatly degraded by transmission errors. In fact, a packet can either be corrupted or lost during the transmission due to channel impairments, which result in missing video information that must be recovered. Several strategies exist to recover such information. Retransmission of the damaged packet can be performed. However, this option is not always valid under real time constraints as in video streaming, or to avoid increasing the global network load. To recover missing information, error correction methods can be applied at the receiver’s side. In this thesis, we propose error correction methods at the receiver’s side based on the properties of the widely used error detection code Cyclic Redundancy Check (CRC). These methods use the syndrome of a corrupted packet computed at the receiver to produce the exhaustive list of error patterns that could have resulted in such syndrome, containing up to a defined number of errors. We propose different approaches to achieve such error correction. First, we present an arithmetic-based approach which performs logical operations (XORs) on the fly and does not need memory storage to operate. The second approach we propose is an optimized table approach, in which the repetitive computations of the first method are precomputed prior to the communication and stored in efficiently constructed tables. It allows this method to be significantly faster at the cost of memory storage. 
The error correction validation is performed through a two-step process, which cross-checks the candidate list with another error detection code, the checksum, and then validates the syntax of the encoded packet to test its decodability. We test these new methods with wireless transmission simulations of H.264 and HEVC compressed video content over Wi-Fi 802.11p and Bluetooth Low energy channels. The latter allows the most significant error correction rates and the reconstruction of a near-optimal video even when the channel’s quality starts to decrease.
... Choosing the best replacement of the lost MB considering both OBMA and spatial smoothness has been presented in [59]. The spatial continuity is preserved by minimizing the average changes of the Laplacian estimator along the tangent direction, which measures the continuity of the isophotes at the boundaries. ...
... With the methods of [55], [58] more processing (particle filtering, SOM, respectively) is applied on the MVs. The method presented in [59] applies Laplacian estimator on the boundary pixels. ...
Article
Full-text available
Despite of the recent progresses in reliable and high bandwidth communication, packet loss is still probable and needs special attention in real-time video streaming applications. Congestion and bit error rate, which sometimes are more than the protection capability of the channel codes, are the sources of packet loss in video communication. One common approach to deal with video packet loss is to use error concealment techniques, which estimate the non-received data as close as possible to the actual data. This article reviews the temporal video error concealment methods that have been developed over the past 30 years. The techniques are categorized into 8 groups, and the methods are covered with enough details. The strengths and weaknesses of the 8 groups are also tabulated, and some suggestions for future work and open areas for research are provided.
... The proposed scheme can help protect video content both against unauthorized access and transmission errors while maintaining the video quality similar to that of the original video. Table 5 is a PNSR-based quantitative comparison of the proposed scheme with: stateof-the-art error correction by STBMA [67]; frame copy concealment by JM (JM-FC) [68]; and other recently proposed approaches [69]. Detailed results (see Table 5 column 6) show that the proposed scheme outperformed other techniques over all test videos. ...
Article
Full-text available
Mobile multimedia communication requires considerable resources, such as bandwidth and efficiency, to support Quality-of-Service (QoS) and user Quality-of-Experience (QoE). To increase the available bandwidth, 5G network designers have incorporated Cognitive Radio (CR), which can adjust communication parameters according to the needs of an application. Transmission errors occur in wireless networks and, without remedial action, result in degraded video quality. Secure transmission is also a challenge for such channels. Therefore, this paper's innovative scheme "VQProtect" focuses on protecting the visual quality of compressed videos by detecting and correcting channel errors while at the same time maintaining end-to-end video confidentiality so that the content remains unwatchable. For this purpose, a two-round security process is applied to selected syntax elements of the compressed H.264/AVC bitstreams. To uphold the visual quality of data affected by channel errors, a computationally efficient Forward Error Correction (FEC) method using Random Linear Block coding (with complexity O(k(n−1))) is implemented to correct the erroneous data bits, effectively eliminating the need for retransmission. Errors affecting an average of 7–10% of the video data bits were simulated with the Gilbert–Elliot model; experimental results demonstrated that 90% of the resulting channel errors were recoverable by correctly inferring the values of the erroneous bits. The proposed solution's effectiveness over selectively encrypted and error-prone video has been validated through a range of Video Quality Assessment (VQA) metrics.
... In [30], however, neural networks are exploited to track the variations of the MVs across consecutive frames and reduce the estimation noise. Chen et al. [31] proposed a two-stage EC approach comprising a spatio-temporal boundary matching algorithm that estimates an MV for the degraded MB, followed by a partial differential equation (PDE) based refinement. In the first stage, the MV of the degraded MB is estimated by exploiting both the temporal and spatial smoothness properties of the video sequence in a weighted manner. ...
Article
Full-text available
In order to enhance the accuracy of motion vector (MV) estimation and reduce error propagation during the estimation, this paper proposes a new adaptive error concealment (EC) approach based on information extracted from the video scene. The motion information of the video scene around the degraded MB is first analyzed to estimate the motion type of the degraded MB. If the neighboring MBs possess uniform motion, the degraded MB imitates the behavior of its neighbors by adopting the MV of the collocated MB. Otherwise, the lost MV is estimated through the second proposed EC technique (i.e., IOBMA). In the IOBMA, unlike conventional boundary-matching-criterion-based EC techniques, each boundary distortion is evaluated with respect to both the luminance and chrominance components of the boundary pixels, and the total boundary distortion for each candidate MV is calculated as the weighted average of the available boundary distortions. Compared with state-of-the-art EC techniques, the simulation results indicate the superiority of the proposed EC approach in terms of both objective and subjective quality assessments.
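A minimal sketch of the weighted boundary-matching idea summarized above, in Python with NumPy. The channel weights, side weights, and helper names (`boundary_distortion`, `best_mv`, `fetch_boundary`) are illustrative assumptions, not the paper's definitions:

```python
import numpy as np

def boundary_distortion(cand, neigh, ch_weights=(0.6, 0.2, 0.2)):
    """Mean absolute difference between a candidate block's outer
    boundary pixels and the adjacent decoded pixels, computed per
    channel (Y, Cb, Cr) and combined with per-channel weights
    (weights are assumed, not taken from the paper)."""
    return sum(w * np.abs(np.asarray(cand[ch], float) -
                          np.asarray(neigh[ch], float)).mean()
               for ch, w in zip(("Y", "Cb", "Cr"), ch_weights))

def best_mv(candidates, fetch_boundary, decoded, side_weights):
    """Pick the candidate MV minimizing the weighted average of the
    boundary distortions over the available sides (top/bottom/...)."""
    def cost(mv):
        total = sum(w * boundary_distortion(fetch_boundary(mv, side),
                                            decoded[side])
                    for side, w in side_weights.items())
        return total / sum(side_weights.values())
    return min(candidates, key=cost)
```

Here `fetch_boundary(mv, side)` would return the per-channel boundary pixels of the motion-compensated candidate block; only the sides with correctly decoded neighbors contribute to the weighted average.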
Article
Full-text available
As the demand for video transmission over communication networks has grown rapidly, data compression and error correction in video processing have improved significantly. When an error occurs in a single frame, the visual quality of the subsequent frames degrades due to error propagation, so error control techniques are required for recovery. Concealment of errors at the receiver (decoder) side exploits the spatial and temporal characteristics of the frame; without requiring extra bandwidth or retransmission delay, it enhances the quality of the reconstructed video. However, the output of the error concealment may be affected if the error located beforehand is misleading, so error detection also plays an important role in reconstructing the video. This paper proposes an error detection and concealment approach for the recovery of lost macroblocks (MB) in video. Spatio-temporal techniques are used for error detection, followed by an MB type decision applied to classify the damaged macroblock. For concealment, a new method, the Modified Spatio-Temporal Boundary Matching Algorithm (MSTBMA), is proposed. The proposed work is compared with various existing methods for spatial and temporal error concealment. The comparison covers various types of error, such as block errors (single, multiple), burst errors, and random errors generated by the software. Performance improves in terms of PSNR and visual quality by considering the type of lost MB.
Chapter
Video transmission over wired or wireless channels such as the Internet is an active area of research because of its fast growth. Packets are more likely to be lost in a wireless medium. In existing video recovery methods, retransmitting packets introduces delay and redundant data adds overhead. Video Error Concealment (VEC) is a method for minimizing the errors introduced into a video by transmission errors or added noise. Error concealment operates in different domains: temporal, spatial, and spatio‐temporal. It can be achieved with algorithms such as the Boundary Matching Algorithm, Frequency Selective Extrapolation, and Patch Matching. The proposed method is a novel method in the spatio‐temporal domain that can significantly improve subjective and objective video quality; hence, a spatio‐temporal algorithm is adopted over the other domains. Among the many algorithms for VEC, optimized algorithms should be used to obtain better video quality. Particle Swarm Optimization (PSO) is one of the best optimized bio‐inspired algorithms and can be used to conceal errors in different video formats. Correlation is used to detect errors in the videos, and each erroneous frame is concealed using the PSO algorithm in MATLAB. The method was tested on different standard videos with different types and varieties of errors: single, multiple, and sequential. In comparison with the erroneous videos, PSNR, SSIM, and entropy improved for the concealed videos, while MSE decreased; the results clearly indicate an improvement in video quality. Errors in video should be recovered because video is used in many applications, such as Internet video streaming, mobile phones, TV, and video conferencing, and in medical areas such as MRI and satellite transmission.
Article
Cyclic redundancy checks (CRC) are widely used in transmission protocols to detect whether errors have altered a transmitted packet. It has been demonstrated in the literature that CRC can also be used to correct transmission errors. In this paper, we propose an improvement of the state-of-the-art CRC-based error correction method. The proposed approach is designed to significantly increase the error correction capabilities of the previous method by handling a greater share of error cases through the management of candidate lists and additional validations. Simulations and results for wireless video communications over 802.11p and Bluetooth Low Energy illustrate the Peak Signal-to-Noise Ratio (PSNR) and visual quality gains achieved with the proposed approach versus the state-of-the-art and traditional approaches. These gains average 1.6 dB and 7.3 dB over Bluetooth Low Energy channels with Eb/N0 ratios of 10 dB and 8 dB, respectively.
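As a toy illustration of the CRC-based correction idea (far simpler than the candidate-list scheme above), one can brute-force a single flipped bit until the CRC-32 check passes. The function name is assumed; practical schemes rank multiple candidates and cross-check them with further validations, such as a checksum and a decodability test of the bitstream:

```python
import zlib

def crc_correct_single_bit(packet, expected_crc):
    """If the CRC-32 check fails, flip each bit of the packet in turn
    and re-test; return the repaired packet, or None if no single
    flip makes the CRC match (i.e. more than one bit was corrupted)."""
    if zlib.crc32(bytes(packet)) == expected_crc:
        return bytes(packet)              # packet already consistent
    for i in range(len(packet) * 8):
        trial = bytearray(packet)
        trial[i // 8] ^= 1 << (i % 8)     # flip bit i
        if zlib.crc32(bytes(trial)) == expected_crc:
            return bytes(trial)
    return None
```

The search is O(n) CRC evaluations for an n-bit packet; extending it to double-bit errors is what makes candidate-list management and extra validation necessary in the full method.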
Conference Paper
Full-text available
In this paper, a novel temporal error concealment algorithm, called the spatio-temporal boundary matching algorithm (STBMA), is proposed to recover information lost during video transmission. Unlike the classical boundary matching algorithm (BMA), which considers only the spatial smoothness property, the proposed algorithm introduces a new distortion function that exploits both the spatial and temporal smoothness properties to recover the lost motion vector (MV) from the candidates. The new distortion function involves two terms: a spatial distortion term and a temporal distortion term. Since both the spatial and temporal smoothness properties are involved, the proposed method can better minimize the distortion of the recovered block and recover a more accurate MV. The proposed algorithm has been tested on the H.264 reference software JM 9.0. The experimental results demonstrate that the proposed algorithm obtains better PSNR performance and visual quality compared with the BMA adopted in H.264.
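The two-term distortion can be sketched as follows. The exact pixel sets, the blending weight `alpha`, and the function name are illustrative assumptions rather than the paper's definitions:

```python
import numpy as np

def stbma_cost(cand_block, top, left, ref_top, ref_left, alpha=0.5):
    """Toy two-term distortion in the spirit of STBMA.
    Spatial term: mismatch between the candidate block's outer row and
    column and the decoded neighbor pixels (top, left) in the current
    frame.  Temporal term: mismatch between those neighbor pixels and
    the pixels bordering the candidate block in the reference frame
    (ref_top, ref_left)."""
    spatial = (np.abs(cand_block[0, :] - top).mean() +
               np.abs(cand_block[:, 0] - left).mean())
    temporal = (np.abs(top - ref_top).mean() +
                np.abs(left - ref_left).mean())
    return alpha * spatial + (1.0 - alpha) * temporal
```

The recovered MV is then the candidate minimizing this cost, e.g. `min(candidate_mvs, key=lambda mv: stbma_cost(*blocks_for(mv)))` with a suitable helper to fetch the blocks.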
Conference Paper
Full-text available
When transmitted over error-prone networks, compressed video sequences may be received with errors. In this paper, we propose a priority-ranked region-matching algorithm to recover the "lost" area of the decoded frames, in which both temporal and spatial correlations of the video sequence are exploited. In the proposed scheme, we first calculate the priorities of all edge pixels of the "lost" area and generate a priority-ranked region group. Then, according to their priorities, the regions in the group search for their best matching regions temporally and spatially. Finally, the "lost" area is recovered progressively from the corresponding pixels in the matching regions. Experimental results show that the proposed scheme achieves higher PSNR as well as better video quality in comparison with the method adopted in H.264.
Conference Paper
Full-text available
This paper presents the error concealment (EC) feature implemented by the authors in the test model of the draft ITU-T video coding standard H.26L. The selected EC algorithms are based on weighted pixel value averaging for INTRA pictures and boundary-matching-based motion vector recovery for INTER pictures. The specific concealment strategy and some special methods, including the handling of B-pictures, multiple reference frames, and entire frame losses, are described. Both subjective and objective results are given based on simulations under Internet conditions. The feature was adopted and is now included in the latest H.26L reference software, TML-9.0.
Article
The application of error concealment in video communication is very important when compressed video sequences are transmitted over error-prone networks and erroneously received. In this paper, we propose a novel error concealment scheme, in which the concealment problem is formulated as minimizing, in a weighted manner, the difference between the gradient of the reconstructed data and a prescribed vector field under given boundary condition. Instead of using the motion compensated block as the final recovered pixel values, we use the gradient of the motion compensated block together with the surrounding correctly decoded pixels of the damaged block to reconstruct the lost data. Both temporal and spatial correlations of the video signals are exploited in the proposed scheme. A well designed weighting factor is used to control the regulation level at a desired direction according to the local blockiness degree at the boundaries of the recovered block. The experimental results show that the proposed algorithm is able to achieve higher PSNR as well as better visual quality in comparison with the error concealment feature implemented in the H.264 reference software. The blocking effects are greatly alleviated while the structural information in the interior of the recovered block is well preserved.
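A rough sketch of this gradient-domain reconstruction, solved with Jacobi iterations. The weighting scheme `w` and all names are assumptions standing in for the paper's blockiness-controlled weighting factor:

```python
import numpy as np

def poisson_reconstruct(u0, gx, gy, mask, w=None, iters=500):
    """Jacobi-iteration sketch of gradient-domain error concealment:
    pixels where `mask` is True (the damaged block) are re-solved so
    their gradient approaches the guidance field (gx, gy) taken from
    the motion-compensated reference block, while pixels outside the
    mask (correctly decoded) act as the Dirichlet boundary condition.
    `w` in [0, 1] mimics a per-pixel regulation level by blending each
    update with the initial copied value; its form is an assumption."""
    u = u0.astype(float).copy()
    if w is None:
        w = np.ones_like(u)
    # discrete divergence of the guidance field
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    for _ in range(iters):
        nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
              np.roll(u, 1, 1) + np.roll(u, -1, 1))
        upd = (nb - div) / 4.0            # Jacobi step for Δu = div g
        u[mask] = (w * upd + (1 - w) * u0)[mask]
    return u
```

With a zero guidance field this degenerates to harmonic inpainting from the boundary; with the reference block's gradients it keeps the reference structure while forcing seam-free boundaries, which is what suppresses the blocking artifacts.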
Article
Techniques for resynchronizing motion-compensation-based coders and strategies for the recovery of lost motion vectors are discussed. Leaky-difference resynchronization yields perceptually pleasing video sequences even at fairly high cell loss rates. Future study is needed to determine optimal data-dependent or network-state-dependent conditional resynchronization strategies. Lost motion vectors can be predicted accurately with either the median of intraframe neighboring vectors or the corresponding past-frame vector. The replacement of lost motion vectors with estimates such as these can significantly improve the quality of video affected by cell loss.
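The two MV replacement strategies named above can be sketched as follows; the fallback order is an assumption:

```python
import numpy as np

def recover_mv(neighbor_mvs, colocated_past_mv=None):
    """Replace a lost motion vector with the component-wise median of
    the available intraframe neighbors' MVs; when no neighbor survived,
    fall back to the co-located MV from the previous frame."""
    if neighbor_mvs:
        return tuple(np.median(np.asarray(neighbor_mvs, float), axis=0))
    return colocated_past_mv
```

The component-wise median is robust to a single outlier neighbor, which is why it tends to beat plain averaging for this task.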
Conference Paper
Biased anisotropic diffusion is applied to the removal of coding artifacts from DCT-based codecs. It is formulated as a cost minimization problem. The weighting factors of the cost function are controlled such that the solution removes the blocking effect and conceals the block losses. It has an advantage over other postprocessing schemes because it handles the discontinuities of the image, smoothes the image selectively, and takes visual masking into account. The features needed for the weighting factors are extracted directly from the DCT coefficients to reduce the computational complexity.
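Plain Perona–Malik diffusion conveys the core mechanism (the paper's biased, DCT-weighted variant is not reproduced here); all parameter values are illustrative:

```python
import numpy as np

def anisotropic_diffusion(img, steps=20, kappa=30.0, lam=0.2):
    """Perona–Malik diffusion: each pixel is relaxed toward its four
    neighbors, but the conduction coefficient shrinks where the local
    gradient is large, so strong edges are preserved while low-contrast
    blocking artifacts are smoothed away."""
    u = img.astype(float).copy()
    c = lambda d: np.exp(-(d / kappa) ** 2)   # conduction coefficient
    for _ in range(steps):
        dn = np.roll(u, -1, 0) - u            # differences to neighbors
        ds = np.roll(u, 1, 0) - u
        de = np.roll(u, -1, 1) - u
        dw = np.roll(u, 1, 1) - u
        u += lam * (c(dn) * dn + c(ds) * ds + c(de) * de + c(dw) * dw)
    return u
```

The biased variant adds a data-fidelity term to the update so the result stays close to the decoded image; here only the smoothing half is shown.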
Article
Low bit rate image/video coding is essential for many visual communication applications. When bit rates become low, most compression algorithms yield visually annoying artifacts that highly degrade the perceptual quality of image and video data. To achieve high bit rate reduction while maintaining the best possible perceptual quality, postprocessing techniques provide one attractive solution. In this paper, we provide a review and analysis of recent developments in postprocessing techniques. Various types of compression artifacts are discussed first. Then, two types of postprocessing algorithms based on image enhancement and restoration principles are reviewed. Finally, current bottlenecks and future research directions in this field are addressed.
Article
Using generic interpolation machinery based on solving Poisson equations, a variety of novel tools are introduced for seamless editing of image regions. The first set of tools permits the seamless importation of both opaque and transparent source image regions into a destination region. The second set is based on similar mathematical ideas and allows the user to modify the appearance of the image seamlessly, within a selected region. These changes can be arranged to affect the texture, the illumination, and the color of objects lying in the region, or to make tileable a rectangular selection.
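A minimal Jacobi-iteration sketch of the seamless importing tool: inside the selected region, solve the Poisson equation Δu = Δsrc with the destination pixels as the Dirichlet boundary. Names and the solver choice are illustrative; practical implementations use direct or multigrid solvers:

```python
import numpy as np

def seamless_clone(dst, src, mask, iters=500):
    """Inside `mask`, iterate toward the solution of Δu = Δsrc with
    boundary values taken from `dst`, so the pasted region keeps the
    source's gradients but blends into the destination at the seam."""
    u = dst.astype(float).copy()
    # Laplacian of the source region = guidance field divergence
    lap = (np.roll(src, 1, 0) + np.roll(src, -1, 0) +
           np.roll(src, 1, 1) + np.roll(src, -1, 1) - 4.0 * src)
    for _ in range(iters):
        nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
              np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[mask] = ((nb - lap) / 4.0)[mask]   # Jacobi step
    return u
```

Because only gradients of the source survive, a source region brighter than its destination is automatically re-leveled to match the surrounding pixels, which is exactly the "seamless" effect.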
Conference Paper
This paper addresses the technique of error concealment for recovering video quality at decoders under transmission errors. We propose Kalman filtering as a post-processing technique for the traditional boundary matching algorithm (BMA), which estimates the motion vector (MV) of a corrupted macroblock using the boundary pixels of the top- and bottom-adjacent MBs as the reference. Because boundary pixels carry limited information, MVs estimated using BMA are mostly inaccurate. Experiments show that, with proper mathematical modeling, the Kalman filter is able to filter out the inherent noise so that the recovered MVs lead to a quality improvement of 0.4 dB∼0.72 dB for our test sequences.
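A toy scalar Kalman filter over one MV component illustrates the post-processing idea; the process and measurement noise values are illustrative assumptions, not the paper's model:

```python
def kalman_smooth(measurements, q=0.01, r=1.0):
    """Scalar Kalman filter: each BMA output z is treated as a noisy
    measurement of a slowly drifting true motion value.  q models how
    fast the true motion drifts between blocks/frames, r the BMA
    estimation noise."""
    x, p = measurements[0], 1.0   # initial state estimate and covariance
    out = [x]
    for z in measurements[1:]:
        p += q                    # predict: uncertainty grows
        k = p / (p + r)           # Kalman gain
        x += k * (z - x)          # correct with the BMA estimate z
        p *= (1.0 - k)            # uncertainty shrinks after correction
        out.append(x)
    return out
```

With r much larger than q, the gain settles at a small value, so isolated BMA outliers barely move the filtered MV, which is the claimed source of the PSNR gain.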
Conference Paper
In this paper, we propose an efficient temporal error concealment algorithm for the new coding standard H.264, which makes use of the Lagrange interpolation formula. In H.264, a 16×16 inter macroblock can be divided into different block shapes for motion estimation, and each block has its own motion vector. For natural video, the motion vectors within a small area are correlated. Since a motion vector in H.264 covers a smaller area than in previous coding standards, the correlation between neighboring motion vectors increases. We can use the Lagrange interpolation formula to construct a polynomial that describes the motion tendency of the motion vectors next to the lost motion vector, and use this polynomial to recover the lost motion vector. The simulation results show that our algorithm can efficiently improve the visual quality of corrupted videos.
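The interpolation step can be sketched per MV component; the choice of block positions as interpolation nodes is illustrative:

```python
def lagrange_mv(points, x):
    """Evaluate at position x the Lagrange polynomial passing through
    the (position, mv_component) pairs taken from blocks neighboring
    the lost one; the result is the recovered MV component."""
    val = 0.0
    for i, (xi, yi) in enumerate(points):
        term = float(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)   # Lagrange basis factor
        val += term
    return val
```

If the neighboring MVs follow a smooth trend (e.g. a linear pan), the polynomial reproduces it exactly, so the lost MV falls on the same trend line.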