Packet Video Error Concealment With
Auto Regressive Model
Yongbing Zhang, Xinguang Xiang, Debin Zhao, Siwei Ma, Student Member, IEEE, and Wen Gao, Fellow, IEEE
Abstract—In this paper, an auto regressive (AR) model is applied to error concealment for block-based packet video coding. In the proposed error concealment scheme, the motion vector for each corrupted block is first derived by any recovery algorithm. Then each pixel within the corrupted block is replenished as the weighted summation of pixels within a square centered at the pixel indicated by the derived motion vector, in a regression manner. Two block-dependent AR coefficient derivation algorithms, under spatial and temporal continuity constraints respectively, are proposed. The first one derives the AR coefficients
via minimizing the summation of the weighted square errors
within all the available neighboring blocks under the spatial
continuity constraint. The confidence weight of each pixel sample
within the available neighboring blocks is inversely proportional
to the distance between the sample and the corrupted block.
The second one derives the AR coefficients by minimizing the
summation of the weighted square errors within an extended
block in the previous frame along the motion trajectory under
the temporal continuity constraint. The confidence weight of each
extended sample is inversely proportional to the distance toward
the corresponding motion aligned block whereas the confidence
weight of each sample within the motion aligned block is set to be
one. The regression results generated by the two algorithms are
then merged to form the ultimate restorations. Various experi-
mental results demonstrate that the proposed error concealment
strategy is able to improve both the objective and subjective
quality of the replenished blocks compared to other methods.
Index Terms—Auto regressive model, confidence weight, error
concealment, spatial continuity constraint, temporal continuity
constraint, video coding.
Manuscript received May 5, 2010; revised August 19, 2010 and November 18, 2010; accepted December 2, 2010. Date of publication March 17, 2011; date of current version January 6, 2012. This work was supported by the National Science Foundation of China, under Grant 60736043, the Joint Funds of National Science Foundation of China, under Grant U0935001, and the Major State Basic Research Development Program of China (973 Program), under Grant 2009CB320905. This paper was recommended by Associate Editor E. Steinbach.

Y. Zhang was with the Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China. He is now with the Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China (e-mail: ybzhang@mail.tsinghua.edu.cn).

X. Xiang and D. Zhao are with the Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China (e-mail: xgxiang@jdl.ac.cn; dbzhao@jdl.ac.cn).

S. Ma and W. Gao are with the Institute of Digital Media, School of Electronic Engineering and Computer Science, Peking University, Beijing 100871, China (e-mail: swma@jdl.ac.cn; wgao@pku.edu.cn).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSVT.2011.2130450

I. Introduction

STATE-OF-THE-ART video coding standard H.264/AVC [1] significantly outperforms previous coding standards, such as MPEG-1 [2], H.262/MPEG-2 [3], and H.263 [4]. Although the highly efficient redundancy removal techniques in the spatial and temporal domains lead to the success of H.264/AVC, the highly compressed bit stream is susceptible to transmission errors over error-prone networks. Consequently, packet errors are unavoidable, and they severely degrade the display quality at the decoder side.
Error resilience [5] and error concealment [6] are two major
techniques to combat the visual quality degradation caused
by noisy channels during transmission. Error resilience is
used to combat the transmission errors by adding redundant
information at the encoder with the penalty of decreasing the
compression efficiency. On the contrary, error concealment is a
post-processing technique which conceals the errors utilizing
the correctly received information at the decoder side with-
out modifying source and channel coding schemes. In this
paper, we mainly study the techniques of error concealment.
According to the information utilized, error concealment algorithms can be categorized into spatial approaches, temporal approaches, and hybrid approaches that combine the former two.
Spatial approaches reconstruct the corrupted macroblock
by utilizing the correctly decoded surrounding pixels under
smoothness constraint. Wang et al. proposed a spatial error
concealment method by minimizing the first-order derivative-
based smoothness measure [7]. To suppress the induced blur-
ring artifacts, the second-order derivatives were considered
in [8]. Although such a smoothness constraint achieves good
results for the flat regions, it may not be satisfied in the areas
with high frequency edges. To tackle this shortcoming, an
edge-preserving algorithm [9] was proposed to interpolate the
missing pixels. In [10], smooth and edge areas were efficiently
recovered based on selective directional interpolation. In [11],
an orientation adaptive interpolation scheme derived from the
pixel wise statistical model was proposed. In addition, a spatial
error concealment method based on a Markov random field
(MRF) model was proposed in [12]. And in [13], a multiframe
spatial error concealment considering the error propagation
and incorporating the idea of least squares (LS) estimation
was proposed.
Spatial approaches may yield better performance than tem-
poral ones in scenes with high motion, or after a scene change
[14]. However, they may not restore the detail textures of
corrupted blocks [15]. In this case, the information from the
past frames (temporal approaches) may improve the quality of
corrupted blocks.
Fig. 1. Proposed AR model based error concealment.
Temporal approaches restore the corrupted blocks by ex-
ploiting temporal correlation between successive frames. An
important issue in temporal approaches is to find the most
suitable substitute blocks from the previous frames, i.e., se-
lecting the optimal motion vectors (MVs) for the corrupted
blocks. If the MV of the corrupted block is available at the
decoder, it can be utilized directly to motion-compensate the
corrupted block. However, when the MV is also lost, it has
to be re-estimated. Many pioneering works have been done
on recovering the corrupted MVs. Haskell and Messerschmitt
[16] took zero MV, the MV of the collocated block in
the reference frame, and the average or the median of the
MVs from the spatially adjacent blocks as candidate MVs
for the lost blocks. Chen et al. [17] proposed a side match
criterion taking advantage of the spatial contiguity and inter-
pixel correlation of image to select the best-fit replacement
among the MVs of spatially contiguous candidate blocks. The
well known boundary matching algorithm (BMA) proposed
in [18] selected the MV that minimizes the total variation
between the internal boundary and the external boundary of the
reconstructed block as the optimal one to recover the corrupted
block. There are also some more sophisticated algorithms [12],
[13], [19]–[23] to obtain better replacements for the corrupted
blocks. For example, a means of estimating the missing MV
based on the use of MRF models [12], an algorithm using the
multiframe recovery principle and the boundary smoothness
property [13], a vector rational interpolation scheme [19], a
bilinear motion field interpolation algorithm [20], a Lagrange
interpolation algorithm [21], and a dynamic programming
algorithm [22], [23] were proposed for error concealment. In
addition, some model aided error concealment algorithms were
also proposed. For instance, a projection onto convex sets (POCS) based error concealment for packet video was proposed in [24]. And in [25], a mixture of principal components was
proposed for error concealment.
Besides spatial and temporal approaches, hybrid approaches
combining the former two methods have been proposed re-
cently to obtain better replenishment results. For instance, in
temporal error concealment, the compensated block can be
further improved by spatial smoothing at its edges to make it
conform to the neighbors. In [26], the coding mode and block
loss patterns are clustered into four groups, and the weighting
between spatial and temporal smoothness constraints depends
on the group. In [27], a priority-driven region matching
algorithm to exploit the spatial and temporal information was
proposed. And in [28], a spatio-temporal boundary matching
algorithm (STBMA) and partial differential equation (PDE)
were proposed.
Fig. 2. Auto-regressive model.
Fig. 3. Spatial continuity constraint.
Most of the aforementioned error concealment algorithms interpolate the previous frames to half- or quarter-pel accuracy before deriving the best MVs for the corrupted blocks, since the motion of objects between adjacent frames may be of fractional-pel accuracy. The interpolation filters used are usually separable and their coefficients are fixed. Such methods achieve good performance for isotropic regions; however, they may perform poorly on anisotropic local image structures. To overcome this limitation of separable, fixed interpolation filters, an auto-regressive (AR) model based error concealment is proposed in this paper.
It is well known that the AR model has long been employed to model regular stationary random processes [29], whose statistical properties have been well studied. For example,
Kokaram et al. used AR to detect and interpolate “dirt”
areas [30], [31]. Efstratiadis and Katsaggelos employed AR
to perform motion estimation [32]. Li developed a backward
adaptive video encoder exploiting the prediction property of
AR model [33].
In our formulation, each pixel within the corrupted block is replenished as the weighted summation of pixels within a square, which is centered at the pixel indicated by the MV with integer-pel accuracy, in a regression manner.
Fig. 4. Probabilistic confidence magnitude within neighboring blocks.
Fig. 5. Temporal continuity constraint.
Fig. 6. Probabilistic confidence magnitude within an extended 4 ×4 block.
Two block-
dependent AR coefficient derivation algorithms are proposed
to achieve better performance. The first one is the coefficient
derivation algorithm under the spatial continuity constraint,
in which the summation of the weighted square errors within
the available neighboring blocks is minimized. The confidence
weight of each sample within the available neighboring blocks
is inversely proportional to the distance between the sample
and the corrupted block. The second coefficient derivation al-
gorithm is under the temporal continuity constraint, where the
summation of the weighted square errors within an extended
block in the previous frame along the motion trajectory is
minimized. The confidence weight of the extended sample is
inversely proportional to the distance toward the corresponding
motion aligned block whereas the confidence weight of each
sample within the motion aligned block is set to be one.
The interpolations generated by the weights derived under
these two constraints are then merged to form the ultimate
concealing results.
The proposed AR model based error concealment scheme
extends our previous works [34], [35]. In [34], only the spatial continuity constraint is applied, and an equal confidence weight is assigned to each training pixel sample. In [35], both spatial and temporal continuity constraints are applied; however, the experimental results and discussions are limited. For example, in [35], the experimental results are only compared with BMA and our previous work in [34], and the probabilistic confidence effects are not fully discussed. In addition, the merging operation in [35] simply averages the results obtained under the spatial and temporal continuity constraints, whereas in this paper the merging depends on the estimated MV. Actually, the proposed AR model based error
concealment scheme can be considered as a post-processing
for any MV recovery scheme (e.g., BMA, the methods in
[12] and [13] and STBMA) by adaptively adjusting the AR
coefficients according to the local image properties. Our goal
is to obtain appropriate AR coefficients, whereas other inter
frame error concealments (e.g., BMA, the methods in [12] and
[13] and STBMA) are aimed at generating more accurate MVs
by certain criteria. Various experimental results demonstrate
that the proposed error concealment strategy is able to not
only increase the peak signal-to-noise ratio (PSNR) but also
improve the visual quality of concealing blocks compared to
other methods.
The remainder of this paper is organized as follows.
Section II describes the AR model based error concealment
scheme. Sections III and IV present the coefficient derivations
under the spatial and temporal continuity constraints respec-
tively. Experimental results and analysis conducted on various
sequences are given in Section V. Finally, a brief conclusion
is provided in Section VI.
II. Auto-Regressive Model-Based Error
Concealment
The proposed AR model based error concealment scheme is
illustrated in Fig. 1. For each corrupted block, the correspond-
ing MV is first derived by any kind of recovery algorithms
(such as BMA and STBMA). The AR model is then applied
to the corrupted block along the derived motion trajectory. To
improve the quality of concealed frames, two AR coefficient
derivation algorithms under the spatial continuity and temporal
continuity constraints are performed respectively, utilizing the
weighted LS algorithm. The interpolation results generated
by the two sets of coefficients are then merged to form the
ultimate restorations.
Fig. 2 illustrates the AR model employed by the proposed
error concealment. It is noted that the AR model is applied
along the motion trajectory. For each corrupted pixel, the
corresponding pixel along the motion trajectory with integer-
pel accuracy in the previous reconstructed frame is first
found, and then all the pixels within a square centered at
the corresponding motion aligned pixel are combined in a
linear regression form. The linear regression can be expressed
as

$$\hat{x}_t(i,j)=\sum_{k=-R}^{R}\sum_{l=-R}^{R}\alpha(k,l)\,x_{t-1}(i+d_y+k,\ j+d_x+l) \qquad (1)$$
where $\hat{x}_t(i,j)$ represents the corrupted pixel located at $(i,j)$ within the current frame $X_t$, $R$ represents the range of the AR model, $(d_x,d_y)$ represents the estimated MV with integer-pel accuracy, $x_{t-1}(i,j)$ represents the pixel within the previous reconstructed frame $X_{t-1}$, and $\alpha(k,l)$ represents the desired coefficients.
The main merit of the proposed AR model based error con-
cealment, compared with other motion compensated schemes,
is that it is able to adapt spatially to local orientation structure.
In traditional motion compensated error concealment algo-
rithms, the corrupted block is replaced by the corresponding
block indicated by the estimated MV in the previous frames.
The best MV is usually found by minimizing the matching
errors between the neighboring blocks and the candidate
ones in the fractional interpolated version of the previous
frames. Such methods achieve good performance for isotropic
local regions; however, inferior results may be perceived for
anisotropic local image structures, since interpolation filters
are separable and fixed along vertical and horizontal directions.
In contrast, in the proposed AR model, the interpolation is
non-separable and can follow an arbitrary direction. Besides, the interpolation coefficients can vary from one local region to another. This results in strong preservation of details in
the restored image and greatly improves the performance of
error concealment.
Define $\Gamma_{k,l,R}$ as an operator that extracts a patch of a fixed size (centered at $(k,l)$ and with $(2R+1)\times(2R+1)$ pixels) from an image; the expression $\Gamma_{k,l,R}X_{t-1}$ (where $X_{t-1}$ is represented as a vector by lexicographic ordering) results in a vector of length $(2R+1)^2$, namely the extracted patch. Consequently, the linear regression in (1) can also be expressed as

$$\hat{x}_t(i,j)=\Gamma_{i+d_y,\,j+d_x,\,R}\,X_{t-1}\,\boldsymbol{\alpha}^T \qquad (2)$$

where $\boldsymbol{\alpha}$ represents the coefficient vector of the AR model and $(d_y,d_x)$ represents the MV with integer-pel accuracy. The summed square error between the corrupted and the actual pixels is

$$\varepsilon^2=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl(x_t(i,j)-\hat{x}_t(i,j)\bigr)^2=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\Bigl(x_t(i,j)-\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\boldsymbol{\alpha}^T\Bigr)^2 \qquad (3)$$
where $N$ represents the width and height of the corrupted block. To minimize $\varepsilon^2$, the first derivative of $\varepsilon^2$ with respect to $\boldsymbol{\alpha}$ should be zero according to the LS algorithm, that is

$$\frac{\partial\varepsilon^2}{\partial\boldsymbol{\alpha}}=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl(\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\bigr)^T\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\boldsymbol{\alpha}^T-\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}x_t(i,j)\bigl(\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\bigr)^T=0. \qquad (4)$$
By solving the above equation, we get the optimal coefficients as

$$\boldsymbol{\alpha}^T=\left(\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\bigl(\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\bigr)^T\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\right)^{-1}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}x_t(i,j)\bigl(\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\bigr)^T. \qquad (5)$$
However, since the actual pixel $x_t(i,j)$ is not available at the decoder side, we cannot directly obtain the AR coefficients according to (5). Instead, we have to estimate the AR coefficients by exploring the spatial and temporal correlations of the corrupted block with its available spatial and temporal neighboring pixels.
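The LS solve in (5), and the confidence-weighted variants derived in Sections III and IV, all reduce to one linear least-squares problem over extracted patches. A minimal sketch (our own helper names; this is not the paper's code):

```python
import numpy as np

def extract_patch(frame, i, j, R=1):
    """The patch operator: the (2R+1)^2 pixels centered at (i, j), flattened."""
    return frame[i - R : i + R + 1, j - R : j + R + 1].reshape(-1)

def solve_ar_coeffs(targets, patches, weights=None):
    """Solve for the AR coefficient vector by (weighted) least squares.

    Minimizes sum_i ((targets[i] - patches[i] @ a) * weights[i])**2,
    which with uniform weights matches the normal equations in (5) and
    with confidence weights matches the objectives in (7) and (12).
    """
    patches = np.asarray(patches, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if weights is None:
        weights = np.ones(len(targets))
    weights = np.asarray(weights, dtype=float)
    A = weights[:, None] * patches   # fold each sample's weight into its row
    b = weights * targets
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs
```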
III. AR Coefficient Derivation Under Spatial
Continuity Constraint
Pixels within adjacent blocks have a high possibility of
belonging to the same object, which can be reflected by
the phenomenon that adjacent blocks possess similar motion
trends. Such a property is termed the spatial continuity constraint in this paper, based on which a set of AR coefficients for the corrupted block can be derived. AR coefficients can reflect the MV of each block to some extent [33], and due to the piecewise stationary characteristics of natural images [36], we assume all the pixels within the corrupted block possess the same AR coefficients, just as all the pixels within the corrupted block share the same MV in the traditional motion compensated error concealment method. If we use the AR model to represent the motion between successive frames, the spatial continuity constraint can be interpreted in this paper as meaning that all the pixels within the available neighboring blocks have the same AR coefficients as those within the corrupted block.
As shown in Fig. 3, under spatial continuity constraint each
pixel within the corrupted block and its neighboring blocks can
be regressed by the corresponding pixels within the previous
reconstructed frame utilizing the same AR coefficients. Let $B_t$ be a neighboring block of the current block within the current frame, i.e., $B_t\subset X_t$. In addition, let $b_t(m,n)$ be an arbitrary pixel in $B_t$, i.e., $b_t(m,n)\in B_t$. $b_t(m,n)$ can be represented by the regression function of $X_{t-1}$ and $\boldsymbol{\alpha}$ as

$$\hat{b}_t(m,n)=\Gamma_{m+d_y,\,n+d_x,\,R}\,X_{t-1}\,\boldsymbol{\alpha}^T \qquad (6)$$

where $\boldsymbol{\alpha}$ represents the AR coefficients. According to (5), the solution of $\boldsymbol{\alpha}$ can be computed by the LS method.
It is noted that during the coefficient derivation process,
different training samples should be assigned different prob-
abilistic confidences so as to achieve better performance. For
example, pixels that are closer to the corrupted block or
with similar texture should be assigned larger probabilistic
confidences. Define the probabilistic confidence of $b_t(m,n)$ under the spatial continuity constraint as $w_\alpha(m,n)$, with $0\le w_\alpha(m,n)\le 1$ and $\sum_{(m,n)\in B_t}w_\alpha(m,n)=1$; the optimal $\boldsymbol{\alpha}$ under probabilistic confidences should be

$$\hat{\boldsymbol{\alpha}}=\arg\min_{\boldsymbol{\alpha}}\sum_{(m,n)\in B_t}\Bigl(\bigl(b_t(m,n)-\hat{b}_t(m,n)\bigr)\,w_\alpha(m,n)\Bigr)^2. \qquad (7)$$
Since the correlation between pixels decreases as their distance increases, $w_\alpha(m,n)$ is set to be inversely proportional to the distance between $b_t(m,n)$ and the corrupted block, that is

$$w_\alpha(m,n)=\frac{1}{S}\begin{cases}\dfrac{1}{N-m}, & \text{if } b_t(m,n)\in\text{upper block}\\[4pt]\dfrac{1}{N-n}, & \text{if } b_t(m,n)\in\text{left block}\\[4pt]\dfrac{1}{m+1}, & \text{if } b_t(m,n)\in\text{lower block}\\[4pt]\dfrac{1}{n+1}, & \text{if } b_t(m,n)\in\text{right block}\\[4pt]0, & \text{otherwise}\end{cases} \qquad (8)$$

with $S=S_L+S_R+S_A+S_B$, where

$$S_L=\begin{cases}\sum_{n=0}^{N-1}\frac{1}{N-n}, & \text{if left block is available}\\ 0, & \text{otherwise}\end{cases}\qquad S_R=\begin{cases}\sum_{n=0}^{N-1}\frac{1}{n+1}, & \text{if right block is available}\\ 0, & \text{otherwise}\end{cases}$$

$$S_A=\begin{cases}\sum_{m=0}^{N-1}\frac{1}{N-m}, & \text{if upper block is available}\\ 0, & \text{otherwise}\end{cases}\qquad S_B=\begin{cases}\sum_{m=0}^{N-1}\frac{1}{m+1}, & \text{if lower block is available}\\ 0, & \text{otherwise.}\end{cases}$$

Here $N$ represents the width and height of the corrupted block.
Fig. 4 graphically shows the probabilistic confidence magnitudes within a 4 × 4 block given by (8) as an example. The white block represents the corrupted block, which is surrounded by its four neighboring blocks. Each neighboring block is composed of sixteen pixels whose gray values are inversely proportional to the magnitudes $w_\alpha(m,n)$ of the sixteen samples. It can be observed that much larger probabilistic confidence values are assigned to the pixels closer to the corrupted block than to the pixels farther from it.
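A sketch of the weight map in (8) (the helper name and dictionary layout are our own):

```python
import numpy as np

def spatial_weights(N, available):
    """Confidence weights w_alpha(m, n) per (8) for the four N x N
    neighboring blocks; (m, n) index rows and columns within each block.

    available : dict of booleans for 'upper', 'lower', 'left', 'right'
    """
    idx = np.arange(N, dtype=float)
    per_block = {
        'upper': np.repeat((1.0 / (N - idx))[:, None], N, axis=1),  # 1/(N-m)
        'lower': np.repeat((1.0 / (idx + 1))[:, None], N, axis=1),  # 1/(m+1)
        'left':  np.repeat((1.0 / (N - idx))[None, :], N, axis=0),  # 1/(N-n)
        'right': np.repeat((1.0 / (idx + 1))[None, :], N, axis=0),  # 1/(n+1)
    }
    kept = {k: v for k, v in per_block.items() if available.get(k)}
    # S = S_L + S_R + S_A + S_B: sum each available block's 1-D profile
    S = sum(v[:, 0].sum() if k in ('upper', 'lower') else v[0, :].sum()
            for k, v in kept.items())
    return {k: v / S for k, v in kept.items()}
```

With all four neighbors available and N = 4, the four returned maps reproduce the pattern sketched in Fig. 4.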
It is noted that Figs. 3 and 4 illustrate the general case, where the four neighboring blocks are all available to train the AR coefficients. In practice, there are two cases. In the first case, if any of the neighboring blocks are correctly received, the correctly received neighboring blocks are utilized to train the AR coefficients of the corrupted block. In the second case, if all the neighboring blocks are lost, the already concealed neighboring blocks are utilized instead.
By setting the first derivative of the weighted errors in (7) to zero, the AR coefficients under the spatial continuity constraint are computed as

$$\boldsymbol{\alpha}^T=\bigl(C_P^L+C_P^R+C_P^A+C_P^B\bigr)^{-1}\bigl(D_P^L+D_P^R+D_P^A+D_P^B\bigr) \qquad (9)$$

where

$$C_P^L=\begin{cases}\sum_{m=0}^{N-1}\sum_{n=0}^{N-1}\bigl(P_\alpha(m,n)\odot C\bigr)^T C, & \text{if left block is available}\\ 0, & \text{otherwise}\end{cases}$$

and $C_P^R$, $C_P^A$, and $C_P^B$ are defined in the same way for the right, upper, and lower blocks, respectively;

$$D_P^L=\begin{cases}\sum_{m=0}^{N-1}\sum_{n=0}^{N-1}w_\alpha(m,n)\,x_t(m,n)\,C^T, & \text{if left block is available}\\ 0, & \text{otherwise}\end{cases}$$

and $D_P^R$, $D_P^A$, and $D_P^B$ are defined in the same way for the right, upper, and lower blocks, respectively; with

$$P_\alpha(m,n)=\underbrace{[w_\alpha(m,n),\,w_\alpha(m,n),\,\ldots,\,w_\alpha(m,n)]}_{(2R+1)^2}\quad\text{and}\quad C=\Gamma_{m+d_y,\,n+d_x,\,R}\,X_{t-1}.$$

The operator $\odot$ represents element-by-element multiplication of two vectors. With the obtained AR coefficients $\boldsymbol{\alpha}$, the corrupted block is restored according to (2).
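Putting (6)-(9) together, the spatial training step can be sketched as follows, reusing the helpers above (the block-coordinate convention is ours, and frame-edge padding is omitted for brevity):

```python
def train_spatial_coeffs(cur_frame, prev_frame, top, left, mv, N=4, R=1,
                         available=None):
    """Derive alpha under the spatial continuity constraint: every pixel
    in each available neighboring block is regressed from its
    motion-aligned patch in the previous frame, weighted per (8).
    (top, left) is the top-left corner of the corrupted N x N block.
    """
    if available is None:
        available = {'upper': True, 'lower': True, 'left': True, 'right': True}
    offsets = {'upper': (-N, 0), 'lower': (N, 0),
               'left': (0, -N), 'right': (0, N)}
    dy, dx = mv
    weights = spatial_weights(N, available)
    targets, patches, ws = [], [], []
    for name, w in weights.items():
        oy, ox = offsets[name]
        for m in range(N):
            for n in range(N):
                i, j = top + oy + m, left + ox + n
                targets.append(cur_frame[i, j])   # training target b_t(m, n)
                patches.append(extract_patch(prev_frame, i + dy, j + dx, R))
                ws.append(w[m, n])
    return solve_ar_coeffs(targets, patches, ws)
```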
IV. AR Coefficient Derivation Under Temporal Continuity Constraint
Besides the spatial continuity constraint, a video sequence also exhibits a temporal continuity constraint, reflected by the observation that the same object in adjacent frames is usually threaded by the same motion trajectory. Similar to the spatial continuity constraint, we assume all the pixels within the corrupted block possess the same AR coefficients. The temporal continuity constraint in this paper can be interpreted as meaning that all the pixels within the corrupted block have the same AR coefficients as those within the corresponding motion aligned block in the previous frame. Utilizing the temporal continuity constraint, we can derive another set of AR coefficients, as shown in Fig. 5. It is noted that we extend the motion aligned block (shown as the gray pixels, together with the pixels surrounding them, in the previous frame $X_{t-1}$ in Fig. 5) to find sufficient training samples for the derivation of the AR coefficients.
TABLE I
Average PSNR Results of Each Test Sequence With and Without the Proposed Probabilistic Confidence (PLR = 10%)
PSNR (dB)
BMA+AR
Sequence QP Spatial Temporal Combined
Uniform Weight Proposed Weight Uniform Weight Proposed Weight Uniform Weight Proposed Weight
Mobile 24 30.86 30.96 31.17 31.46 31.55 31.62
28 29.54 29.56 30.04 29.99 30.04 29.95
QCIF Paris 24 29.84 30.00 30.59 30.72 30.85 31.04
28 29.56 29.74 30.21 30.27 30.54 30.72
Suzie 24 35.29 35.65 35.48 35.51 35.83 36.01
28 34.06 34.19 34.03 34.07 34.38 34.51
Average 31.53 31.68 31.92 32.00 32.20 32.31
Foreman 24 31.49 31.63 31.66 31.70 32.24 32.33
28 30.86 30.99 30.86 30.86 31.47 31.58
CIF Mobile 24 28.37 28.48 28.51 28.63 28.86 29.05
28 27.74 27.76 27.98 28.02 28.26 28.28
Flower 24 28.37 28.43 27.97 28.08 28.57 28.69
28 27.59 27.60 27.33 27.47 27.84 27.94
Average 29.07 29.15 29.05 29.13 29.54 29.65
STBMA+AR
Sequence QP Spatial Temporal Combined
Uniform Weight Proposed Weight Uniform Weight Proposed Weight Uniform Weight Proposed Weight
Mobile 24 30.62 30.78 31.53 31.56 31.25 31.39
28 29.36 29.44 29.67 29.64 29.67 29.75
QCIF Paris 24 30.77 31.00 31.73 31.89 31.68 31.90
28 29.90 30.05 30.80 30.98 30.71 30.83
Suzie 24 35.34 35.53 35.39 35.44 35.83 36.04
28 34.04 34.19 34.11 34.10 34.34 34.47
Average 31.67 31.83 32.21 32.27 32.25 32.40
Foreman 24 31.12 31.30 31.37 31.39 31.73 31.83
28 30.63 30.89 30.90 30.91 31.19 31.30
CIF Mobile 24 28.48 28.59 28.97 29.06 28.88 29.07
28 27.85 27.91 28.35 28.46 28.28 28.41
Flower 24 29.13 29.25 28.56 28.73 29.21 29.30
28 28.09 28.18 27.57 27.76 28.07 28.24
Average 29.22 29.35 29.29 29.39 29.56 29.69
Fig. 7. PSNR performance comparison versus the frame number while the PLR is 10% (each slice contains one row of MBs). (a) Mobile (CIF). (b) Flower
(CIF).
TABLE II
Average PSNR Results of Each QCIF Test Sequence Using Different Methods
PLR = 5%, PSNR (dB)
Sequence QP BMA [12] STBMA STBMA +OBMC STBMA+PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 35.85 34.92 36.06 36.15 36.08 36.01 36.52 36.42 36.13 36.48 36.51
Football 24 33.67 33.45 33.82 33.88 33.89 33.96 33.85 33.94 34.00 33.85 34.02
28 32.27 32.35 32.57 32.64 32.60 32.32 32.59 32.42 32.54 32.64 32.69
40 27.06 26.72 26.92 26.95 26.93 27.11 27.07 27.14 26.99 26.97 26.99
16 38.33 38.52 39.51 39.54 39.53 39.96 39.59 40.12 40.35 40.54 40.57
Mobile 24 34.92 35.05 35.46 35.48 35.49 35.69 35.67 35.85 35.67 35.85 35.84
28 32.41 32.37 32.68 32.70 32.69 32.74 32.78 32.79 32.76 32.83 32.82
40 24.26 24.23 24.27 24.27 24.27 24.28 24.28 24.28 24.28 24.28 24.29
16 38.29 38.39 38.95 38.99 38.95 39.01 39.86 40.72 39.48 41.46 41.42
Paris 24 35.27 35.74 36.05 36.10 36.06 36.17 36.65 36.63 36.25 36.66 36.75
28 33.31 32.84 33.33 33.36 33.33 33.64 34.00 34.16 33.62 34.04 34.12
40 25.36 25.45 25.36 25.37 25.36 25.38 25.60 25.59 25.39 25.60 25.60
16 43.26 43.28 44.07 44.11 44.08 44.20 44.03 44.17 44.24 44.10 44.26
Suzie 24 38.99 38.81 39.16 39.18 39.17 39.23 39.15 39.27 39.36 39.25 39.28
28 36.72 36.81 36.94 36.96 36.95 37.00 37.04 37.07 37.02 37.07 37.06
40 30.58 30.53 30.57 30.58 30.59 30.58 30.60 30.61 30.59 30.61 30.61
Average 33.78 33.72 34.11 34.14 34.12 34.21 34.33 34.45 34.29 34.51 34.55
PLR = 10%, PSNR (dB)
Sequence QP BMA [12] STBMA STBMA +OBMC STBMA +PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 25.45 24.95 25.62 25.78 25.65 26.03 25.74 26.15 26.02 25.57 26.14
Football 24 24.88 24.66 25.20 25.16 25.20 25.46 24.80 25.47 25.43 24.66 25.53
28 24.28 23.82 23.97 24.40 23.97 24.94 24.36 25.04 24.49 24.23 24.54
40 22.91 22.59 23.37 23.53 23.36 23.60 23.06 23.74 23.59 22.62 23.62
16 30.25 30.82 32.14 32.21 32.18 32.48 33.00 33.32 32.23 33.03 33.05
Mobile 24 29.18 29.69 30.65 30.67 30.69 30.96 31.46 31.62 30.78 31.56 31.39
28 28.44 28.72 29.46 29.47 29.47 29.56 29.99 29.95 29.44 29.64 29.75
40 23.27 23.33 23.47 23.50 23.48 23.55 23.56 23.57 23.53 23.53 23.54
Paris 16 30.43 31.16 32.31 32.36 32.32 31.77 32.87 32.93 32.17 33.53 33.56
24 28.95 29.71 30.76 30.85 30.78 30.00 30.72 31.04 31.00 31.89 31.90
28 28.53 29.16 29.98 30.03 29.99 29.74 30.27 30.72 30.05 30.98 30.83
40 23.91 24.17 24.55 24.62 24.57 24.44 24.67 24.71 24.52 24.80 24.78
16 36.68 36.30 37.51 37.62 37.54 37.66 37.28 37.95 37.91 37.56 37.99
Suzie 24 35.25 34.57 35.53 35.56 35.54 35.65 35.51 36.01 35.53 35.44 36.04
28 33.75 33.38 34.16 34.21 34.18 34.19 34.07 34.51 34.19 34.10 34.47
40 29.66 29.49 29.62 29.64 29.63 29.53 29.45 29.63 29.50 29.42 29.59
Average 28.49 28.53 29.27 29.35 29.28 29.35 29.43 29.78 29.40 29.54 29.80
PLR = 20%, PSNR (dB)
Sequence QP BMA [12] STBMA STBMA +OBMC STBMA +PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 23.22 22.90 23.07 23.11 23.10 23.96 23.46 23.90 23.90 23.70 23.92
Football 24 23.13 22.81 23.00 23.13 23.03 23.71 23.27 23.73 23.45 23.17 23.61
28 22.60 21.92 22.35 22.46 22.36 23.14 22.91 23.14 22.75 22.75 22.81
40 22.16 21.73 21.98 22.12 21.99 22.24 22.25 22.63 22.00 22.10 22.32
16 27.19 27.72 29.27 29.27 29.28 29.58 29.61 30.34 29.80 30.06 30.28
Mobile 24 26.54 27.24 28.28 28.40 28.35 28.66 28.58 29.06 28.83 28.91 29.23
28 26.09 26.63 27.35 27.52 27.46 27.76 27.60 28.21 27.73 27.79 28.01
40 22.37 22.37 22.82 22.88 22.83 22.98 22.81 22.91 22.99 22.82 22.94
16 29.36 29.71 30.76 30.91 30.77 30.78 31.81 32.26 31.18 32.15 32.31
Paris 24 28.11 28.30 29.53 29.61 29.47 29.50 30.40 30.60 29.91 30.98 31.04
28 27.49 27.83 28.51 28.61 28.52 28.58 29.30 29.72 28.80 29.77 29.81
40 23.37 23.52 23.83 23.92 23.85 23.31 24.19 23.92 23.37 24.36 23.92
16 34.99 34.98 36.14 36.17 36.16 36.57 34.86 36.31 36.64 35.14 36.41
Suzie 24 33.78 33.70 34.53 34.74 34.54 34.82 33.73 34.80 34.86 33.72 34.79
28 33.05 32.81 33.42 33.48 33.44 33.67 32.61 33.59 33.71 32.60 33.57
40 29.05 28.74 29.02 29.09 29.04 29.00 28.73 28.89 28.98 28.64 28.80
Average 27.03 27.06 27.74 27.84 27.76 28.02 27.88 28.38 28.06 28.04 28.36
TABLE III
Average PSNR Results of Each CIF Test Sequence Using Different Methods
PLR = 5%, PSNR (dB)
Sequence QP BMA [12] STBMA STBMA +OBMC STBMA+PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 38.40 38.59 39.89 39.99 40.11 40.21 39.97 40.24 40.24 40.25 40.29
Foreman 24 36.40 36.33 37.05 37.04 37.10 37.17 37.13 37.15 37.17 37.26 37.30
28 34.95 34.79 35.38 35.39 35.40 35.50 35.31 35.56 35.50 35.25 35.45
40 29.74 29.73 29.93 29.93 29.95 29.95 29.99 30.07 29.95 30.00 30.08
16 34.93 34.70 38.04 38.10 38.11 38.39 38.04 38.79 38.65 38.57 38.97
Mobile 24 32.69 32.93 34.48 34.52 34.59 34.81 34.78 34.99 35.04 35.01 35.25
28 31.30 31.35 32.62 32.58 32.65 32.81 32.78 32.98 32.92 32.96 32.97
40 24.71 24.72 25.06 25.06 25.06 25.07 25.04 25.10 25.09 25.11 25.13
16 36.46 35.88 37.46 37.61 37.49 38.71 38.24 38.69 39.11 38.75 39.11
Flower 24 34.34 33.95 35.10 35.21 25.15 35.84 35.43 25.86 36.13 35.82 36.13
28 32.58 32.12 32.98 33.06 33.01 33.57 33.37 33.62 33.55 33.49 33.60
40 24.88 24.82 24.92 24.94 24.93 25.00 25.01 25.03 24.97 25.00 24.98
Average 32.62 32.49 33.58 33.62 32.80 33.92 33.76 33.18 34.03 33.96 34.11
PLR = 10%, PSNR (dB)
Sequence QP BMA [12] STBMA STBMA +OBMC STBMA+PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 31.33 30.94 31.80 31.82 31.90 32.70 32.53 33.44 32.44 31.51 32.45
Foreman 24 30.45 29.90 31.03 31.05 31.14 31.63 31.70 32.33 31.30 31.39 31.83
28 29.70 29.47 30.56 30.59 30.67 30.99 30.86 31.58 30.89 30.91 31.30
40 26.64 26.57 27.36 27.38 27.42 27.44 27.56 27.89 27.44 27.55 27.83
16 26.59 26.34 28.84 28.98 28.96 29.22 29.45 29.85 29.50 29.93 29.89
Mobile 24 25.95 25.63 28.03 28.15 28.13 28.48 28.63 29.05 28.59 29.06 29.07
28 25.46 25.19 27.53 27.65 27.59 27.76 28.02 28.28 27.91 28.46 28.41
40 22.11 22.28 23.29 23.36 23.31 23.48 23.47 23.63 23.47 23.52 23.58
16 26.66 26.14 28.10 28.35 28.21 29.04 28.73 29.22 29.83 29.12 29.90
Flower 24 26.33 25.85 28.02 28.06 28.05 28.43 28.08 28.69 29.25 28.73 29.30
28 25.78 25.27 26.89 27.13 27.04 27.60 27.47 27.94 28.18 27.76 28.24
40 22.86 22.19 23.14 23.22 23.24 23.54 23.47 23.62 23.54 23.38 23.53
Average 26.66 26.31 27.88 27.98 27.97 28.36 28.33 28.79 28.53 28.44 28.78
PLR = 20%, PSNR (dB)
Sequence QP BMA [12] STBMA STBMA +OBMC STBMA+PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 29.13 28.90 30.21 30.22 30.25 30.60 29.63 30.91 30.48 29.64 30.66
Foreman 24 28.79 28.38 29.57 29.60 29.63 29.83 29.28 30.27 29.91 29.04 29.92
28 28.44 27.87 29.27 29.26 29.32 29.25 29.14 29.76 29.43 29.02 29.79
40 25.83 25.72 26.55 26.62 26.64 26.61 26.39 26.96 26.63 26.18 26.86
16 24.39 24.22 26.84 26.95 26.87 27.13 27.11 27.72 27.51 27.65 27.93
Mobile 24 23.88 23.67 26.28 26.35 26.33 26.34 26.32 26.81 26.78 26.91 27.18
28 23.44 23.28 25.65 25.73 25.68 25.70 25.70 26.32 26.01 26.30 26.50
40 20.93 21.24 22.40 22.49 22.44 22.24 22.25 22.55 22.35 22.42 22.52
16 24.66 23.88 26.03 26.28 26.13 26.76 26.26 26.81 27.40 26.70 27.46
Flower 24 24.30 23.64 25.70 25.91 25.79 26.19 25.71 26.30 26.97 26.19 27.05
28 23.84 23.25 24.91 25.18 24.94 25.64 25.28 25.85 26.12 25.65 26.27
40 21.82 21.14 22.27 22.57 22.29 22.64 22.46 22.65 22.92 22.59 22.97
Average 24.95 24.60 26.31 26.43 26.36 26.58 26.29 26.91 26.88 26.52 27.09
Define $E_{t-1}$ to be the extended motion aligned block in the closest previous frame $X_{t-1}$, i.e., $E_{t-1}\subset X_{t-1}$, and define $e_{t-1}(k,l)$ to be an arbitrary pixel within $E_{t-1}$, i.e., $e_{t-1}(k,l)\in E_{t-1}$. As shown in Fig. 5, for each corrupted pixel $x_t(k,l)$, the corresponding motion aligned pixel $e_{t-1}(k,l)$ in the extended block is first found, and then the corresponding pixel $x_{t-2}(k,l)$ in the second closest reconstructed frame is also found by the same MV. Apparently, $e_{t-1}(k,l)$ can be regressed by the pixels within a square neighborhood centered at $x_{t-2}(k,l)$ as

$$\hat{e}_{t-1}(k,l)=\Gamma_{k+d_y,\,l+d_x,\,R}\,X_{t-2}\,\boldsymbol{\beta}^T \qquad (10)$$

where $\boldsymbol{\beta}$ represents the AR coefficients derived under the temporal continuity constraint. The derived coefficients $\boldsymbol{\beta}$ are then utilized to restore the corrupted pixel $x_t(k,l)$. Apparently, the solution of $\boldsymbol{\beta}$ should be the one that satisfies

$$\hat{\boldsymbol{\beta}}=\arg\min_{\boldsymbol{\beta}}\sum_{(k,l)\in E_{t-1}}\bigl(e_{t-1}(k,l)-\hat{e}_{t-1}(k,l)\bigr)^2. \qquad (11)$$
However, this imposes the same probabilistic confidence on each training sample, which limits the accuracy of the derived AR coefficients. To tackle this problem, we assign an appropriate probabilistic confidence to each sample within the extended block.
Fig. 8. Error concealment results of the eighth frame over Foreman (CIF) at the PLR of 10%. (a) Original image. (b) Corrupted image. (c) Concealed
image using BMA (34.221 dB). (d) Concealed image using STBMA (35.560 dB). (e) Concealed image using STBMA+PDE (35.568 dB). (f) Concealed image
using STBMA+OBMC (35.569 dB). (g) Concealed image using the proposed AR model under spatial continuity constraint (35.929 dB). (h) Concealed image
using the proposed AR model under temporal continuity constraint (36.292 dB). (i) Concealed image using the proposed AR model by combining spatial and
temporal continuity constraints (36.308 dB).
That is to say, the optimal $\boldsymbol{\beta}$ should be

$$\hat{\boldsymbol{\beta}}=\arg\min_{\boldsymbol{\beta}}\sum_{(k,l)\in E_{t-1}}\Bigl(\bigl(e_{t-1}(k,l)-\hat{e}_{t-1}(k,l)\bigr)\,w_\beta(k,l)\Bigr)^2 \qquad (12)$$

where $w_\beta(k,l)$ represents the probabilistic confidence of each training sample $e_{t-1}(k,l)$. For the samples located within the corresponding motion aligned block, the probabilistic confidence is set to be one; for the samples located in the extended regions, the probabilistic confidence is defined to be inversely proportional to the distance toward the center of the extended block. More specifically, the probabilistic confidence of each sample can be formulated as (13) below. Here $M$ represents the extended range and $N$ represents the width and height of the corrupted block.
Fig. 6 depicts the probabilistic confidence magnitudes within an extended 4 × 4 block as an example, with $M=3$ and $N=4$. The sixteen black pixels correspond to the motion aligned block, and all the remaining gray pixels correspond to the extended region. The gray value is inversely proportional to the probabilistic confidence of the corresponding sample.
According to the weighted LS, the closed-form solution of $\boldsymbol{\beta}$ should be

$$\boldsymbol{\beta}^T=\left(\sum_{(k,l)\in E_{t-1}}\bigl(P_\beta(k,l)\odot\Gamma_{k+d_y,\,l+d_x,\,R}X_{t-2}\bigr)^T\,\Gamma_{k+d_y,\,l+d_x,\,R}X_{t-2}\right)^{-1}\sum_{(k,l)\in E_{t-1}}e_{t-1}(k,l)\bigl(\Gamma_{k+d_y,\,l+d_x,\,R}X_{t-2}\bigr)^T w_\beta(k,l) \qquad (14)$$

where $P_\beta(k,l)=\underbrace{[w_\beta(k,l),\,w_\beta(k,l),\,\ldots,\,w_\beta(k,l)]}_{(2R+1)^2}$, and the operator $\odot$ represents element-by-element multiplication of two vectors.
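Mirroring the spatial case, the temporal training step can be sketched as follows (again with our own conventions; `temporal_weights`, sketched after (13) below, builds the confidence map, and frame-edge padding is omitted):

```python
def train_temporal_coeffs(prev_frame, prev2_frame, top, left, mv,
                          N=4, M=4, R=1):
    """Derive beta under the temporal continuity constraint: every pixel
    of the extended motion-aligned block in X_{t-1} is regressed from
    its motion-aligned patch in X_{t-2}, weighted per (13)."""
    dy, dx = mv
    w = temporal_weights(N, M)            # (N+2M) x (N+2M) confidence map
    targets, patches, ws = [], [], []
    for k in range(N + 2 * M):
        for l in range(N + 2 * M):
            i = top + dy - M + k          # pixel e_{t-1}(k, l) in X_{t-1}
            j = left + dx - M + l
            targets.append(prev_frame[i, j])
            patches.append(extract_patch(prev2_frame, i + dy, j + dx, R))
            ws.append(w[k, l])
    return solve_ar_coeffs(targets, patches, ws)
```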
Fig. 9. Error concealment results of the 50th frame over Mobile (CIF) at the PLR of 20%. (a) Original image. (b) Corrupted image. (c) Concealed image
using BMA (29.627 dB). (d) Concealed image using STBMA (29.985 dB). (e) Concealed image using STBMA+PDE (29.997 dB). (f) Concealed image using
STBMA+OBMC (30.066 dB). (g) Concealed image using the proposed AR model under spatial continuity constraint (31.475 dB). (h) Concealed image using
the proposed AR model under temporal continuity constraint (32.516 dB). (i) Concealed image using the proposed AR model by combining spatial and
temporal continuity constraints (32.610 dB).
$$w_\beta(k,l)=\frac{1}{S}\begin{cases}1, & M\le k,\,l<M+N\\[4pt]1-\dfrac{\max\Bigl(\bigl|k-M-\frac{N-1}{2}\bigr|,\ \bigl|l-M-\frac{N-1}{2}\bigr|\Bigr)-\frac{N-1}{2}}{M+1}, & \text{otherwise}\end{cases} \qquad (13)$$

for $0\le k,\,l<N+2M$, where the normalizing factor $S$ is the sum of the unnormalized confidences over the whole extended block, i.e., $S=N^2$ plus the sum of the linearly decaying terms over the extended region.
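A sketch of this confidence map, following our compact reading of (13) above (weight one inside the motion-aligned block, linear decay with the Chebyshev distance outside it; the exact per-quadrant form in the original typesetting may differ):

```python
def temporal_weights(N, M):
    """Confidence map over the (N+2M) x (N+2M) extended block, cf. (13)."""
    size = N + 2 * M
    c = M + (N - 1) / 2.0                          # center of the block
    k = np.arange(size, dtype=float)
    cheb = np.maximum(np.abs(k[:, None] - c), np.abs(k[None, :] - c))
    w = 1.0 - (cheb - (N - 1) / 2.0) / (M + 1)     # linear decay outside
    w[cheb <= (N - 1) / 2.0] = 1.0                 # motion-aligned block
    return w / w.sum()                             # normalize by S
```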
TABLE IV
Average PSNR Results of Each Test Sequence Excluding Error Propagation
Sequence QP PSNR (dB)
BMA [12] STBMA STBMA+OBMC STBMA+PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 38.15 37.50 38.29 38.39 38.33 38.49 38.20 38.54 38.49 38.25 38.56
Football 24 34.49 34.02 34.64 34.72 34.71 34.69 34.59 34.76 34.80 34.67 34.89
28 32.59 32.25 32.78 32.84 32.81 32.82 32.76 32.88 32.93 32.87 33.00
QCIF 40 27.06 26.94 27.11 27.13 27.13 27.12 27.11 27.14 27.16 27.15 27.18
16 40.29 40.08 40.94 40.98 40.97 41.08 41.37 41.33 41.30 41.60 41.61
Mobile 24 35.27 34.99 35.67 35.71 35.70 35.73 35.88 35.87 35.81 35.97 35.96
28 32.58 32.41 32.78 32.79 32.79 32.84 32.91 32.91 32.87 32.96 32.95
40 24.50 24.50 24.53 24.53 24.53 24.54 24.54 24.55 24.54 24.54 24.55
16 41.01 41.32 41.85 41.92 41.87 41.70 42.29 42.20 42.12 42.66 42.58
Paris 24 36.40 36.50 36.95 36.99 36.98 36.76 37.02 36.98 37.05 37.32 37.29
28 33.88 33.99 34.19 34.22 34.20 34.07 34.29 34.26 34.27 34.46 34.43
40 25.69 25.70 25.74 25.75 25.75 25.74 25.79 25.77 25.74 25.79 25.79
16 43.68 43.60 44.11 44.15 44.13 44.22 44.27 44.42 44.38 44.28 44.46
Suzie 24 39.21 39.22 39.52 39.55 39.53 39.49 39.46 39.57 39.62 39.54 39.66
28 37.10 37.07 37.28 37.29 37.28 37.30 37.28 37.37 37.35 37.31 37.40
40 31.00 30.93 31.02 31.02 31.03 31.00 31.01 31.02 31.01 31.01 31.02
Average 34.56 34.44 34.84 34.87 34.86 34.85 34.92 34.97 34.97 35.02 35.08
16 43.53 43.38 43.92 43.92 43.92 44.01 43.86 44.03 44.03 43.86 44.05
Foreman 24 38.69 38.51 38.90 38.91 38.96 38.96 38.90 38.99 38.95 38.89 38.95
28 36.61 36.52 36.76 36.77 36.77 36.78 36.78 36.83 36.80 36.76 36.83
40 30.31 30.29 30.33 30.33 30.33 30.34 30.36 30.37 30.34 30.36 30.37
16 41.70 41.56 42.61 42.63 42.63 42.74 42.72 42.73 42.77 42.98 42.99
CIF Mobile 24 36.32 36.12 36.91 36.93 36.92 36.89 37.05 37.06 37.02 37.15 37.16
28 33.59 33.46 33.99 34.00 34.00 34.02 34.12 34.14 34.07 34.16 34.17
40 25.22 25.22 25.29 25.30 25.30 25.30 25.32 25.34 25.30 25.32 25.32
16 42.86 42.73 43.54 43.59 43.57 43.56 43.60 43.64 43.86 43.85 43.91
Flower 24 37.34 37.22 37.86 37.89 37.88 37.86 37.89 37.92 38.06 38.05 38.10
28 34.54 34.38 34.89 34.91 34.90 34.92 34.92 34.96 35.02 35.02 35.05
40 25.64 25.61 25.72 25.72 25.72 25.72 25.73 25.74 25.74 25.74 25.75
Average 35.53 35.42 35.89 35.91 35.91 35.93 35.94 35.98 36.00 36.01 36.05
After having obtained the AR coefficients $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$, we merge the two regression results as

$$\hat{x}_t(i,j)=\tau\,\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\boldsymbol{\alpha}^T+(1-\tau)\,\Gamma_{i+d_y,\,j+d_x,\,R}X_{t-1}\boldsymbol{\beta}^T \qquad (15)$$

where $\tau$ is the merging factor, computed as

$$\tau=\begin{cases}1, & \text{if } \max\bigl(\mathrm{abs}(mv[0]),\,\mathrm{abs}(mv[1])\bigr)\ge 16\\[2pt]0.5, & \text{if } \max\bigl(\mathrm{abs}(mv[0]),\,\mathrm{abs}(mv[1])\bigr)=0\\[2pt]\max\bigl(\mathrm{abs}(mv[0]),\,\mathrm{abs}(mv[1])\bigr)/16, & \text{otherwise.}\end{cases} \qquad (16)$$

Here $mv[0]$ and $mv[1]$ represent the horizontal and vertical components of the MV for the corrupted block selected by BMA or STBMA with quarter-pel accuracy.
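A direct transcription of (15) and (16), with the relational and division symbols in (16) read as reconstructed above (names are ours):

```python
def merge_factor(mv_qpel):
    """Merging factor tau per (16); mv_qpel is the quarter-pel MV."""
    m = max(abs(mv_qpel[0]), abs(mv_qpel[1]))
    if m >= 16:
        return 1.0       # large motion: rely on the spatially trained model
    if m == 0:
        return 0.5       # no motion: average the two regressions
    return m / 16.0

def conceal_block(prev_frame, top, left, mv, alpha, beta, tau, N=4, R=1):
    """Restore the corrupted block per (15): blend the two regressions
    of each pixel's motion-aligned patch."""
    dy, dx = mv
    out = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            patch = extract_patch(prev_frame, top + i + dy, left + j + dx, R)
            out[i, j] = tau * (patch @ alpha) + (1 - tau) * (patch @ beta)
    return out
```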
It is noted that we pad the corresponding pixels outside the boundary when the training area (in the current and/or reference frame) is close to the boundary of the frame. The padded pixels (if necessary) as well as the pixels close to the boundary can be simultaneously utilized during the coefficient derivation of the AR model. If there is no solution to (9) or (14), we use the traditional methods (BMA, the method in [12], or STBMA) to restore the missing blocks accordingly.
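The paper does not spell out the padding rule; replicate padding is one plausible realization:

```python
def pad_frame(frame, R=1, M=4):
    """Replicate border pixels so that training patches and the extended
    block stay inside the frame (one plausible padding choice; the exact
    rule is not specified in the paper)."""
    return np.pad(frame, R + M, mode='edge')
```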
V. Experimental Results and Analysis
In this section, various experiments were conducted to
verify the performance of the proposed AR model based
error concealment scheme. H.264/AVC reference software JM
10.0 is utilized to evaluate the proposed algorithm; however,
it should be noted that the proposed algorithm can be ex-
tended to any block-based video compression scheme. We
compare the performance of the proposed algorithm with the
inter-frame error concealment schemes implemented in the
reference software, which are based on the classical BMA
[18], the method in [12], and STBMA [28]. We test the performance on seven video sequences: Mobile, Paris, Suzie, and Football in QCIF format, and Foreman, Mobile, and Flower in CIF format. All the test sequences are encoded at 30 Hz. The first 120 frames of each test sequence are encoded, and no B frames are utilized. Slice mode is enabled and no intra mode is used in P frames. Each row of MBs composes a slice and is transmitted in a separate packet. Packet loss rates (PLR) of 5%, 10%, and 20% [37] are tested in the experiments. Quantization parameters (QP) are set to 16, 24, 28, and 40, respectively.
In all the following experiments, parameter R is set to 1, and parameter M is set to 4 and 8 for QCIF and CIF sequences, respectively.
In the experiments, we will show the effect of the proba-
bilistic confidence first, and then we will give the comparisons
of the proposed algorithm in terms of objective and subjective
criteria, and finally we will present the computational com-
plexity analysis.
A. Probability Confidence Effects
In this subsection, we provide comparisons of regression
results under spatial and temporal continuity constraints as
well as the merged results with and without probabilistic
confidence, respectively. The encoding group of picture (GOP)
is set to be IPPP..., where I frames are encoded every 16
frames. The transmission errors are assumed to only occur in P
frames and the PLR is 10%. The MV is estimated by BMA and
STBMA. The average PSNR results of the first 120 restored
frames within each test sequence are provided in Table I. It is
noted that BMA+AR and STBMA+AR represent that BMA
and STBMA are utilized to obtain the MV of the corrupted
block before AR model is applied, respectively. “Spatial”
and “temporal” represent that AR coefficients are derived
under spatial continuity constraint and temporal continuity
constraint, respectively. And “combined” represents the com-
bined results of “spatial” and “temporal.” “Uniform weight”
represents that equal probabilistic confidence is assigned to
all the training samples. “Proposed weight” represents that
the proposed probabilistic confidence scheme is applied to
all the training samples. It can be observed that for all the test sequences, the PSNR results improve when the proposed probabilistic confidence scheme is applied, except for a loss of about 0.09 dB for Mobile (QCIF) when QP is 28. Especially for Mobile (CIF), when QP is set to
be 24 and under BMA+AR, the PSNR gains are 0.11 dB,
0.12 dB and 0.19 dB for the spatial, temporal and combined
methods, respectively. And for Paris (QCIF), when QP is set
to be 24 and under BMA+AR, the PSNR gains are 0.16 dB,
0.13 dB and 0.19 dB for the spatial, temporal and combined
methods, respectively. This mainly benefits from the fact
that by assigning proper probabilistic confidence to different
training samples, the accuracy of the derived AR coefficients
can be improved.
B. Objective and Subjective Evaluation
In this subsection, we will give the subjective and objec-
tive comparison results. BMA, the passive error concealment
in [12], STBMA+PDE, STBMA+ overlapped block motion
compensation (OBMC) [38] and the proposed AR model with
BMA and STBMA are utilized to restore the corrupted frames
within each test sequence.
Tables II and III present the average PSNR results of each
test sequence using different methods. The encoding GOP is
set to be IPPP...,where I frames are encoded every 16 frames.
The transmission errors are assumed to only occur in P frames.
It is observed that [12] and BMA achieve similar performance
for QCIF sequences, but [12] has some degradation for CIF
sequences. STBMA outperforms BMA in most cases. This is
because more spatial and temporal information is utilized in
STBMA during block matching. Applying PDE and OBMC to STBMA, the performance can be further improved; however, the performance gain is modest.
When applying the proposed AR model (BMA+AR or
STBMA+AR), the error concealment performance is able to
get a significant improvement. This is due to the fact that
the AR model is able to adaptively adjust the coefficients ac-
cording to the spatio-temporal coherence information. Another
observation is that under the BMA+AR combination, the proposed AR model is able to outperform STBMA. This strongly suggests that the proposed AR model is able to remedy the degradation of BMA results caused by inaccurate MVs. This is because
although STBMA is able to achieve more accurate MVs, it
does not guarantee better replenishing results for anisotropic
regions, due to the fixed interpolation taps along horizontal
and vertical directions. In contrast, the proposed AR model can
perform the interpolation along arbitrary directions by properly
tuning the coefficients. In addition, we also found that the com-
bined results achieve better performance than those just under
spatial or temporal continuity constraint. This is mainly at-
tributed to the fact that combination operation is of higher abil-
ity to capture the variation properties of local image structure.
Fig. 7 shows the PSNR performance of Mobile (CIF)
and Flower (CIF) versus the frame number. It is noted
that both BMA+AR and STBMA+AR represent the com-
bined results. We can see that STBMA has better perfor-
mance than BMA and [12], while with the AR model (both
BMA+AR and STBMA+AR), the performance can be further
improved. Especially for the frames around the tenth frame
in Fig. 7(a), the PSNR gains achieved by the proposed AR
model (BMA+AR and STBMA+AR) are more than 2 dB
compared with other competing methods (e.g., the BMA, [12],
STBMA, STBMA+OBMC and STBMA+PDE). In addition,
for the frames around the 105th frame in Fig. 7(b), the
PSNR gains achieved by the proposed AR model (BMA+AR
and STBMA+AR) are more than 1 dB compared with other
competing methods.
To better represent the superior performance of the proposed
AR model, we give the subjective quality comparisons for
Foreman (CIF) and Mobile (CIF) in Figs. 8 and 9, respectively.
Here the AR model is applied after the MV is found via
STBMA. It is noted that there are consecutive slice errors in
Fig. 8. For the corrupted MBs in the upper row, only the MVs
of their upper neighboring MBs and the zero MVs are utilized
to generate the optimal MVs in BMA or STBMA. Similarly,
for the corrupted MBs in the lower row in the consecutive
slice errors in Fig. 8, only the MVs of their lower neighboring
MBs and the zero MVs are utilized to generate the optimal
MVs in BMA or STBMA. In Fig. 8, for the BMA, STBMA, STBMA+PDE, and STBMA+OBMC methods, we can easily observe the blocking artifacts caused by motion, as shown in the regions surrounded by the red ellipse. In the replenished
image using the proposed AR model under spatial continuity
constraint, the blocking artifact is weakened, although it can
still be observed. And in the replenished images using the
proposed AR model under temporal continuity constraint
and combining spatial and temporal continuity constraints,
the blocking artifacts are completely removed. In Fig. 9,
for the BMA, STBMA, STBMA+PDE, and STBMA+OBMC
TABLE V
Average PSNR Results of Each Test Sequence When Both I and P Frames Suffer Loss During Transmission
Sequence QP PSNR (dB)
BMA [12] STBMA STBMA+OBMC STBMA+PDE BMA+AR STBMA+AR
Spatial Temporal Combined Spatial Temporal Combined
16 23.40 22.17 23.45 23.58 23.54 23.21 22.96 23.55 23.24 23.09 23.60
Football 24 23.30 22.09 23.40 23.42 23.41 22.79 22.70 23.43 22.84 22.85 23.45
28 22.89 21.76 22.37 22.72 22.73 22.67 22.57 22.84 22.63 22.61 22.88
40 21.93 21.84 22.67 22.71 22.79 22.38 22.31 22.81 22.63 22.63 22.84
16 22.31 22.40 22.99 22.97 22.99 22.94 23.15 23.18 23.03 23.20 23.25
Mobile 24 22.10 22.37 22.72 22.74 22.75 22.66 22.80 22.81 22.64 22.73 22.83
28 22.01 22.22 22.51 22.52 22.51 22.44 22.52 22.57 22.45 22.49 22.53
QCIF 40 19.67 19.70 19.81 19.83 19.85 19.77 19.74 19.91 19.78 19.75 19.98
16 23.01 23.17 23.39 23.43 23.41 23.44 23.44 23.68 23.46 23.46 23.68
Paris 24 22.35 22.76 23.09 23.13 23.15 22.89 23.04 23.19 23.14 23.41 23.43
28 22.44 22.49 22.82 22.85 22.84 22.85 23.02 23.07 22.88 23.02 23.07
40 20.48 20.47 20.67 20.66 20.66 20.62 20.52 20.67 20.63 20.56 20.69
16 30.04 30.20 30.43 30.50 30.47 30.21 30.27 30.62 30.27 30.35 30.71
Suzie 24 29.41 29.31 29.45 29.49 29.50 29.35 29.19 29.52 29.42 29.16 29.56
28 28.78 29.04 28.96 29.00 28.99 28.96 28.91 29.14 28.98 28.90 29.19
40 26.22 26.09 26.24 26.25 26.25 26.11 26.06 26.24 26.14 26.09 26.28
Average 23.77 23.63 24.06 24.11 24.12 23.96 23.95 24.20 24.01 24.02 24.25
16 27.74 27.84 27.70 27.71 27.71 27.74 27.80 27.98 27.70 27.56 27.77
Foreman 24 27.37 27.46 27.01 27.01 27.01 27.37 27.58 27.69 27.01 27.34 27.28
28 26.89 26.90 26.95 26.95 26.96 26.89 27.10 27.17 26.95 27.09 27.08
40 25.35 25.36 25.27 25.28 25.28 25.35 25.50 25.60 25.27 25.50 25.53
16 23.69 23.64 23.82 23.82 23.83 23.69 23.95 23.95 23.82 24.06 24.00
CIF Mobile 24 23.42 23.23 23.43 23.43 23.44 23.42 23.63 23.66 23.43 23.65 23.62
28 23.21 23.13 23.30 23.30 23.30 23.21 23.49 23.48 23.30 23.53 23.48
40 21.14 21.13 21.10 21.11 21.11 21.14 21.24 21.26 21.10 21.17 21.19
16 29.04 28.76 29.83 29.84 29.84 29.04 28.98 29.09 29.83 29.39 29.88
Flower 24 28.43 28.25 29.25 29.26 29.15 28.43 28.42 28.69 29.25 28.91 29.30
28 27.60 27.52 28.18 28.19 28.19 27.60 27.67 27.94 28.18 27.88 28.24
40 23.54 23.21 23.54 23.55 23.54 23.54 23.53 23.67 23.54 23.37 23.53
Average 25.62 25.54 25.78 25.79 25.78 25.62 25.74 25.85 25.78 25.79 25.91
methods, we cannot observe the digit “1” in “31,” as shown in the regions surrounded by the red circle. However, in the images replenished by the proposed AR model, the digit “1” in “31” can be clearly observed.
We also conducted another experiment excluding error propagation effects. The encoding GOP is set to be IPPIPPI..., and we assume the error only occurs at the second P frame of each GOP, with the PLR being 10%. The experimental results are provided in Table IV, from which we can observe that the proposed AR model still outperforms the other comparing methods on average.
Table V further tabulates the PSNR results of each test
sequence when both I and P frames suffer loss during trans-
mission. The PLR is 10% and the encoding GOP is IPPP...,
where I frames are inserted every 16 frames. The first I frame
is assumed to be error free. To give a fair comparison of
different inter frame concealment methods, the errors in I
frames are replenished using the weighted pixel averaging
method in the reference software JM10.0, and the errors in
P frames are restored utilizing the proposed AR models and
other competing methods. From the experimental results we can observe that the performance of all the inter frame error concealment methods drops dramatically. This is because the badly concealed MBs in I frames greatly degrade the quality of the following P frames. However, the proposed AR model still performs slightly better than the other competing methods, although the performance gain is rather small compared to the case when I frames are error free.
C. Computational Complexity Analysis
Most of the computational complexity of the proposed AR model based error concealment scheme is concentrated in the calculation of the AR coefficients. Take (9) for example: the AR coefficient derivation involves matrix multiplication and matrix inversion. It should be noted that the dimension of the matrix in (9) depends on the range of the AR model; for R = 1, the model has (2R+1)^2 = 9 coefficients, so (9) involves inverting a 9 × 9 matrix. The smaller the range of the AR model, the lower the computational complexity.
Besides, there are many fast algorithms, e.g., [29], to speed up the calculation of AR coefficients. In Table VI, we examine the time consumed in decoding the first 120 frames of each test sequence using different error concealment methods (the encoding GOP is IPPP..., with I frames being inserted every sixteen frames) on a typical computer (2.5 GHz Intel Dual Core, 2 GB memory). Except for BMA, which has the lowest computational complexity, [12] consumes less time than the other comparing methods. STBMA and STBMA+OBMC have similar computational complexity. STBMA+PDE takes longer than the former four methods due to the iterative operation in PDE.
TABLE VI
Average System Time (s) Using Different Methods
Columns (left to right): BMA; [12]; STBMA; STBMA+PDE; STBMA+OBMC;
BMA+AR (Spatial, Temporal, Combined); STBMA+AR (Spatial, Temporal, Combined)

QCIF
Football  QP 24   5%  0.953 1.028 0.982 1.077 0.984 0.999 1.029 1.105 1.045 1.076 1.087
                 10%  0.967 1.040 1.122 1.764 1.171 1.270 1.273 1.514 1.404 1.450 1.748
                 20%  0.983 1.084 1.326 2.480 1.388 1.435 1.544 1.997 1.779 1.905 2.355
          QP 28   5%  0.874 0.942 0.904 1.045 0.915 0.921 0.983 1.014 0.999 0.998 1.028
                 10%  0.890 0.981 1.107 1.686 1.124 1.170 1.179 1.388 1.356 1.389 1.622
                 20%  0.843 1.032 1.279 2.401 1.288 1.420 1.481 1.902 1.764 1.842 2.278
Mobile    QP 24   5%  1.029 1.146 1.061 1.170 1.092 1.114 1.193 1.255 1.119 1.078 1.155
                 10%  1.013 1.093 1.216 1.763 1.223 1.289 1.310 1.468 1.498 1.452 1.717
                 20%  1.013 1.130 1.357 2.404 1.419 1.476 1.529 1.933 1.857 1.826 2.232
          QP 28   5%  0.889 0.994 0.982 1.108 0.984 1.015 0.998 1.088 1.046 0.998 1.062
                 10%  0.920 1.036 1.138 1.670 1.154 1.184 1.217 1.467 1.419 1.388 1.621
                 20%  0.967 1.070 1.296 2.310 1.357 1.487 1.489 1.841 1.794 1.749 2.201
Paris     QP 24   5%  0.764 0.817 0.764 0.874 0.780 0.785 0.795 0.874 0.795 0.827 0.890
                 10%  0.654 0.816 0.843 1.372 0.847 0.998 1.031 1.311 1.092 1.169 1.405
                 20%  0.733 0.836 0.921 1.825 0.951 1.141 1.279 1.778 1.356 1.482 1.951
          QP 28   5%  0.671 0.756 0.702 0.827 0.716 0.774 0.699 0.843 0.778 0.796 0.857
                 10%  0.655 0.779 0.795 1.279 0.810 0.921 1.014 1.249 0.998 1.077 1.372
                 20%  0.686 0.798 0.797 1.731 0.905 1.269 1.279 1.652 1.297 1.467 1.918
Suzie     QP 24   5%  0.704 0.800 0.764 0.842 0.766 0.783 0.781 0.798 0.794 0.827 0.859
                 10%  0.749 0.827 0.889 1.311 0.894 0.999 1.046 1.279 1.092 1.202 1.436
                 20%  0.749 0.848 0.936 1.793 1.014 1.198 1.310 1.778 1.435 1.530 1.981
          QP 28   5%  0.655 0.739 0.703 0.796 0.704 0.781 0.779 0.794 0.758 0.765 0.842
                 10%  0.670 0.764 0.733 1.232 0.795 0.889 0.999 1.201 1.061 1.077 1.373
                 20%  0.671 0.787 0.874 1.653 0.920 1.155 1.201 1.687 1.357 1.435 1.902
Average (QCIF)        0.821 0.923 0.979 1.517 1.008 1.103 1.143 1.384 1.246 1.284 1.537

CIF
Foreman   QP 24   5%  2.838 3.211 3.043 3.448 3.012 3.151 3.292 3.510 3.308 3.400 3.558
                 10%  2.979 3.306 3.697 5.695 3.681 4.167 4.991 5.990 4.804 5.600 6.520
                 20%  2.978 3.457 4.133 8.004 4.306 4.976 6.677 8.394 6.243 7.597 9.281
          QP 28   5%  2.714 2.960 2.731 3.212 2.855 2.917 3.056 3.275 3.027 3.261 3.400
                 10%  2.745 3.190 3.369 5.273 3.400 3.933 4.883 5.804 4.462 5.211 6.396
                 20%  2.776 3.200 3.729 7.346 3.947 4.741 6.490 8.220 5.708 7.286 9.031
Mobile    QP 24   5%  3.821 4.654 3.916 4.415 3.916 4.055 4.071 4.276 4.181 4.228 4.461
                 10%  3.808 4.233 4.632 6.973 4.647 4.977 5.631 6.724 5.757 6.350 7.362
                 20%  3.807 4.203 5.228 9.296 5.398 5.711 7.082 8.939 7.224 8.330 10.063
          QP 28   5%  3.480 3.667 3.681 4.121 3.618 3.728 3.840 3.977 3.821 3.964 4.103
                 10%  3.541 3.832 4.305 6.645 4.371 4.742 5.368 6.443 5.476 5.975 7.051
                 20%  3.634 3.950 4.993 9.019 5.102 4.554 6.863 8.675 6.865 7.925 9.812
Flower    QP 24   5%  3.244 3.525 3.353 3.681 3.339 3.448 3.558 3.759 3.541 3.651 3.899
                 10%  3.229 3.498 3.743 5.522 3.821 4.338 5.272 6.209 4.881 5.773 6.833
                 20%  3.215 3.570 4.212 7.503 4.307 5.148 6.897 8.566 6.038 7.862 9.610
          QP 28   5%  3.025 3.221 3.043 3.448 3.089 3.260 3.336 3.542 3.290 3.541 3.667
                 10%  3.057 3.308 3.590 5.319 3.603 4.180 5.148 6.039 4.711 5.615 6.647
                 20%  3.104 3.397 3.993 7.036 4.072 4.945 6.726 8.456 5.804 7.706 9.392
Average (CIF)         3.222 3.577 3.855 5.886 3.916 4.276 5.177 6.155 4.952 5.738 6.727
The computational complexity of the proposed AR model is higher than that of the competing methods, but it remains acceptable, especially when only the spatial continuity constraint or the temporal continuity constraint is applied.
VI. Conclusion
In this paper, we developed an AR model based error concealment scheme for block-based packet video coding. For each corrupted block, we first derived the motion vector and then replenished each corrupted pixel, in a regression manner, as the weighted summation of the pixels within a square centered at the pixel indicated by the derived motion vector with integer-pel accuracy. To obtain better concealment results, we proposed two block-dependent AR coefficient derivation algorithms under spatial and temporal continuity constraints, respectively. We then combined the regression results generated by the two algorithms to form the ultimate concealment results. The simulation results demonstrate the superiority of the proposed scheme over other inter-frame concealment methods, with acceptable computational complexity.
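As a concrete illustration of the replenishment step summarized above, the following is a minimal sketch under assumed names (ref, cur, mv, coeffs, and the block geometry are ours, and frame-boundary clipping is omitted); it is not the paper's exact implementation.

import numpy as np

def replenish_block(ref, cur, mv, block_xy, coeffs, N=1, bs=16):
    """Rebuild each pixel of a corrupted bs x bs block in the current
    frame `cur` as the weighted sum of the (2N+1)x(2N+1) square in the
    reference frame `ref` centered at its motion-compensated position,
    using the integer-pel motion vector mv = (mvx, mvy) and previously
    derived AR coefficients `coeffs` (length (2N+1)^2)."""
    x0, y0 = block_xy
    mvx, mvy = mv
    for dy in range(bs):
        for dx in range(bs):
            ry, rx = y0 + dy + mvy, x0 + dx + mvx      # motion-aligned pixel
            patch = ref[ry - N:ry + N + 1, rx - N:rx + N + 1]
            cur[y0 + dy, x0 + dx] = np.dot(coeffs, patch.ravel())
    return cur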
References
[1] T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview
of the H.264/AVC video coding standard,” IEEE Trans. Circuits Syst.
Video Technol., vol. 13, no. 7, pp. 560–576, Jul. 2003.
[2] Coding of Moving Pictures and Associated Audio for Digital Storage
Media at up to About 1.5 Mbit/s—Part 2: Video, ISO/IEC 11172-2
(MPEG-1), Int. Standards Organization/Int. Electrotechnical Commis-
sion (ISO/IEC) JTC 1, Mar. 1993.
[3] Generic Coding of Moving Pictures and Associated Audio Information—
Part 2: Video, Rec. H.262 and ISO/IEC 13818-2 (MPEG-2 Video), Int.
Telecommunication Union-Telecommunication (ITU-T) and Int. Stan-
dards Organization/Int. Electrotechnical Commission (ISO/IEC) JTC 1,
Nov. 1994.
[4] Video Coding for Low Bit Rate Communication, Int. Telecommunication
Union-Telecommunication (ITU-T) Rec. H.263, 1995.
[5] Y. Wang, S. Wenger, J. Wen, and A. K. Katsaggelos, “Error resilient
video coding techniques,” IEEE Signal Process. Mag., vol. 17, no. 7,
pp. 61–82, Jul. 2000.
[6] Y. Wang and Q.-F. Zhu, “Error control and concealment for video
communication: A review,” Proc. IEEE, vol. 86, no. 5, pp. 974–997,
May 1998.
[7] Y. Wang, Q.-F. Zhu, and L. Shaw, “Maximally smooth image recovery
in transform coding,” IEEE Trans. Commun., vol. 41, no. 10, pp. 1544–
1551, Oct. 1993.
[8] W. Zhu, Y. Wang, and Q.-F. Zhu, “Second-order derivative-based
smoothness measure for error concealment in DCT-based codecs,” IEEE
Trans. Circuits Syst. Video Technol., vol. 8, no. 6, pp. 713–718, Oct.
1998.
[9] S. D. Rane, G. Sapiro, and M. Bertalmio, “Structure and texture filling-
in of missing image blocks in wireless transmission and compression,”
IEEE Trans. Image Process., vol. 12, no. 3, pp. 296–303, Mar. 2003.
[10] W. Y. Kung, C. S. Kim, and C. J. Kuo, “Spatial and temporal error
concealment techniques for video transmission over noisy channels,”
IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 7, pp. 789–802,
Jul. 2006.
[11] X. Li and M. Orchard, “Novel sequential error concealment techniques
using orientation adaptive interpolation,” IEEE Trans. Circuits Syst.
Video Technol., vol. 12, no. 10, pp. 857–864, Oct. 2002.
[12] P. Salama, N. B. Shroff, and E. J. Delp, “Error concealment in MPEG
video streams over ATM networks,” IEEE J. Sel. Areas Commun., vol.
18, no. 6, pp. 1129–1144, Jun. 2000.
[13] Y. C. Lee, Y. Altunbasak, and R. M. Mersereau, “Multiframe error con-
cealment for MPEG-coded video delivery over error-prone networks,”
IEEE Trans. Image Process., vol. 11, no. 11, pp. 1314–1331, Nov. 2002.
[14] D. Persson, T. Eriksson, and P. Hedelin, “Packet video error concealment
with Gaussian mixture models,” IEEE Trans. Image Process., vol. 17,
no. 2, pp. 145–154, Feb. 2008.
[15] D. Persson and T. Eriksson, “Mixture model and least squares based
packet video error concealment,” IEEE Trans. Image Process., vol. 18,
no. 5, pp. 1048–1054, May 2009.
[16] P. Haskell and D. Messerschmitt, “Resynchronization of motion com-
pensated video affected by ATM cell loss,” in Proc. IEEE ICASSP, vol.
3. May 1992, pp. 545–548.
[17] M. J. Chen, L. G. Chen, and R. M. Weng, “Error concealment of lost
motion vectors with overlapped motion compensation,” IEEE Trans.
Circuits Syst. Video Technol., vol. 7, no. 3, pp. 560–563, Jun. 1997.
[18] W. M. Lam, A. R. Reibman, and B. Liu, “Recovery of lost or erroneously
received motion vectors,” in Proc. IEEE Int. Conf. Acoust. Speech Signal
Process., vol. 3. Apr. 1993, pp. 417–420.
[19] S. Tsekeridou, F. A. Cheikh, M. Gabbouj, and I. Pitas, “Motion
field estimation by vector rational interpolation for error concealment
purposes,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., vol.
6. Mar. 1999, pp. 3397–3400.
[20] M. Al-Mualla, N. Canagarajah, and D. R. Bull, “Error concealment
using motion field interpolation,” in Proc. IEEE Int. Conf. Image
Process., vol. 3. Oct. 1998, pp. 512–516.
[21] J. H. Zheng and L. P. Chau, “A temporal error concealment algorithm for
H.264 using Lagrange interpolation,” in Proc. IEEE Int. Symp. Circuits
Syst., May 2004, pp. 133–136.
[22] Z. W. Gao and W. N. Lie, “Video error concealment by using Kalman
filtering technique,” in Proc. IEEE Int. Symp. Circuits Syst., May 2004,
pp. 69–72.
[23] W. N. Lie and Z. W. Gao, “Video error concealment by integrating
greedy suboptimization and Kalman filtering techniques,” IEEE Trans.
Circuits Syst. Video Technol., vol. 16, no. 8, pp. 982–992, Aug.
2006.
[24] G. S. Yu, M. K. Liu, and M. Marcellin, “POCS-based error concealment
for packet video using multiframe overlap information,” IEEE Trans.
Circuits Syst. Video Technol., vol. 8, no. 4, pp. 422–434, Aug. 1998.
[25] D. Turaga and T. Chen, “Model based error concealment for wireless
video,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 6, pp.
483–495, Jun. 2002.
[26] Q. F. Zhu, Y. Wang, and L. Shaw, “Coding and cell-loss recovery in
DCT-based packet video,” IEEE Trans. Circuits Syst. Video Technol.,
vol. 3, no. 3, pp. 248–258, Jun. 1993.
[27] Y. Chen, X. Sun, F. Wu, Z. Liu, and S. Li, “Spatio-temporal video error
concealment using priority-ranked region-matching,” in Proc. IEEE Int.
Conf. Image Process., Sep. 2005, pp. 1050–1053.
[28] Y. Chen, Y. Hu, O. Au, H. Li, and C. Chen, “Video error concealment
using spatio-temporal boundary matching and partial differential equa-
tion,” IEEE Trans. Multimedia, vol. 10, no. 1, pp. 2–15, Jan. 2008.
[29] X. Wu, K. U. Barthel, and W. Zhang, “Piecewise 2-D autoregression
for predictive image coding,” in Proc. IEEE Int. Conf. Image Process.,
Oct. 1998, pp. 901–904.
[30] A. C. Kokaram, R. D. Morris, W. J. Fitzgerald, and P. J. W. Rayner,
“Detection of missing data in image sequences,” IEEE Trans. Image
Process., vol. 4, no. 11, pp. 1496–1508, Nov. 1995.
[31] A. C. Kokaram, R. D. Morris, W. J. Fitzgerald, and P. J. W. Rayner,
“Interpolation of missing data in image sequences,” IEEE Trans. Image
Process., vol. 4, no. 11, pp. 1509–1519, Nov. 1995.
[32] S. N. Efstratiadis and A. K. Katsaggelos, “A model-based PEL-recursive
motion estimation algorithm,” in Proc. IEEE ICASSP, Apr. 1990, pp.
1973–1976.
[33] X. Li, “Least-square prediction for backward adaptive video coding,”
EURASIP J. Appl. Signal Process. (Special Issue on H.264 and Beyond),
vol. 2006, no. 18, pp. 1–13, Mar. 2007.
[34] X. Xiang, Y. Zhang, D. Zhao, S. Ma, and W. Gao, “A high efficient error
concealment scheme based on auto-regressive model for video coding,”
in Proc. PCS, May 2009.
[35] Y. Zhang, X. Xiang, S. Ma, D. Zhao, and W. Gao, “Auto regressive
model and weighted least squares based packet video error conceal-
ment,” in Proc. IEEE Data Compression Conf., Mar. 2010, pp. 455–464.
[36] Y. Zhang, D. Zhao, X. Ji, R. Wang, and W. Gao, “A spatio-temporal auto
regressive model for frame rate up-conversion,” IEEE Trans. Circuits
Syst. Video Technol., vol. 19, no. 9, pp. 1289–1301, Sep. 2009.
[37] S. Wenger, Error Patterns for Internet Experiments, ITU-T SG16
document Q15-I-16r1, 1999.
[38] M. T. Orchard and G. J. Sullivan, “Overlapped block motion compen-
sation: An estimation theoretic approach,” IEEE Trans. Image Process.,
vol. 3, no. 5, pp. 693–699, Sep. 1994.
Yongbing Zhang received the B.A. degree in English, and the M.S. and
Ph.D. degrees in computer science from the Department of Computer Science,
Harbin Institute of Technology, Harbin, China, in 2004, 2006, and 2010,
respectively.
He is currently with the Graduate School at
Shenzhen, Tsinghua University, Shenzhen, China.
His current research interests include video process-
ing, image and video coding, video streaming, and
transmission.
Xinguang Xiang received the B.S. and M.S. de-
grees in computer science from the Harbin Institute
of Technology, Harbin, China, in 2005 and 2007,
respectively. Since 2007, he has been pursuing the
Ph.D. degree from the Department of Computer Sci-
ence, School of Computer Science and Technology,
Harbin Institute of Technology.
His current research interests include video com-
pression, multi-view/stereoscopic video coding, and
robust video transmission.
Debin Zhao received the B.S., M.S., and Ph.D. de-
grees in computer science from the Harbin Institute
of Technology, Harbin, China, in 1985, 1988, and
1998, respectively.
He is currently a Professor with the Department of
Computer Science, Harbin Institute of Technology.
He has published over 200 technical articles in refer-
eed journals and conference proceedings in the areas
of image and video coding, video processing, video
streaming and transmission, and pattern recognition.
Siwei Ma (S’03) received the B.S. degree from
Shandong Normal University, Jinan, China, in 1999,
and the Ph.D. degree in computer science from
the Institute of Computing Technology, Chinese
Academy of Sciences, Beijing, China, in 2005.
From 2005 to 2007, he held a post-doctoral position with the University
of Southern California, Los Angeles.
les. Then, he joined the Institute of Digital Media,
School of Electronic Engineering and Computer
Science, Peking University, Beijing, where he is
currently an Associate Professor. He has published
over 70 technical articles in refereed journals and proceedings in the areas of
image and video coding, video processing, video streaming, and transmission.
Wen Gao (M’92–SM’05–F’09) received the M.S.
degree in computer science from the Harbin Institute
of Technology, Harbin, China, in 1985, and the
Ph.D. degree in electronics engineering from the
University of Tokyo, Tokyo, Japan, in 1991.
He is currently a Professor of computer science
with the Institute of Digital Media, School of Elec-
tronic Engineering and Computer Science, Peking
University, Beijing, China. Before joining Peking
University, he was a Full Professor of computer
science with the Harbin Institute of Technology from
1991 to 1995, and with the Chinese Academy of Sciences, Beijing, from 1996
to 2005. He has published extensively, including four books and over 500
technical articles in refereed journals and conference proceedings in the areas
of image processing, video coding and communication, pattern recognition,
multimedia information retrieval, multimodal interface, and bioinformatics.
Dr. Gao is the Editor-in-Chief of the Journal of Computer (a journal
of the China Computer Federation), an Associate Editor of the IEEE
Transactions on Circuits and Systems for Video Technology, IEEE
Transactions on Multimedia, IEEE Transactions on Autonomous
Mental Development, an Area Editor of the EURASIP Journal of Image
Communications, and an Editor of the Journal of Visual Communication
and Image Representation. He chaired a number of prestigious international
conferences on multimedia and video signal processing, and also served on the
advisory and technical committees of numerous professional organizations.