A concealment method for video communications in an error-prone environment

July 2000
IEEE Journal on Selected Areas in Communications 18(6):1122 - 1128

DOI:10.1109/49.848261

Source
IEEE Xplore

Authors:

Shahram Shirani

McMaster University

Rabab K. Ward

University of British Columbia

1122 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 18, NO. 6, JUNE 2000

A Concealment Method for Video Communications in an Error-Prone Environment

Shahram Shirani, Member, IEEE, Faouzi Kossentini, Senior Member, IEEE, and Rabab Ward, Fellow, IEEE

Abstract: In this paper, we propose a two-stage error-concealment method for block-based compressed video transmitted in an error-prone environment. In the first stage, we obtain initial estimates of the missing blocks. If the motion vectors associated with the missing blocks are available, motion compensation is used to provide good estimates. Otherwise, a novel algorithm which preserves image continuity is used to estimate the blocks. In the second stage, a maximum a posteriori (MAP) estimator, which employs an adaptive Markov random field (MRF) as the image a priori model, is used to improve the video reconstruction quality. The adaptive model enables the estimation to incorporate information embedded not only in the immediate neighborhood pixels but also in a wider neighborhood into the reconstruction procedure without increasing the order of the MRF model. The proposed concealment method achieves very good computation–performance tradeoffs, as demonstrated via experimental results.

Index Terms: Error concealment, image reconstruction, Markov random fields.

I. INTRODUCTION

Digital image and video signals require very high bit rates; thus, compression of such signals before their transmission is necessary. Communication channels are not error free and, consequently, the encoded bit streams are vulnerable to transmission errors, which usually cause loss of blocks of data and/or loss of synchronization. Despite the various methods that have been proposed to combat or localize the effects of channel errors, the quality of a decoded video sequence may degrade significantly because of the residual errors. Error concealment aims to hide the effects of such errors by exploiting redundancies in the video signal and limitations of the human visual system, without requiring additional information.

Temporal concealment methods are usually used for error concealment in inter-coded frames. Data of an inter-coded block are composed of the motion vector and the DCT coefficients of the prediction error. If the motion vectors are received without errors, the missing blocks are set to their corresponding motion-compensated blocks. Since in many video compression algorithms (e.g., H.263) the coded motion vectors and the DCT coefficients are interleaved, the loss of data of a block usually results in the loss of both the motion vectors and the DCT coefficients.¹ Therefore, most of the proposed temporal concealment methods first estimate the motion vector associated with a missing block using the motion vectors of adjacent blocks [1], [2]. The estimated motion vector is then used to find a block in the previous frame which yields the restoration of the missing block. One problem associated with these methods is that if the adjacent blocks are coded in a nonpredictive way (e.g., intra-coded), no data are available from which to estimate the missing motion vectors. Thus, ad hoc assumptions must be made to estimate the missing motion vector, and the results may be unreliable. Moreover, if the missing block lies on the boundary of two objects moving in opposite directions, such methods will perform quite poorly.

Manuscript received May 4, 1999; revised November 30, 1999. This work was supported by MDSI, BC ASI, and NSERC. The authors are with the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada. Publisher Item Identifier S 0733-8716(00)04337-7.

Spatial error-concealment methods restore the missing blocks using the information in the current frame. In [3], to restore the missing data, a measure of variations (e.g., gradient or Laplacian) between adjacent pixels is minimized. The underlying smoothness assumption of this method limits its ability to restore image details. In [4], each pixel in a damaged block is interpolated from the corresponding pixels in its four neighboring blocks such that the total squared border error is minimized. In [5] and [6], the missing information is interpolated utilizing spatially correlated edge information from a large local neighborhood. Note that although these edge-based methods are generally more accurate than other approaches, they are computationally more intensive. In [7], a computationally simple, spatial directional interpolation scheme has been proposed. The two nearest surrounding layers of pixels of a missing block are converted into a binary pattern to reveal the local geometrical structure. Then, the missing pixels are interpolated in a way that preserves the local geometrical structures. In the statistical error-concealment methods, for example the one proposed in [8], it is assumed that the pixel values in an image or video signal are realizations of an underlying statistical model (e.g., an MRF model). The statistical approaches to spatial error concealment are expected to outperform the deterministic ones, as they provide a systematic way of incorporating the a priori information about the video signal in the restoration procedure.

Previous spatial error-concealment methods employing the MRF model usually yield blurry images with a significant loss of detail in the high-frequency or edge portions of the image. This is due to 1) the type of the MRF selected as the image model, and 2) the fact that the amount of information that is used in the reconstruction process is often restricted to a single-pixel wide region around the erroneous area. Incorporating more pixels in order to enhance restoration of edges and high-frequency parts would usually require a higher order MRF model. However, this is computationally expensive, as the complexity grows exponentially with the order of the MRF model.

¹ Recognizing the need to improve the concealment capabilities, in the new generation of video compression standards (e.g., MPEG 4, H.263++), the motion information is separated from other information through partitioning of the coded data.

In this paper, we propose a two-stage error-concealment method for compressed video. In the first stage, if applicable, we use the information in the previous frame to obtain initial estimates of the missing blocks. If the motion vectors of the missing blocks are available, motion compensation is used to provide the estimates. Otherwise, an algorithm which preserves image continuity is employed to compute the initial estimates. In the second stage, a MAP estimator is used for refinement of the initial estimates. The MAP estimator employs an adaptive MRF as the image a priori model. The proposed adaptive MRF takes into account the local image characteristics embedded not only in the immediate neighborhood pixels of the damaged area but also in a wider neighborhood without a dramatic increase in computational complexity. Our concealment method improves on the existing statistical methods in that it yields good reconstruction performance regardless of the content of the missing blocks.

To make the compressed data more error resilient, most standard-compliant video compression systems partition a video frame into Groups of Blocks (GOBs) or "slices," which are coded independently. Therefore, the output bit stream usually consists of segments separated by markers, where each segment corresponds to the coded data of the blocks in a GOB or a slice. When channel errors occur, the decoder usually discards the data between the two markers surrounding the erroneous data, effectively discarding the GOB or the slice. As a result, loss of data of a slice does not affect the rest of the compressed video sequence. In our work, it is assumed that the frames of the coded video sequence are partitioned into GOBs or slices and, thus, the missing data belong to blocks of a GOB or a slice. Moreover, it is assumed that the decoder knows the locations of the missing blocks. This information (e.g., the checksum information) can be obtained from the network or it can be inferred, for example, by detecting the semantic or syntactic violations that result from errors [9].

The rest of this paper is organized as follows. In Section II, MAP estimation of missing data is briefly reviewed. Section III presents the proposed error-concealment method. Section IV discusses the computational complexity of the proposed method. Sections V and VI present our experimental results and conclusions, respectively.

II. MAP ESTIMATION OF MISSING DATA

Let the corrupted image be represented by the vector $\mathbf{y}$, while the decompressed image without errors is represented by $\mathbf{x}$. Using MAP estimation, the decompressed image estimate is given by

$$\hat{\mathbf{x}} = \arg\max_{\mathbf{x}} P(\mathbf{x} \mid \mathbf{y}).$$

Using Bayes' rule, we obtain

$$\hat{\mathbf{x}} = \arg\max_{\mathbf{x}} \left\{ \log P(\mathbf{y} \mid \mathbf{x}) + \log P(\mathbf{x}) \right\} \tag{1}$$

where the term $\log P(\mathbf{y})$ has been dropped because it is independent of $\mathbf{x}$. The corrupted image can be expressed as $\mathbf{y} = T\mathbf{x}$, where $T$ is the transform that converts the decompressed image $\mathbf{x}$ into the corrupted one $\mathbf{y}$.² The conditional probability of the corrupted image can then be written as follows:

$$P(\mathbf{y} \mid \mathbf{x}) = \begin{cases} 1, & \mathbf{y} = T\mathbf{x} \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$

Substituting (2) in (1), the MAP estimation becomes

$$\hat{\mathbf{x}} = \arg\max_{\mathbf{x} \in \mathcal{X}} \log P(\mathbf{x}) \tag{3}$$

where $\mathcal{X} = \{\mathbf{x} : \mathbf{y} = T\mathbf{x}\}$.

For the image a priori model $P(\mathbf{x})$, we use the MRF. According to the Hammersley–Clifford theorem, every MRF should have a Gibbs distribution [10] in the following form:

$$P(\mathbf{x}) = \frac{1}{Z} \exp\left( -\sum_{c \in C} V_c(\mathbf{x}) \right) \tag{4}$$

where $Z$ is a normalizing constant, $V_c(\cdot)$ is called a potential function and is a function of a local group of pixels $c$ called a clique, and $C$ denotes the set of all cliques throughout the image. The potential function characterizes the relationship between a group of pixels by assigning larger costs to configurations of pixels which are less likely to occur. The choice of the potential function substantially impacts the performance of the image model. The function $\sum_{c \in C} V_c(\mathbf{x})$ should be convex in order to have an easily obtainable global minimum. Otherwise, local minima will be present, and the function must then be minimized via a computationally expensive technique such as simulated annealing. Commonly, the potential functions are selected to be of the form

$$V_c(\mathbf{x}) = \rho\left( \mathbf{d}_c^{T} \mathbf{x}_c \right) \tag{5}$$

where $\mathbf{d}_c$ is a coefficient vector, $\mathbf{x}_c$ is the vector of pixels in the clique $c$, and $\rho(\cdot)$ is the cost function. The convexity of the cost function ensures that $\sum_{c \in C} V_c(\mathbf{x})$ remains convex. The coefficients are usually selected such that $\mathbf{d}_c^{T} \mathbf{x}_c$ provides an approximation of the first or second derivative of the image at each pixel. For the special case of $\rho(u) = u^2$, the model is called a Gauss–Markov random field (GMRF).
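As a rough illustration of how a Gibbs energy built from quadratic potentials favors smooth configurations, the following sketch evaluates the energy of a small image and picks the least-energy value for one missing pixel by brute force. The pairwise horizontal and vertical cliques, the coefficient vector (1, -1), and the function names are assumptions made for this example only; they are not taken from the paper.

```python
# Sketch of a Gibbs energy with quadratic (GMRF-style) potentials.
# Assumptions (illustrative only): pairwise horizontal/vertical cliques,
# coefficient vector d_c = (1, -1), and cost rho(u) = u**2.

def rho(u):
    """Quadratic, convex cost function."""
    return u * u


def gibbs_energy(img):
    """Sum of potentials rho(d_c . x_c) over all pairwise cliques."""
    rows, cols = len(img), len(img[0])
    energy = 0.0
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:  # horizontal clique: first difference along a row
                energy += rho(img[i][j] - img[i][j + 1])
            if i + 1 < rows:  # vertical clique: first difference along a column
                energy += rho(img[i][j] - img[i + 1][j])
    return energy


def map_estimate_single_pixel(img, i, j, candidates):
    """Brute-force MAP for one missing pixel: the value with least energy."""
    best_energy, best_value = None, None
    for v in candidates:
        trial = [row[:] for row in img]
        trial[i][j] = v
        e = gibbs_energy(trial)
        if best_energy is None or e < best_energy:
            best_energy, best_value = e, v
    return best_value
```

Because the energy is a convex quadratic in the missing pixel, the brute-force search lands on the same value that a closed-form minimization would give, which is why iterative closed-form updates are sufficient in practice.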

III. PROPOSED METHOD

As mentioned earlier, our method consists of two stages. The purpose of the first stage is to obtain an initial estimate of the missing block using information from the previous frame. If the missing block has been intra-coded, the initial estimate is set to zero. If the missing block has been inter-coded, and its motion vector is received correctly (e.g., because the encoded data have been partitioned), then the estimate is set to the corresponding motion-compensated block. When the motion vector is not available, we find an estimate of the missing block such that image continuity inside the block and across its boundaries is preserved [11]. To do this, subblocks adjacent to the missing block, i.e., $S_1$, $S_2$, $S_3$, and $S_4$ in Fig. 1(a), are considered. First, for each of these subblocks, a corresponding subblock in the previous frame is determined. The corresponding subblock is found by searching a small area around the point corresponding to the center of each of the subblocks $S_1$, $S_2$, $S_3$, $S_4$ in the previous frame. The sum of absolute differences is used as the measure of similarity. The four subblocks $S_1, \dots, S_4$ and their corresponding subblocks in the previous frame $S'_1, \dots, S'_4$ are shown in Fig. 1(a) and (b). For example, $S'_1$ is the subblock corresponding to $S_1$. Then, four blocks, namely $B_1$, $B_2$, $B_3$, and $B_4$, which are connected to $S'_1$, $S'_2$, $S'_3$, and $S'_4$ (respectively), are determined. To obtain an initial estimate of the missing block that smoothly connects to the rest of the image, a block from the above four blocks that minimizes the squared sums of border errors, between the estimated block and its adjacent above and left blocks, is selected. Thus,

$$\hat{B} = \arg\min_{B_n,\; n = 1, \dots, 4} \varepsilon(B_n)$$

where

$$\varepsilon = \varepsilon_T + \varepsilon_L. \tag{6}$$

Each of the border errors $\varepsilon_T$ and $\varepsilon_L$ is defined in terms of pixels:

$$\varepsilon_T = \left\| \mathbf{x}_T - \hat{\mathbf{x}}_T \right\|^2 \quad \text{and} \quad \varepsilon_L = \left\| \mathbf{x}_L - \hat{\mathbf{x}}_L \right\|^2$$

where the vector $\mathbf{x}_T$ consists of the bottom line pixels of the block above the missing block, and $\mathbf{x}_L$ consists of the right column pixels of the block to the left of the missing block. The vectors $\hat{\mathbf{x}}_T$ and $\hat{\mathbf{x}}_L$ are those elements of the estimated block that correspond to the pixels in its top row and left column, respectively. To ensure online applicability (e.g., concealment while decoding the blocks in raster scan), only subblocks to the left of and above the missing block are used. If the condition of online concealment is relaxed, the same method can be extended to include the subblocks of the blocks to the right of and below the missing block.

Fig. 1. (a) The four subblocks, (b) their corresponding subblocks in the previous frame, and the blocks connected to them.

Fig. 2. A pixel, its clique $c$, and the eight directions. The complement of the clique $c$ is the dark area.

² See [8] for a more detailed discussion of the transform.
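The selection of the initial estimate by border matching can be sketched as follows. This is a minimal illustration that assumes the border error is the sum of squared differences along the top row and left column; the function names and the exact error form are assumptions for the example, not the paper's definitions.

```python
# Sketch: choose, among candidate blocks, the one that connects most smoothly
# to the available neighbours above and to the left of the missing block.
# Assumption (illustrative): border error = sum of squared pixel differences.

def border_error(candidate, top_row, left_col):
    """Border error of a square candidate block.

    top_row  : bottom line of the block above the missing block
    left_col : right column of the block to the left of the missing block
    """
    n = len(candidate)
    e_top = sum((candidate[0][j] - top_row[j]) ** 2 for j in range(n))
    e_left = sum((candidate[i][0] - left_col[i]) ** 2 for i in range(n))
    return e_top + e_left


def select_initial_estimate(candidates, top_row, left_col):
    """Return (index, block) of the candidate with the smallest border error."""
    errors = [border_error(c, top_row, left_col) for c in candidates]
    best = min(range(len(candidates)), key=errors.__getitem__)
    return best, candidates[best]
```

Only the top and left borders are used here, which mirrors the raster-scan (online) constraint described in the text; extending `border_error` with a right column and bottom row would correspond to the relaxed, offline case.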

The estimation of the motion of the subblocks has a high computational overhead, which can impose unacceptable requirements on the decoder. The computational overhead can be reduced if the search for the displacement of each of the subblocks is restricted to a set of candidate motion vectors. This is a decoder option and can be used to trade performance against computational complexity. The set consists of the motion vector of the block corresponding to the missing block in the previous frame, the motion vectors of available neighboring blocks, the median of the motion vectors of available neighboring blocks, the average of the motion vectors of available neighboring blocks, and the zero motion vector [2].
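A restricted search over such a candidate set might look like the sketch below. It assumes integer-pel motion vectors stored as (dx, dy) tuples, a component-wise median and rounded average, and frames stored as lists of rows; all names are hypothetical and chosen for this example.

```python
from statistics import median

# Sketch: build the candidate motion-vector set and pick the candidate with the
# smallest sum of absolute differences (SAD) for one subblock.
# Assumptions (illustrative): integer-pel vectors (dx, dy), component-wise
# median, rounded average, frames as lists of rows.


def candidate_motion_vectors(colocated_mv, neighbour_mvs):
    """Co-located MV, neighbour MVs, their median and average, and zero."""
    cands = [colocated_mv] + list(neighbour_mvs)
    if neighbour_mvs:
        xs = [mv[0] for mv in neighbour_mvs]
        ys = [mv[1] for mv in neighbour_mvs]
        cands.append((int(median(xs)), int(median(ys))))
        cands.append((round(sum(xs) / len(xs)), round(sum(ys) / len(ys))))
    cands.append((0, 0))
    unique = []
    for mv in cands:  # drop duplicates while keeping the original order
        if mv not in unique:
            unique.append(mv)
    return unique


def sad(cur, prev, top, left, size, mv):
    """SAD between a subblock of `cur` and its displaced match in `prev`."""
    dx, dy = mv
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(cur[top + i][left + j] - prev[top + dy + i][left + dx + j])
    return total


def best_candidate(cur, prev, top, left, size, cands):
    """Evaluate only the in-bounds candidates and return the best one."""
    rows, cols = len(prev), len(prev[0])
    valid = [
        mv for mv in cands
        if 0 <= top + mv[1] and top + mv[1] + size <= rows
        and 0 <= left + mv[0] and left + mv[0] + size <= cols
    ]
    return min(valid, key=lambda mv: sad(cur, prev, top, left, size, mv))
```

The cost of this search grows with the number of candidates rather than with the search area, which is the source of the complexity reduction discussed in Section IV.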

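In the second stage of our proposed error-concealment method, the information around each missing block is used to refine the initial estimate. This stage is based on a MAP estimator. We consider a GMRF model with an eight-pixel clique around each pixel as the image a priori model (see Fig. 2). The reason for selecting an eight-pixel clique in the manner shown in Fig. 2 will become clear later. The potential function of (5) is selected as

$$V_c(\mathbf{x}) = \sum_{(k,l) \in c_{i,j}} b_{i,j}^{k,l}\, \rho\left( x_{i,j} - x_{k,l} \right) \tag{7}$$

where $c_{i,j}$ is the clique of the pixel in position $(i,j)$, $b_{i,j}^{k,l}$ is the weight assigned to the difference between the values of the pixel in position $(i,j)$ and the pixel in its clique at position $(k,l)$, and $\rho(u) = u^2$.

Combining the MAP estimator of (3) with the image model [(4) and (7)], the restoration of missing data results in the following minimization problem:

$$\hat{\mathbf{x}} = \arg\min_{x_{i,j},\; (i,j) \in M} \; \sum_{(i,j)} \sum_{(k,l) \in c_{i,j}} b_{i,j}^{k,l} \left( x_{i,j} - x_{k,l} \right)^2 \tag{8}$$

where $M$ is the set of all missing pixels in the frame. Since the objective in (8) is a convex function, the above minimization problem yields a unique global solution. In fact, the estimated value of a pixel is given by

$$\hat{x}_{i,j} = \frac{\displaystyle \sum_{(k,l) \in c_{i,j}} b_{i,j}^{k,l}\, x_{k,l} + \sum_{(k,l) \in \bar{c}_{i,j}} b_{k,l}^{i,j}\, x_{k,l}}{\displaystyle \sum_{(k,l) \in c_{i,j}} b_{i,j}^{k,l} + \sum_{(k,l) \in \bar{c}_{i,j}} b_{k,l}^{i,j}} \tag{9}$$

where $c_{i,j}$ is the clique and $\bar{c}_{i,j}$ is its complement shown in Fig. 2.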
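The step from the minimization problem (8) to the pixel update (9) can be sketched as follows. The notation is an assumption made for this derivation: $b_{i,j}^{k,l}$ denotes the weight between pixel $(i,j)$ and the pixel $(k,l)$ in its clique $c_{i,j}$, and $\bar{c}_{i,j}$ denotes the complement of the clique, i.e., the pixels whose own cliques contain $(i,j)$.

```latex
% Terms of the objective in (8) that involve the missing pixel x_{i,j}:
% its own clique, plus the cliques of the pixels in the complement.
J(x_{i,j}) = \sum_{(k,l)\in c_{i,j}} b_{i,j}^{k,l}\,\bigl(x_{i,j}-x_{k,l}\bigr)^2
           + \sum_{(k,l)\in \bar{c}_{i,j}} b_{k,l}^{i,j}\,\bigl(x_{k,l}-x_{i,j}\bigr)^2

% Setting the derivative with respect to x_{i,j} to zero:
\frac{\partial J}{\partial x_{i,j}}
  = 2\sum_{(k,l)\in c_{i,j}} b_{i,j}^{k,l}\,\bigl(x_{i,j}-x_{k,l}\bigr)
  + 2\sum_{(k,l)\in \bar{c}_{i,j}} b_{k,l}^{i,j}\,\bigl(x_{i,j}-x_{k,l}\bigr) = 0

% Solving for x_{i,j} gives a weighted average of the 16 surrounding pixels:
\hat{x}_{i,j}
  = \frac{\sum_{(k,l)\in c_{i,j}} b_{i,j}^{k,l}\,x_{k,l}
        + \sum_{(k,l)\in \bar{c}_{i,j}} b_{k,l}^{i,j}\,x_{k,l}}
         {\sum_{(k,l)\in c_{i,j}} b_{i,j}^{k,l}
        + \sum_{(k,l)\in \bar{c}_{i,j}} b_{k,l}^{i,j}}
```

Under this reading, each update is a weighted average, so a larger weight in one direction pulls the estimate toward the pixels lying along that direction.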

In our adaptive MRF model, the weight $b_{i,j}^{k,l}$ corresponding to the difference between a pixel and one of the pixels in its clique [in (9)] is selected adaptively, based on the likelihood of an edge in the direction of the subject pair of pixels. The rationale behind this selection is to weight more heavily the difference between the pixels in that direction, which causes the values of the pixels in that direction to get closer to each other. The likelihood of edges in each of the eight directions is computed using blocks around the missing block. In this way, the available information in a larger area is exploited without increasing the order of the MRF model (which would increase the computational complexity dramatically). To determine the likelihood of edges in each of the eight directions, edges in the blocks surrounding the missing block whose directions imply that they pass through the missing block are determined. That is, we first obtain the gradient components

$$G_x(i,j) \quad \text{and} \quad G_y(i,j)$$

for every pixel $(i,j)$ in the blocks to the left, right, top, and bottom of the missing block. The magnitude and angular direction of the edge at pixel $(i,j)$ are

$$G(i,j) = \sqrt{G_x^2(i,j) + G_y^2(i,j)} \quad \text{and} \quad \theta(i,j) = \tan^{-1}\!\left( \frac{G_y(i,j)}{G_x(i,j)} \right)$$

where $\theta(i,j)$ determines if the edge at pixel $(i,j)$ passes through the missing block. Since there are eight pixels in the clique, the value of $\theta(i,j)$ is rounded to one of the eight directions equally spaced in the range from 0° to 180°. There is a counter $D_m$, $m = 1, \dots, 8$, for each of the eight directions. If the extension of an edge at pixel $(i,j)$ belonging to one of the neighboring blocks passes through the missing area, the counter for that particular direction is incremented by the amount of $G(i,j)$. Since the employed edge detector is sensitive to image noise, the values of the counters $D_m$, $m = 1, \dots, 8$, are compared to a threshold. If any of them is less than the threshold, it is set to zero.
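The accumulation of edge strength into eight direction counters can be sketched as below. Several choices here are assumptions for illustration and are not stated in the text: a Sobel operator is used as the edge detector, the edge direction is taken perpendicular to the gradient, and the geometric test of whether an edge's extension passes through the missing block is omitted, so every interior pixel of the given neighborhood contributes.

```python
import math

# Sketch: magnitude-weighted histogram of edge directions over 8 bins in
# [0, 180) degrees, followed by thresholding of the counters.
# Assumptions (illustrative): Sobel kernels; edge direction = gradient
# direction + 90 degrees; no "passes through the missing block" test.

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient(img, i, j):
    """Horizontal and vertical gradient components at an interior pixel."""
    gx = gy = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            p = img[i + di][j + dj]
            gx += SOBEL_X[di + 1][dj + 1] * p
            gy += SOBEL_Y[di + 1][dj + 1] * p
    return gx, gy


def direction_counters(img, threshold=0.0, n_dirs=8):
    """Accumulate gradient magnitude per quantized edge direction."""
    counters = [0.0] * n_dirs
    step = 180.0 / n_dirs
    for i in range(1, len(img) - 1):
        for j in range(1, len(img[0]) - 1):
            gx, gy = gradient(img, i, j)
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            theta = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            k = int(round(theta / step)) % n_dirs
            counters[k] += mag
    # Counters below the threshold are treated as noise and zeroed.
    return [c if c >= threshold else 0.0 for c in counters]
```

For a vertical step edge, all of the accumulated magnitude falls in the 90° bin, so only the weight for the vertical direction would be raised in the subsequent refinement.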

There are eight pixels in the clique of each pixel, and eight directions for the detected edges. Each pixel in the clique of a pixel corresponds to a direction. In our proposed method, $b_{i,j}^{k,l}$ is selected based on the edge counter of the direction corresponding to $(i,j)$ and $(k,l)$, i.e.,

(10)

where $\alpha$ is a constant and $D_m$ is the counter corresponding to direction $m$, and direction $m$ corresponds to the direction formed by $(i,j)$ and $(k,l)$. Finally, the second stage of the proposed error-concealment method can be summarized as follows. 1) Determine the edges in the neighboring blocks and assign them to eight equally spaced directions. Compute the counter for each direction. 2) Use (10) to find a set of weights for each missing block. 3) Use (9) to obtain an estimate of each missing pixel employing the weights obtained in the previous step. 4) Iteratively reestimate the missing pixels using (9) until convergence. Note that since the cost function is convex, convergence is guaranteed.
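Steps 2) to 4) above can be sketched as an iterative weighted-average update. Two things in this sketch are assumptions rather than the paper's definitions: the clique layout `CLIQUE` (eight offsets approximating directions spaced 22.5° apart, with the complement taken as the mirrored offsets) and the mapping `weights_from_counters`, which is only a hypothetical stand-in for (10), whose exact form is not reproduced here.

```python
# Sketch: refine missing pixels by repeated weighted averaging over a clique
# and its mirrored complement, with one weight per direction.
# Assumptions (illustrative): clique offsets below; weight mapping is a
# hypothetical stand-in for (10); weights are shared by all pixels of a block.

CLIQUE = [(0, 1), (-1, 2), (-1, 1), (-2, 1), (-1, 0), (-2, -1), (-1, -1), (-1, -2)]


def weights_from_counters(counters, alpha=1.0):
    """Hypothetical mapping: base weight 1 plus a share of the edge strength."""
    total = sum(counters)
    if total == 0:
        return [1.0] * len(counters)
    return [1.0 + alpha * c / total for c in counters]


def refine(img, missing, weights, max_iter=200, tol=1e-6):
    """Sweep over the missing pixels until the largest change is below tol."""
    rows, cols = len(img), len(img[0])
    x = [list(map(float, row)) for row in img]
    for _ in range(max_iter):
        biggest_change = 0.0
        for (i, j) in missing:
            num = den = 0.0
            for w, (di, dj) in zip(weights, CLIQUE):
                for sign in (1, -1):  # clique pixel, then its mirror
                    r, c = i + sign * di, j + sign * dj
                    if 0 <= r < rows and 0 <= c < cols:
                        num += w * x[r][c]
                        den += w
            new_value = num / den
            biggest_change = max(biggest_change, abs(new_value - x[i][j]))
            x[i][j] = new_value
        if biggest_change < tol:
            break
    return x
```

With uniform weights this reduces to plain neighborhood averaging; raising the weight of one direction makes the update follow the pixels along that direction more closely, which is the intended effect of the adaptation.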

For intra-coded blocks, the missing pixel values are initially set to zero and then the MAP estimator, using the adaptive MRF and the data of the neighboring blocks, is applied. For missing inter-coded blocks, an initial estimate is obtained using information from the previous frame. Estimates of the prediction errors are found using the adaptive MRF model along with the prediction error signals of the neighboring blocks. The estimated values are then added to the initial estimates. Since the prediction error signal consists mostly of high-frequency components, the MAP estimation stage (i.e., the second stage) will improve the video reconstruction quality, especially around the edges.

IV. COMPUTATIONAL COMPLEXITY

The numbers provided below for computational complexity correspond to a block and subblock of sizes 16 × 16 and 8 × 8, respectively. The computational load of the first stage of the proposed method consists of those computations required in 1) estimating the motion, and 2) computing the error of (6). For motion estimation, we used a spiral search method using an area of 16 × 16. This requires approximately 197 000 operations. If the search for the displacement of each of the subblocks is restricted to the set of candidate motion vectors explained in Section III, the required number of operations is reduced to approximately 6100. The number of operations required to find the total error of (6) for four blocks is approximately 380.

For the second stage, the computational load consists of those computations required in adapting the weights of the MRF model, which are approximately 18 400 operations for a missing block, assuming four available neighboring blocks. The estimation of missing pixels using (9) is an iterative procedure which, for each iteration, requires approximately 30 operations for each missing pixel, or 16 × 16 × 30 operations for each missing block. On average, 80 iterations are required for the algorithm to converge for a block.
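The operation counts quoted in this section can be combined with simple arithmetic, as in the sketch below. It assumes that the quoted figures are totals per missing block and that the first-stage cost is the motion search plus the border-error computation; the variable names are introduced here for illustration.

```python
# Sketch: arithmetic on the operation counts quoted above.
# Assumption (illustrative): all figures are per missing 16 x 16 block.

BLOCK = 16
OPS_PER_PIXEL_PER_ITERATION = 30
FULL_SEARCH_OPS = 197_000        # spiral search over a 16 x 16 area
CANDIDATE_SEARCH_OPS = 6_100     # search restricted to candidate vectors
BORDER_ERROR_OPS = 380           # total error of (6) for four blocks

# Second stage: cost of one sweep of the update over a missing block.
ops_per_iteration = BLOCK * BLOCK * OPS_PER_PIXEL_PER_ITERATION

# First stage: motion estimation plus border-error evaluation.
stage1_full = FULL_SEARCH_OPS + BORDER_ERROR_OPS
stage1_candidates = CANDIDATE_SEARCH_OPS + BORDER_ERROR_OPS

# How much the candidate-set restriction shrinks the motion search.
search_reduction = FULL_SEARCH_OPS / CANDIDATE_SEARCH_OPS

print(ops_per_iteration, stage1_full, stage1_candidates, round(search_reduction, 1))
```

Under these assumptions, restricting the search to the candidate set cuts the motion-estimation cost by a factor of roughly 32, which is the tradeoff offered to the decoder in Section III.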

Clearly, the computational load of our error-concealment method is quite reasonable. Our simulation experiments confirm that the run time of our method is indeed acceptable. In fact, real-time decoding (e.g., 10 frames/s for QCIF video sequences) is still possible on a Pentium 300 MHz PC.

V. EXPERIMENTAL RESULTS

Although the method proposed in this paper is general and can be applied to any block-based video compression method, H.263 is used as our video coding framework. QCIF (176 × 144) video sequences at a temporal resolution of 10 frames/s are coded at 64 kbps and then decoded in the presence of slice/GOB errors. The size of the blocks is 16 × 16; that of the subblocks is 8 × 8.

Fig. 3. PSNR values for image sequence Foreman with 10% GOB missing for different concealment methods.

To simulate the channel errors, the following tasks are performed. Coded video information is first grouped into packets, where each video packet consists of the coded data of a GOB or a slice. The video packets are then multiplexed with audio information according to the H.223 standard. We used an H.223 multiplexing simulator which receives video packets, simulates audio traffic, applies errors to the multiplexed bit stream according to an error pattern stored in a file, and outputs the packets [12]. The error pattern that we employed corresponds to a mobile channel [13]. Burst errors will most likely not corrupt two consecutive video packets, since audio packets are inserted between them. The erroneous bit stream is decoded such that the effect of errors will appear as missing slices/GOBs.

For inter-coded blocks, our error concealment consists of obtaining an initial estimate of the missing block using information from the previous frame, and finding an estimate of the prediction error using the adaptive MRF model. We compare the performance of the above method with two other methods: 1) a replacement method, where a missing block is replaced with the same block in the previous frame; and 2) a median method, where the motion vector of a missing block is set to the median of the motion vectors of the blocks to the left, above, and above-right of it, and the estimated motion vector is used to obtain a motion-compensated block which serves as the concealment of the missing block. Here, we consider only the case where a GOB is placed in each of the video packets. Shown in Fig. 3 are the concealment PSNR values for the three methods mentioned above for the video sequence Foreman with a 10% packet (GOB) loss rate. As can be seen, for video sequences with a large amount of motion like Foreman, the performances of the proposed method and the median method are close to each other, and both are better than that of the replacement method. Although the performance of the above methods is comparable in terms of PSNR, the video sequences obtained using the proposed method are more visually appealing than the ones obtained using the other two methods. Fig. 4(a), (b), and (c) show the reconstructed video frame (inter-coded) of the video sequence Foreman. As can be seen in the images, the replacement method does not generate good results, especially around the nose area. Moreover, the concealment result of the proposed method is clearly better than that of the other two methods.

Fig. 4. An inter-coded frame of the sequence Foreman concealed by the (a) replacement, (b) median, and (c) proposed methods.
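The median baseline and the PSNR measure used in this comparison can be sketched as follows. The component-wise median of three motion vectors and a peak value of 255 (8-bit samples) are assumptions made for this example; the function names are hypothetical.

```python
import math

# Sketch: median motion-vector estimate and PSNR between two frames.
# Assumptions (illustrative): component-wise median of three (dx, dy) vectors;
# 8-bit samples, so the peak value is 255.


def median_mv(left, above, above_right):
    """Component-wise median of the three neighbouring motion vectors."""
    def mid(a, b, c):
        return sorted((a, b, c))[1]
    return (mid(left[0], above[0], above_right[0]),
            mid(left[1], above[1], above_right[1]))


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    count = 0
    squared_error = 0.0
    for ref_row, test_row in zip(ref, test):
        for a, b in zip(ref_row, test_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

Because PSNR averages the squared error over the whole frame, two methods can score similarly while differing visibly in a few concealed blocks, which is consistent with the observation above that the subjective difference is larger than the PSNR gap.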

For intra-coded blocks, we compare the performance of our method (which consists of estimating the missing block using the adaptive MRF model) to that of four other efficient methods: 1) a MAP estimator using a nonadaptive GMRF as the a priori model, where each missing pixel is basically set to the average of the pixels around it; 2) a suboptimal version of the MAP estimator, where a missing pixel is set to the median of the pixels around it [8]; 3) the deterministic method proposed in [4]; and 4) the deterministic method proposed in [3]. Although there are other deterministic methods that are likely to outperform the above two methods, for example, the ones proposed in [6] and [7], we restrict our comparison to the above two deterministic methods. In fact, a direct comparison of our method, which is statistical, to deterministic methods is difficult because of the different design approaches. Our main aim here is to compare our method to previously proposed statistical methods.
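The two statistical baselines, averaging and median filling, can be contrasted with a small sketch. It assumes an 8-connected neighborhood and a single update of one missing pixel; the function names are hypothetical and the baselines in [8] may differ in detail.

```python
from statistics import mean, median

# Sketch: fill one missing pixel from its 8-connected neighbours using either
# the average (nonadaptive GMRF-style) or the median (suboptimal MAP-style).
# Assumption (illustrative): 8-connected neighbourhood, single update.


def neighbours(img, i, j):
    """Values of the in-bounds 8-connected neighbours of pixel (i, j)."""
    values = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) == (0, 0):
                continue
            r, c = i + di, j + dj
            if 0 <= r < len(img) and 0 <= c < len(img[0]):
                values.append(img[r][c])
    return values


def conceal_pixel(img, i, j, rule="mean"):
    """Estimate a missing pixel with the chosen neighbourhood rule."""
    values = neighbours(img, i, j)
    return mean(values) if rule == "mean" else median(values)
```

In a neighborhood where one pixel lies across an edge, the average is pulled toward that outlier while the median ignores it; neither rule uses edge direction, which is the information the adaptive weights are meant to add.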

In this experiment, we employ different video sequences and different packetization schemes. In the first set of simulation experiments, we assign a slice, which consists of one block, to a packet. Fig. 5(a) shows a frame of the image sequence Foreman encoded and decoded using an H.263 compliant coder. Fig. 5(b)–(g) show, respectively, the same frame: 1) missing approximately 20% of the packets (blocks), 2) reconstructed using the nonadaptive GMRF model, 3) using the suboptimal MRF model, 4) using the method proposed in [4], 5) using the method proposed in [3], and 6) using our proposed adaptive MRF error-concealment method. Clearly, our proposed method performs best in reconstruction quality, particularly in retrieving the edges and in the areas of the frame that correspond to adjacent missing blocks.

In the next set of simulation experiments, we assign the coded data of a GOB, which consists of 11 blocks, to each of the video packets. Fig. 6(a) shows a frame of the video sequence Foreman with two missing packets (GOBs), at approximately 18% loss rate. Fig. 6(b) and (c) show the error-concealment results obtained using the nonadaptive GMRF model and the suboptimal MRF model, respectively. Fig. 6(d) and (e) show the results obtained using the methods proposed in [4] and [3], respectively. Fig. 6(f) shows the result obtained using our adaptive MRF model. By comparing the figures, the superior performance of our method becomes obvious. This performance advantage is demonstrated in the blocks that contain edges. For a quantitative evaluation, Table I provides the PSNR values of the above concealment methods (for both packetization cases) for the video sequence Foreman. The table demonstrates that our method outperforms the other methods by at least 2 dB.

Fig. 5. A frame from the video sequence Foreman: (a) original; (b) missing blocks; reconstructed using (c) a nonadaptive GMRF model; (d) a suboptimal MRF model; (e) the method proposed in [4]; (f) the method proposed in [3]; and (g) our adaptive MRF model.

Fig. 6. A frame from the video sequence Foreman: (a) with missing GOBs, reconstructed using (b) a nonadaptive GMRF model; (c) a suboptimal MRF model; (d) the method proposed in [4]; (e) the method proposed in [3]; and (f) our adaptive MRF model.

Finally, we compare the computational complexity of our method, in terms of the required number of additions and multiplications, to that of the methods proposed in [4] and [3]. The total numbers of additions and multiplications required for restoring a 16 × 16 block are approximately 4500 and 20 000 using the method proposed in [4] and our method, respectively. The method proposed in [3] requires approximately 650 000 additions and multiplications to restore a 16 × 16 block. Therefore, the number of computations of our method is about 4.4 times that of [4] and about 0.03 times that of [3]. However, our method outperforms [4] substantially in terms of the quality of the reconstructed images. This can be seen in Figs. 5 and 6. This quality improvement justifies, in most applications, the additional computational complexity.

TABLE I. PSNR (dB) comparison of different methods for the video sequence Foreman.

VI. CONCLUSION

This paper introduces a new method of error concealment for block-based coded video communications over error-prone channels. The proposed method employs an adaptive MRF as the image a priori model in a MAP estimation paradigm. The adaptation enables the estimation procedure to incorporate more information without increasing the order of the MRF. The proposed concealment method is able to restore missing blocks located in smooth and low-frequency areas, as well as in high-frequency and edge portions of a video frame. Our concealment method achieves very good computation–performance tradeoffs, and outperforms previously proposed MRF-based concealment methods.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their insightful comments and constructive suggestions. The authors would also like to gratefully acknowledge the help and support provided by G. Cote, their colleague in the Signal Processing and Multimedia Group at the University of British Columbia.

REFERENCES

[1] M. Ghanbari and V. Seferidis, “Cell loss concealment in ATM video codecs,” IEEE Trans. Circuits Syst. Video Technol., vol. 3, pp. 238–247, June 1993.

[2] W. M. Lam, A. R. Reibman, and B. Liu, “Recovery of lost or erroneously received motion vectors,” in Proc. Int. Conf. Acoust., Speech, Signal Processing, vol. V, 1993, pp. 417–420.

[3] Y. Wang, Q. F. Zhu, and L. Shaw, “Maximally smooth image recovery in transform coding,” IEEE Trans. Commun., vol. 41, pp. 1544–1551, Oct. 1993.

[4] S. S. Hemami and T. H. Y. Meng, “Transform coded image reconstruction exploiting interblock correlations,” IEEE Trans. Image Processing, vol. 4, pp. 1023–1027, July 1995.

[5] H. Sun and W. Kwok, “Concealment of damaged block transform coded images using projection onto convex sets,” IEEE Trans. Image Processing, vol. 4, pp. 470–477, Apr. 1995.

[6] W. Kwok and H. Sun, “Multi-directional interpolation for spatial error concealment,” IEEE Trans. Consumer Electron., vol. 39, pp. 455–460, Aug. 1993.

[7] W. Zeng and B. Liu, “Geometric-structure-based error concealment with novel applications in block-based low-bit-rate coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp. 648–665, June 1999.


[8] P. Salama, N. Shroff, E. J. Coyle, and E. J. Delp, “Error concealment in encoded video streams,” in Signal Recovery Techniques for Image and Video Compression and Transmission, N. P. Galatsanos and A. K. Katsaggelos, Eds. Boston, MA: Kluwer Academic, 1998.

[9] R. Talluri, “Error resilient video coding in the MPEG-4 standard,” IEEE Commun. Mag., vol. 36, pp. 112–119, June 1998.

[10] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-6, pp. 721–741, Nov. 1984.

[11] S. Shirani, F. Kossentini, and R. Ward, “Reconstruction of motion vector missing macroblocks in H.263 encoded video transmission over lossy networks,” in Proc. Int. Conf. Image Processing, vol. III, Oct. 1998, pp. 487–491.

[12] G. Sullivan, “A Simple Video Packet Mux Simulator Program for Video Streams in H.263/M Using AL3 Mux of H.223 Annex B,” ITU-T Study Group 15, Video Coding Expert Group, Doc. Q15-F-16, Nov. 1996.

[13] Ericsson Sweden, “WCDMA Error Patterns at 64 kb/s,” ITU-T Study Group 16, Multimedia Terminals and Systems Expert Group, Cannes, France, June 1998.

Shahram Shirani (S’88–M’89) received the B.Sc. degree in electrical engineering from Esfahan University of Technology, Esfahan, Iran, in 1989 and the M.Sc. degree in biomedical engineering from Amirkabir University of Technology, Tehran, Iran, in 1994. Currently, he is pursuing the Ph.D. degree in the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, B.C., Canada.

From 1994 to 1996, he was with the Department of Electrical Engineering, University of Tehran. His research interests include image and video compression, video communications, signal processing, and ultrasonic imaging.

Faouzi Kossentini (S’89–M’89–SM’98) received the B.S., M.S., and Ph.D. degrees from the Georgia Institute of Technology, Atlanta, in 1989, 1990, and 1994, respectively.

During 1995, he was a Research Scientist at Nichols Research Corporation, Huntsville, AL. Since January 1996, he has been an Assistant Professor and then an Associate Professor in the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, B.C., Canada, where he is involved in research in the areas of signal processing, communications, and multimedia. He has coauthored more than 100 journal papers, conference papers, book chapters, and patents.

Dr. Kossentini has been active as a Voting Member, and recently as head of delegation, of the Canadian delegation to ISO/IEC JTC1/SC29, which is responsible for the standardization of coded representation of audiovisual, multimedia, and hypermedia information. In particular, he has participated in most current JBIG/JPEG and MPEG-4 standardization activities. He has also participated in most current ITU-T low bit rate video coding standardization activities. He has served as a technical area coordinator and member of the technical program committee of ICIP-1997, and as a Member of the technical program committee of ISCAS-1999. He is also the Vice General Chairman of ICIP-2000. Dr. Kossentini is currently an Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING. He is also an Associate Editor for the IEEE TRANSACTIONS ON MULTIMEDIA.

Rabab Ward (F’98) received the B.Sc. degree in electrical engineering from University of Cairo, Egypt, in 1966 and the M.Sc. and Ph.D. degrees in electrical and computer engineering from University of California, Berkeley, in 1969 and 1972, respectively.

She is a Professor in the Electrical and Computer Engineering Department, University of British Columbia, Vancouver, B.C., Canada, and is the Director of the Center for Integrated Computer Systems Research there. Her research interests are mainly in the areas of signal processing and image processing. She has made contributions in the areas of signal detection, image encoding, compression, recognition, restoration and enhancement, and their applications to infant cry signals, cable TV, HDTV, medical images, and astronomical images. She holds five patents related to cable television picture monitoring, measurement, and noise reduction.

Dr. Ward is a fellow of the EIC and the Royal Society of Canada. She is currently serving as the General Chair of ICIP’2000 to be held in Vancouver.


