1122 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 18, NO. 6, JUNE 2000
A Concealment Method for Video Communications
in an Error-Prone Environment
Shahram Shirani, Member, IEEE, Faouzi Kossentini, Senior Member, IEEE, and Rabab Ward, Fellow, IEEE
Abstract—In this paper, we propose a two-stage error-conceal-
ment method for block-based compressed video which was trans-
mitted in an error-prone environment. In the first stage, we obtain
initial estimates of the missing blocks. If the motion vectors associ-
ated with the missing blocks are available, motion compensation
is used to provide good estimates. Otherwise, a novel algorithm
which preserves image continuity is used to estimate the blocks. In
the second stage, a maximum a posteriori (MAP) estimator, which
employs an adaptive Markov random field (MRF) as the image a
priori model, is used to improve the video reconstruction quality.
The adaptive model enables the estimation to incorporate infor-
mation embedded not only in the immediate neighborhood pixels
but also in a wider neighborhood into the reconstruction proce-
dure without increasing the order of the MRF model. The pro-
posed concealment method achieves very good computation–per-
formance tradeoffs, as demonstrated via experimental results.
Index Terms—Error concealment, image reconstruction,
Markov random fields.
I. INTRODUCTION
DIGITAL image and video signals require very high bit
rates, thus, compression of such signals before their trans-
mission is necessary. Communication channels are not error
free and, consequently, the encoded bit streams are vulnerable
to transmission errors—usually causing loss of blocks of data
and/or loss of synchronization. Despite the various methods that
have been proposed to combat or localize the effects of channel
errors, the quality of a decoded video sequence may degrade
significantly because of the residual errors. Error concealment
intends to conceal the effects of such errors by exploiting redun-
dancies in the video signal and limitations of the human visual
system, without requiring additional information.
Temporal concealment methods are usually used for error concealment
in inter-coded frames. Data of an inter-coded block are composed of
the motion vector and the DCT coefficients of the prediction error. If
the motion vectors are received without errors, the missing blocks are
set to their corresponding motion compensated blocks. Since in many of
the video compression algorithms (e.g., H.263) the coded motion vectors
and the DCT coefficients are interleaved, the loss of data of a block
usually results in the loss of both the motion vectors and the DCT
coefficients.¹ Therefore, most of the proposed temporal concealment
methods first estimate the motion vector associated with a missing
block using the motion vectors of adjacent blocks [1], [2]. The
estimated motion vector is then used to find a block in the previous
frame which yields the restoration of the missing block. One problem
associated with these methods is that if the adjacent blocks are coded
in a nonpredictive way (e.g., intra-coded), there would not be any data
available to estimate the missing motion vectors. Thus, ad hoc
assumptions for estimating the missing motion vector should be made,
and the results may be unreliable. Moreover, if the missing block lies
on the boundary of two objects moving in opposite directions, such
methods will perform quite poorly.

Manuscript received May 4, 1999; revised November 30, 1999. This work
was supported by MDSI, BC ASI, and NSERC.
The authors are with the Department of Electrical and Computer Engineering,
University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
Publisher Item Identifier S 0733-8716(00)04337-7.

Spatial error-concealment methods restore the missing
blocks using the information in the current frame. In [3],
to restore the missing data, a measure of variations (e.g.,
gradient or Laplacian) between adjacent pixels is minimized.
The underlying smoothness assumption of this method limits
its ability in restoring image details. In [4], each pixel in a
damaged block is interpolated from the corresponding pixels in
its four neighboring blocks such that the total squared border
error is minimized. In [5] and [6], the missing information
is interpolated utilizing spatially correlated edge information
from a large local neighborhood. Note that although these
edge-based methods are generally more accurate than other
approaches, they are computationally more intensive. In [7],
a computationally simple, spatial directional interpolation
scheme has been proposed. The two nearest surrounding
layers of pixels of a missing block are converted into a binary
pattern to reveal the local geometrical structure. Then, the
missing pixels are interpolated in a way to preserve the local
geometrical structures. In the statistical error concealment
methods, for example the one proposed in [8], it is assumed
that the pixel values in an image or video signal are realizations
of an underlying statistical model (e.g., MRF model). The sta-
tistical approaches of spatial error concealment are expected to
outperform the deterministic ones, as they provide a systematic
way for incorporating the a priori information about the video
signal in the restoration procedure.
Previous spatial error-concealment methods employing the MRF model
usually yield blurry images with a significant loss of detail in the
high-frequency or edge portions of the image. This is due to 1) the
type of the MRF selected as the image model, and 2) the fact that the
amount of information that is used in the reconstruction process is
often restricted to a single-pixel-wide region around the erroneous
area. Incorporating more pixels in order to enhance restoration of
edges and high-frequency parts would usually require a higher order
MRF model. However, this is computationally expensive, as the
complexity grows exponentially with the order of the MRF model.

¹Recognizing the need to improve the concealment capabilities, in the
new generation of video compression standards (e.g., MPEG-4, H.263++),
the motion information is separated from other information through
partitioning of the coded data.

0733–8716/00$10.00 © 2000 IEEE
SHIRANI et al.: CONCEALMENT METHOD FOR VIDEO COMMUNICATIONS 1123

In this paper, we propose a two-stage error-concealment
method for compressed video. In the first stage, if applicable,
we use the information in the previous frame to obtain initial
estimates of the missing blocks. If the motion vectors of the
missing blocks are available, motion compensation is used to
provide the estimates. Otherwise, an algorithm which preserves
image continuity is employed to compute the initial estimates.
In the second stage, a MAP estimator is used for refinement of
the initial estimates. The MAP estimator employs an adaptive
MRF as the image a priori model. The proposed adaptive MRF
takes into account the local image characteristics embedded
not only in the immediate neighborhood pixels of the damaged
area but also in a wider neighborhood without a dramatic in-
crease in computational complexity. Our concealment method
improves on the existing statistical methods in that it yields
good reconstruction performance regardless of the content of
the missing blocks.
To make the compressed data more error resilient, most of
the standard-compliant video compression systems partition
a video frame into Groups of Blocks (GOB’s) or “slices,” which
are coded independently. Therefore, the output bit stream usu-
ally consists of segments separated with markers, where each
segment corresponds to the coded data of the blocks in a GOB or
a slice. When channel errors occur, the decoder usually discards
the erroneous data between two markers surrounding the erro-
neous data, effectively discarding the GOB or the slice. Then,
loss of data of a slice does not affect the rest of the compressed
video sequence. In our work, it is assumed that the frames of
the coded video sequence are partitioned into GOB’s or slices
and, thus, the missing data belong to blocks of a GOB or a slice.
Moreover, it is assumed that the decoder knows the locations of
the missing blocks. This information (e.g., the checksum infor-
mation) can be obtained from the network or it can be inferred,
for example, by detecting the semantic or syntactic violations as
a result of errors [9].
The rest of this paper is organized as follows. In Section II,
MAP estimation of missing data is briefly reviewed. Section III
presents the proposed error concealment method. Section IV
discusses the computational complexity of the proposed
method. Sections V and VI present our experimental results
and conclusions, respectively.
II. MAP ESTIMATION OF MISSING DATA
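Let the corrupted image be represented by the vector $\mathbf{y}$, while
the decompressed image without errors is represented by $\mathbf{x}$.
Using MAP estimation, the decompressed image estimate is given by

$$\hat{\mathbf{x}} = \arg\max_{\mathbf{x}} P(\mathbf{x} \mid \mathbf{y}).$$

Using Bayes' rule, we obtain

$$\hat{\mathbf{x}} = \arg\max_{\mathbf{x}} P(\mathbf{y} \mid \mathbf{x})\, P(\mathbf{x}) \tag{1}$$

where the term $P(\mathbf{y})$ has been dropped because it is independent
of $\mathbf{x}$. The corrupted image can be expressed as
$\mathbf{y} = T\mathbf{x}$, where $T$ is the transform that converts the
decompressed image $\mathbf{x}$ into the corrupted one $\mathbf{y}$.² The
conditional probability of the corrupted image can then be written as
follows:

$$P(\mathbf{y} \mid \mathbf{x}) = \begin{cases} 1, & \mathbf{y} = T\mathbf{x} \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$

Substituting (2) in (1), the MAP estimation becomes

$$\hat{\mathbf{x}} = \arg\max_{\mathbf{x} \in \mathcal{X}} P(\mathbf{x}) \tag{3}$$

where $\mathcal{X} = \{\mathbf{x} : \mathbf{y} = T\mathbf{x}\}$.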
For the image a priori model $P(\mathbf{x})$, we use the MRF.
According to the Hammersley–Clifford theorem, every MRF
has a Gibbs distribution [10] of the following form:

$$P(\mathbf{x}) = \frac{1}{Z} \exp\left( -\sum_{c \in C} V_c(\mathbf{x}) \right) \tag{4}$$

where $Z$ is a normalizing constant, $V_c(\cdot)$ is called a potential
function and is a function of a local group of pixels called a
clique $c$, and $C$ denotes the set of all cliques throughout the image.
The potential function characterizes the relationship between a
group of pixels by assigning larger costs to configurations of
pixels which are less likely to occur. The choice of the potential
function substantially impacts the performance of the image
model. The function $\sum_{c \in C} V_c(\mathbf{x})$ should be convex in
order to have an easily obtainable global minimum. Otherwise, local
minima will be present, and the function must then be minimized
via a computationally expensive technique such as simulated
annealing. Commonly, the potential functions are selected
to be of the form

$$V_c(\mathbf{x}) = \rho\left( \mathbf{d}^{T} \mathbf{x}_c \right) \tag{5}$$

where $\mathbf{d}$ is a coefficient vector, $\mathbf{x}_c$ is the vector of
pixels in the clique $c$, and $\rho(\cdot)$ is the cost function. The
convexity property of the cost function ensures that
$\sum_{c \in C} V_c(\mathbf{x})$ remains convex. The coefficients are
usually selected such that $\mathbf{d}^{T} \mathbf{x}_c$ provides
an approximation of the first or second derivative of the image
at each pixel. For the special case of $\rho(u) = u^2$, the model is
called a Gauss–Markov random field (GMRF).
III. PROPOSED METHOD
As mentioned earlier, our method consists of two stages. The purpose
of the first stage is to obtain an initial estimate of the missing
block using information from the previous frame. If the missing block
has been intra-coded, the initial estimate is set to zero. If the
missing block has been inter-coded, and its motion vector is received
correctly (e.g., because the encoded data have been partitioned), then
the estimate is set to the corresponding motion compensation block.

²See [8] for a more detailed discussion of the transform.

When the motion vector is not available, we find an estimate of the
missing block such that image continuity inside the block and across
its boundaries is preserved [11]. To do this, subblocks adjacent to the
missing block, i.e., $s_1$, $s_2$, $s_3$, and $s_4$ in Fig. 1(a), are
considered. First, for each of these subblocks, a corresponding
subblock in the previous frame is determined. The corresponding
subblock is found by searching a small area around the point
corresponding to the center of each of the subblocks
$s_1, \ldots, s_4$ in the previous frame. The sum of absolute
differences is used as the measure of similarity. The four subblocks
$s_1, \ldots, s_4$ and their corresponding subblocks in the previous
frame, $s'_1, \ldots, s'_4$, are shown in Fig. 1(a) and (b). For
example, $s'_1$ is the subblock corresponding to $s_1$. Then, four
blocks, namely $B_1$, $B_2$, $B_3$, and $B_4$, which are connected to
$s'_1$, $s'_2$, $s'_3$, and $s'_4$ (respectively), are determined.

Fig. 1. (a) The four subblocks, (b) their corresponding subblocks in the
previous frame, and the blocks connected to them.

Fig. 2. A pixel, its clique c, and the eight directions. The complement
of the clique c is the dark area.

To obtain an initial estimate of the missing block that smoothly
connects to the rest of the image, the block among the above four that
minimizes the sum of squared border errors, between the estimated block
and its adjacent above and left blocks, is selected. Thus,

$$\hat{B} = \arg\min_{B_m,\; m = 1, \ldots, 4} e(B_m)$$

where

$$e(B_m) = \|\mathbf{e}_T\|^2 + \|\mathbf{e}_L\|^2. \tag{6}$$

Each of the border errors $\mathbf{e}_T$ and $\mathbf{e}_L$ is defined
in terms of pixels by

$$\mathbf{e}_T = \mathbf{x}_T - \mathbf{b}_T$$

and

$$\mathbf{e}_L = \mathbf{x}_L - \mathbf{b}_L$$

where the vector $\mathbf{x}_T$ consists of the bottom line pixels of
the block to the top of the missing block, and $\mathbf{x}_L$ consists
of the right column pixels of the block to the left of the missing
block. The vectors $\mathbf{b}_T$ and $\mathbf{b}_L$ are those elements
of the estimated block that correspond to the pixels in its top row and
left column, respectively. To ensure the online applicability (e.g.,
concealment while decoding the blocks in raster scan), only subblocks
to the left of and above the missing block are used. If the condition
of online concealment is relaxed, the same method can be extended to
include the subblocks of the blocks to the right of and below the
missing block.
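The block selection based on the border error of (6) can be sketched as follows. This is a minimal illustration, assuming the error is the sum of squared differences along the top and left borders; the function names `border_error` and `select_initial_estimate` and the list-of-rows block layout are hypothetical.

```python
def border_error(candidate, above_bottom_row, left_right_col):
    """Sum of squared border differences against the top and left neighbors.

    candidate         : candidate block as a list of rows
    above_bottom_row  : bottom row of the block above the missing block
    left_right_col    : right column of the block left of the missing block
    """
    top_row = candidate[0]
    left_col = [row[0] for row in candidate]
    e_top = sum((a - b) ** 2 for a, b in zip(above_bottom_row, top_row))
    e_left = sum((a - b) ** 2 for a, b in zip(left_right_col, left_col))
    return e_top + e_left


def select_initial_estimate(candidates, above_bottom_row, left_right_col):
    """Return the candidate block with the smallest border error."""
    return min(
        candidates,
        key=lambda blk: border_error(blk, above_bottom_row, left_right_col),
    )


# Toy 2 x 2 example: the first candidate continues the neighbors exactly.
above = [5, 5]
left = [5, 5]
candidates = [[[5, 5], [5, 5]], [[0, 0], [0, 0]]]
best = select_initial_estimate(candidates, above, left)
```

Only the top and left borders enter the error here, which matches the online (raster scan) setting; extending the sketch to the right and bottom borders would only add two more terms to `border_error`.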
The estimation of motion of the subblocks has a high computational
overhead, which can introduce unacceptable requirements on the
decoder. The computational overhead can be reduced if the search for
the displacement of each of the subblocks is
restricted to a set of candidate motion vectors. This is a decoder
option and can be used to trade performance against computa-
tional complexity. The set consists of the motion vector of the
block corresponding to the missing block in the previous frame,
the motion vectors of available neighboring blocks, the median
of the motion vectors of available neighboring blocks, the av-
erage of the motion vectors of available neighboring blocks, and
the zero motion vector [2].
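A sketch of how such a restricted candidate set might be assembled is given below. It assumes that motion vectors are (dy, dx) tuples and that the median is taken component-wise, which the text does not specify; the function name `candidate_motion_vectors` is hypothetical.

```python
from statistics import median


def candidate_motion_vectors(colocated_mv, neighbor_mvs):
    """Build the restricted candidate set for the subblock search.

    colocated_mv : motion vector of the block at the same position in the
                   previous frame
    neighbor_mvs : motion vectors of the available neighboring blocks
    Assumption: the median is computed separately for each component.
    """
    candidates = [colocated_mv, (0, 0)]
    candidates.extend(neighbor_mvs)
    if neighbor_mvs:
        ys = [mv[0] for mv in neighbor_mvs]
        xs = [mv[1] for mv in neighbor_mvs]
        candidates.append((median(ys), median(xs)))
        candidates.append((sum(ys) / len(ys), sum(xs) / len(xs)))
    # Remove duplicates while keeping the original order.
    unique = []
    for mv in candidates:
        if mv not in unique:
            unique.append(mv)
    return unique


mvs = candidate_motion_vectors((1, 0), [(2, 0), (4, 2), (9, 1)])
```

Each subblock is then matched only at the displacements in this list instead of at every position of the search area, which is where the reduction in operations comes from.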
In the second stage of our proposed error-concealment
method, the information around each missing block is used
to refine the initial estimate. This stage is based on a MAP
estimator. We consider a GMRF model with an eight-pixel
clique around each pixel as the image a priori model (see
Fig. 2). The reason for selecting an eight-pixel clique in the
manner shown in Fig. 2 will become clear later. The potential
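function of (5) is selected as

$$V_{c_{i,j}}(\mathbf{x}) = \sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)} \left( x_{i,j} - x_{k,l} \right)^2 \tag{7}$$

where $c_{i,j}$ is the clique of the pixel in position $(i,j)$, and
$w_{(i,j),(k,l)}$ is the weight assigned to the difference between the
values of the pixel in position $(i,j)$ and the pixel in its clique at
position $(k,l)$.
Combining the MAP estimator of (3) with the image model
[(4) and (7)], the restoration of missing data results in the
following minimization problem:

$$\hat{\mathbf{x}} = \arg\min_{x_{i,j},\; (i,j) \in M} \; \sum_{(i,j)} \sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)} \left( x_{i,j} - x_{k,l} \right)^2 \tag{8}$$

where $M$ is the set of all missing pixels in the frame. Since the
objective in (8) is a convex function, the above minimization
problem yields a unique global solution. In fact, the estimated
value of a pixel is given by

$$\hat{x}_{i,j} = \frac{\sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)}\, x_{k,l} + \sum_{(k,l) \in \bar{c}_{i,j}} w_{(k,l),(i,j)}\, x_{k,l}}{\sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)} + \sum_{(k,l) \in \bar{c}_{i,j}} w_{(k,l),(i,j)}} \tag{9}$$

where $c_{i,j}$ is the clique and $\bar{c}_{i,j}$ is its complement shown in Fig. 2.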
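The closed form in (9) follows from setting the derivative of the quadratic objective in (8) to zero. The short derivation below is a sketch under assumed notation: $c_{i,j}$ denotes the clique of pixel $(i,j)$, $\bar{c}_{i,j}$ denotes its complement (the pixels whose cliques contain $(i,j)$), and $w_{(i,j),(k,l)}$ is the weight of the pair; these symbols are a reconstruction and not necessarily the authors' original ones.

```latex
\begin{align*}
J &= \sum_{(i,j)} \sum_{(k,l) \in c_{i,j}}
     w_{(i,j),(k,l)} \left( x_{i,j} - x_{k,l} \right)^2 \\
\frac{\partial J}{\partial x_{i,j}}
  &= 2 \sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)} \left( x_{i,j} - x_{k,l} \right)
   + 2 \sum_{(k,l) \in \bar{c}_{i,j}} w_{(k,l),(i,j)} \left( x_{i,j} - x_{k,l} \right)
   = 0 \\
\Rightarrow \quad \hat{x}_{i,j}
  &= \frac{\sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)}\, x_{k,l}
         + \sum_{(k,l) \in \bar{c}_{i,j}} w_{(k,l),(i,j)}\, x_{k,l}}
          {\sum_{(k,l) \in c_{i,j}} w_{(i,j),(k,l)}
         + \sum_{(k,l) \in \bar{c}_{i,j}} w_{(k,l),(i,j)}}
\end{align*}
```

With all weights equal, this reduces to the plain average of the surrounding pixels, which is the nonadaptive GMRF estimate used for comparison in Section V.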
In our adaptive MRF model, the weight $w_{(i,j),(k,l)}$ corresponding to
the difference between a pixel and one of the pixels in its clique
[in (9)] is selected adaptively, based on the likelihood of an edge in
the direction of the subject pair of pixels. The rationale behind this
selection is to weigh more heavily the difference between the pixels in
that direction, which will cause the values of the pixels in that
direction to get closer to each other. The likelihood of edges in each
of the eight directions is computed using blocks around the missing
block. In this way, the available information in a larger area is
exploited without increasing the order of the MRF model (which
dramatically increases the computational complexity). To determine the
likelihood of edges in each of the eight directions, edges in the
blocks surrounding the missing block whose directions imply that they
pass through the missing block are determined. That is, we first obtain
the local gradient components $g_x(i,j)$ and $g_y(i,j)$ for every pixel
$(i,j)$ in the blocks to the left, right, top, and bottom of the
missing block. The magnitude and angular direction of the edge at pixel
$(i,j)$ are

$$G(i,j) = \sqrt{g_x^2(i,j) + g_y^2(i,j)}$$

and

$$\theta(i,j) = \arctan\frac{g_y(i,j)}{g_x(i,j)}$$

where $\theta(i,j)$ determines if the edge at pixel $(i,j)$ passes
through the missing block. Since there are eight pixels in the clique,
the value of $\theta$ is rounded to one of the eight directions equally
spaced in the range from 0° to 180°. There is a counter $D_m$ for each
of the eight directions $m = 1, \ldots, 8$. If the extension of an edge
at pixel $(i,j)$ belonging to one of the neighboring blocks passes
through the missing area, the counter for that particular direction is
incremented by the amount of $G(i,j)$. Since the employed edge detector
is sensitive to the image noise, the values of the counters $D_m$ are
compared to a threshold. If any of them is less than the threshold, it
is set to zero.
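The accumulation of the direction counters can be sketched as follows. This is only an illustration under several assumptions: the gradient is computed with a Sobel operator (the edge detector is not named here), the angle is quantized to the nearest of eight bins spaced 22.5° apart, and the geometric test of whether an edge extends through the missing block is omitted. The function name `direction_counters` and its parameters are hypothetical.

```python
import math


def direction_counters(block, threshold=0.0, n_dirs=8):
    """Accumulate gradient magnitude into n_dirs angle bins over [0, 180).

    Assumptions: Sobel gradients, nearest-bin rounding of the angle, and
    no test of whether the edge actually passes through the missing block.
    """
    rows, cols = len(block), len(block[0])
    counters = [0.0] * n_dirs
    bin_width = 180.0 / n_dirs
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx = (block[i - 1][j + 1] + 2 * block[i][j + 1] + block[i + 1][j + 1]
                  - block[i - 1][j - 1] - 2 * block[i][j - 1] - block[i + 1][j - 1])
            gy = (block[i + 1][j - 1] + 2 * block[i + 1][j] + block[i + 1][j + 1]
                  - block[i - 1][j - 1] - 2 * block[i - 1][j] - block[i - 1][j + 1])
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            theta = math.degrees(math.atan2(gy, gx)) % 180.0
            m = int(round(theta / bin_width)) % n_dirs  # wrap 180 back to 0
            counters[m] += magnitude
    # Suppress weak, noise-induced responses.
    return [c if c >= threshold else 0.0 for c in counters]


step_block = [[0, 0, 100, 100] for _ in range(4)]
counts = direction_counters(step_block)
```

For the vertical step in `step_block`, all of the gradient magnitude falls into a single bin, so only one counter is nonzero; raising `threshold` above that value clears it as well.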
There are eight pixels in the clique of each pixel, and eight
directions for the detected edges. Each pixel in the clique of a pixel
$(i,j)$ corresponds to a direction. In our proposed method,
$w_{(i,j),(k,l)}$ is selected based on the edge counter of the
direction corresponding to $(i,j)$ and $(k,l)$, i.e.,
(10)
where $\alpha$ is a constant and $D_m$ is the counter corresponding to
direction $m$, and direction $m$ corresponds to the direction formed
by $(i,j)$ and $(k,l)$. Finally, the second stage of the proposed
error-concealment method can be summarized as follows. 1) Determine the
edges in the neighboring blocks and assign them to eight equally spaced
directions. Compute the counter for each direction. 2) Use (10) to find
a set of weights for each missing block. 3) Use (9) to obtain an
estimate of each missing pixel employing the weights obtained in the
previous step. 4) Iteratively reestimate the missing pixels using (9)
until convergence. Note that since the cost function is convex,
convergence is guaranteed.
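Steps 3) and 4) above can be sketched as a simple iterative update in which every missing pixel is replaced by the weighted mean of the pixels in its clique and in the complement of the clique. The pixel offsets in `OFFSETS` are an assumed layout for the eight directions (Fig. 2 is not reproduced here), the weights are taken as an input rather than computed from (10), and the name `refine` is hypothetical.

```python
# Assumed (row, col) offsets for eight directions between 0 and 180 degrees;
# the mirrored offsets form the complement of the clique.
OFFSETS = [(0, 1), (-1, 2), (-1, 1), (-2, 1), (-1, 0), (-2, -1), (-1, -1), (-1, -2)]


def refine(image, missing, weights, max_iters=200, tol=1e-3):
    """Iteratively re-estimate the missing pixels by a weighted mean.

    image   : list of rows holding the initial estimate
    missing : list of (row, col) positions of missing pixels
    weights : one nonnegative weight per direction in OFFSETS
    """
    rows, cols = len(image), len(image[0])
    img = [[float(v) for v in row] for row in image]
    for _ in range(max_iters):
        largest_change = 0.0
        for (i, j) in missing:
            num = 0.0
            den = 0.0
            for m, (di, dj) in enumerate(OFFSETS):
                # A clique pixel and its mirror image share the weight of direction m.
                for (ni, nj) in ((i + di, j + dj), (i - di, j - dj)):
                    if 0 <= ni < rows and 0 <= nj < cols:
                        num += weights[m] * img[ni][nj]
                        den += weights[m]
            if den > 0:
                new_value = num / den
                largest_change = max(largest_change, abs(new_value - img[i][j]))
                img[i][j] = new_value
        if largest_change < tol:
            break
    return img


flat = [[50] * 5 for _ in range(5)]
flat[2][2] = 0  # missing pixel, initialized to zero
restored = refine(flat, [(2, 2)], [1.0] * 8)
```

With equal weights the update is a plain neighborhood average; giving one direction a much larger weight pulls the estimate toward the pixels along that direction, which is the behavior the adaptive model relies on near edges.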
For intra-coded blocks, the missing pixel values are initially set to
zero and then the MAP estimator, using the adaptive MRF and the data of
the neighboring blocks, is applied. For missing inter-coded blocks, an
initial estimate is obtained using information from the previous frame.
Estimates of the prediction errors are found using the adaptive MRF
model along with the prediction error signals of the neighboring
blocks. The estimated values are then added to the initial estimates.
Since the prediction error signal consists mostly of high-frequency
components, the MAP estimation stage (i.e., the second stage) will
improve the video reconstruction quality, especially around the edges.
IV. COMPUTATIONAL COMPLEXITY
The numbers provided below for computational complexity correspond to
a block and subblock of sizes 16 × 16 and 8 × 8, respectively. The
computational load of the first stage of the proposed method consists
of those computations required in 1) estimating the motion, and 2)
computing the error of (6). For motion estimation, we used a spiral
search method using an area of 16 × 16. This requires approximately
197 000 operations. If the search for the displacement of each of the
subblocks is restricted to the set of candidate motion vectors
explained in Section III, the required number of operations is reduced
to 6100. The number of operations required to find the total error of
(6) for four blocks is approximately 380.
For the second stage, the computational load consists of those
computations required in adapting the weights of the MRF model, which
are approximately 18 400 operations for a missing block, assuming four
available neighboring blocks. The estimation of missing pixels using
(9) is an iterative procedure which, for each iteration, requires
approximately 30 operations for each missing pixel, or 16 × 16 × 30
operations for each missing block. On average, 80 iterations are
required for the algorithm to converge for a block.
Clearly, the computational load of our error-concealment
method is quite reasonable. Our simulation experiments
confirm that the run time of our method is indeed acceptable.
In fact, real-time decoding (e.g., 10 frames/s for QCIF video
sequences) is still possible on a Pentium 300 MHz PC.
V. EXPERIMENTAL RESULTS
Although the method proposed in this paper is general and can be
applied to any block-based video compression method, H.263 is used as
our video coding framework. QCIF (176 × 144) video sequences at a
temporal resolution of 10 frames/s are coded at 64 kbps and then
decoded in the presence of slice/GOB errors. The size of the blocks is
16 × 16, and that of the subblocks is 8 × 8.

Fig. 3. PSNR values for image sequence Foreman with 10% GOB missing for
different concealment methods.

To simulate the channel errors, the following tasks are
performed. Coded video information is first grouped into
packets, where each video packet consists of the coded data of
a GOB or a slice. The video packets are then multiplexed with
audio information according to the H.223 standard. We used
an H.223 multiplexing simulator which receives video packets,
simulates audio traffic, applies errors to the multiplexed bit
stream according to an error pattern stored in a file, and
outputs the packets [12]. The error pattern that we employed
corresponds to a mobile channel [13]. Burst errors will most
likely not corrupt two consecutive video packets, since audio
packets are inserted between them. The erroneous bit stream
is decoded such that the effect of errors will appear as missing
slices/GOB’s.
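The effect of a lost packet on a decoded frame can be sketched as a mask of missing pixels. The sketch below assumes QCIF luminance frames and that each GOB covers one row of eleven 16 × 16 blocks, so that a frame contains 144 / 16 = 9 GOBs; the function name `missing_mask` is hypothetical, and the error-pattern file and H.223 multiplexing are not modeled.

```python
WIDTH, HEIGHT, BLOCK = 176, 144, 16  # QCIF luminance size and block size


def missing_mask(lost_gobs):
    """Return a HEIGHT x WIDTH boolean mask, True where pixels are missing.

    lost_gobs : indices (0-based, top to bottom) of the GOBs whose packets
                were discarded. Assumption: one GOB is one row of blocks.
    """
    n_gobs = HEIGHT // BLOCK
    mask = [[False] * WIDTH for _ in range(HEIGHT)]
    for g in lost_gobs:
        if not 0 <= g < n_gobs:
            raise ValueError(f"GOB index {g} outside 0..{n_gobs - 1}")
        for r in range(g * BLOCK, (g + 1) * BLOCK):
            mask[r] = [True] * WIDTH
    return mask


mask = missing_mask([2])
lost_pixels = sum(row.count(True) for row in mask)
```

The positions marked `True` are the pixels that the concealment stages would have to estimate; for slice-based packetization with one block per packet, the same idea applies with a 16 × 16 region per lost packet.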
For inter-coded blocks, our error concealment consists of ob-
taining an initial estimate of the missing block using informa-
tion from the previous frame, and finding an estimate of the pre-
diction error using the adaptive MRF model. We compare the
performance of the above method with two other methods: 1)
a replacement method, where a missing block is replaced with
the same block in the previous frame; and 2) a median method,
where the motion vector of a missing block is set to the median
of motion vectors of blocks to the left, above and above-right
of it, and the estimated motion vector is used to obtain a mo-
tion compensation block which serves as the concealment of the
missing block. Here, we consider only the case where a GOB is
placed in each of the video packets. Shown in Fig. 3 are the con-
cealment PSNR values for the three methods mentioned above
for the video sequence Foreman with a 10% packet (GOB) loss
rate. As can be seen, for video sequences with a large amount of
motion like Foreman, the performance of the proposed method
and the median method are close to each other, and both are
better than the replacement method. Although the performance of the
above methods is comparable in terms of PSNR, the video sequences
obtained using the proposed method are more visually appealing than
the ones obtained using the other two methods. Fig. 4(a), (b), and (c)
show the reconstructed video frame (inter-coded) of the video sequence
Foreman. As can be seen in the images, the replacement method does not
generate good results, especially around the nose area. Moreover, the
concealment result of the proposed method is clearly better than
that of the other two methods.

Fig. 4. An inter-coded frame of the sequence Foreman concealed by the (a)
replacement, (b) median, and (c) proposed methods.
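The two reference methods used in this comparison can be sketched as follows. The sketch assumes that a motion vector is a (dy, dx) displacement added to the block position in the previous frame, that displaced positions are clamped to the frame, and that the median is taken component-wise; the function names are hypothetical.

```python
from statistics import median


def fetch_block(prev_frame, top, left, size, mv=(0, 0)):
    """Copy a size x size block from prev_frame, displaced by mv = (dy, dx)."""
    rows, cols = len(prev_frame), len(prev_frame[0])
    dy, dx = mv
    block = []
    for i in range(size):
        r = min(max(top + i + dy, 0), rows - 1)  # clamp to the frame
        block.append(
            [prev_frame[r][min(max(left + j + dx, 0), cols - 1)] for j in range(size)]
        )
    return block


def conceal_replacement(prev_frame, top, left, size):
    """Replacement method: copy the co-located block of the previous frame."""
    return fetch_block(prev_frame, top, left, size)


def conceal_median(prev_frame, top, left, size, mv_left, mv_above, mv_above_right):
    """Median method: motion-compensate with the median of three neighbor vectors."""
    mvs = [mv_left, mv_above, mv_above_right]
    mv = (int(median(m[0] for m in mvs)), int(median(m[1] for m in mvs)))
    return fetch_block(prev_frame, top, left, size, mv)


prev = [[r * 4 + c for c in range(4)] for r in range(4)]
same_place = conceal_replacement(prev, 1, 1, 2)
shifted = conceal_median(prev, 1, 1, 2, (0, 1), (0, 1), (1, 0))
```

The replacement method is the special case of a zero motion vector, which is why it degrades when the sequence contains a large amount of motion.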
For intra-coded blocks, we compare the performance of our
method (which consists of estimating the missing block using
the adaptive MRF model) to that of four other efficient methods:
1) a MAP estimator using a nonadaptive GMRF as the a priori
model where each missing pixel is basically set to the average
of the pixels around it, 2) a suboptimal version of the MAP esti-
mator where a missing pixel is set to the median of pixels around
it [8], 3) the deterministic method proposed in [4], and 4) the
deterministic method proposed in [3]. Although there are other
deterministic methods that are likely to outperform the above
two methods, for example, the ones proposed in [6] and [7], we
restrict our comparison to the above two deterministic methods.
In fact, a direct comparison of our method, which is statistical,
to deterministic methods is difficult because of different design
approaches. Our main aim here is to compare our method to pre-
viously proposed statistical methods.
In this experiment, we employ different video sequences and
different packetization schemes. In the first set of simulation
experiments, we assign a slice which consists of one block,
to a packet. Fig. 5(a) shows a frame of the image sequence
Foreman encoded and decoded using an H.263 compliant coder.
Fig. 5(b)–(g), shows (respectively) the same frame: 1) missing
approximately 20% of the packets (blocks), 2) reconstructed
using the nonadaptive GMRF model, 3) using the suboptimal
MRF model, 4) using the method proposed in [4], 5) using the
method proposed in [3], and 6) using our proposed adaptive
MRF error concealment method. Clearly, our proposed method
performs best in reconstruction quality, particularly in retrieving
the edges and in the areas of the frame that correspond to adja-
cent missing blocks.
Fig. 5. A frame from the video sequence Foreman: (a) original; (b) missing
blocks; reconstructed using (c) a nonadaptive GMRF model; (d) a suboptimal
MRF model; (e) the method proposed in [4]; (f) the method proposed in [3]; and
(g) our adaptive MRF model.

Fig. 6. A frame from the video sequence Foreman: (a) with missing GOB’s,
reconstructed using (b) a nonadaptive GMRF model; (c) a suboptimal MRF
model; (d) the method proposed in [4]; (e) the method proposed in [3]; and (f)
our adaptive MRF model.

In the next set of simulation experiments, we assign the coded data of
a GOB, which consists of 11 blocks, to each of the video packets.
Fig. 6(a) shows a frame of the video sequence Foreman with two missing
packets (GOB’s), at approximately 18% loss rate. Fig. 6(b) and (c) show
the error-concealment results obtained using the nonadaptive GMRF and
suboptimal MRF models, respectively. Fig. 6(d) and (e) show the results
obtained using the methods proposed in [4] and [3], respectively.
Fig. 6(f) shows the result obtained using our adaptive MRF model. A
comparison of the figures makes the superior performance of our method
evident. This performance advantage is demonstrated in the blocks that
contain edges. For a quantitative evaluation, Table I provides the PSNR
values of the above concealment methods (for both packetization cases)
for the video sequence Foreman. The table demonstrates that our
method outperforms the other methods by at least 2 dB.
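For reference, PSNR values such as those reported here are commonly computed from the mean squared error between the error-free decoded frame and the concealed frame. The sketch below assumes 8-bit luminance samples (peak value 255) and the usual definition of PSNR, which is not spelled out in the text; the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized grayscale frames (lists of rows)."""
    count = 0
    squared_error = 0.0
    for ref_row, rec_row in zip(reference, reconstructed):
        for a, b in zip(ref_row, rec_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


reference = [[100, 100], [100, 100]]
concealed = [[100, 110], [100, 90]]
value_db = psnr(reference, concealed)
```

Because PSNR is logarithmic in the mean squared error, a gain of 2 dB corresponds to a reduction of the mean squared error by a factor of roughly 1.6.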
TABLE I
PSNR (dB) COMPARISON OF DIFFERENT
METHODS FOR THE VIDEO SEQUENCE FOREMAN

Finally, we compare the computational complexity of our method, in
terms of the required number of additions and multiplications, to that
of the methods proposed in [4] and [3]. The total number of additions
and multiplications required for restoring a 16 × 16 block is
approximately 4500 and 20 000 using the method proposed in [4] and our
method, respectively. The method proposed in [3] requires approximately
650 000 additions and multiplications to restore a 16 × 16 block.
Therefore, our method requires approximately 4.4 times as many
computations as the method of [4] and approximately 3% of those
required by the method of [3]. However, our method outperforms [4]
substantially in terms of the quality of the reconstructed images. This
can be seen in Figs. 5 and 6. This quality improvement justifies, in
most applications, the additional computational complexity.
VI. CONCLUSION
This paper introduces a new method of error concealment
for block-based coded video communications over error-prone
channels. The proposed method employs an adaptive MRF
as the image a priori model in a MAP estimation paradigm.
The adaptation enables the estimation procedure to incorporate
more information without increasing the order of the MRF.
The proposed concealment method is able to restore missing
blocks located in smooth and low-frequency areas, as well as
in high-frequency and edge portions of a video frame. Our
concealment method achieves very good computation–per-
formance tradeoffs, and outperforms previously proposed
MRF-based concealment methods.
ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers for
their insightful comments and constructive suggestions. The au-
thors would also like to gratefully acknowledge the help and
support provided by G. Cote, their colleague in the Signal Pro-
cessing and Multimedia Group at the University of British Co-
lumbia.
REFERENCES
[1] M. Ghanbari and V. Seferidis, “Cell loss concealment in ATM video
codecs,” IEEE Trans. Circuits Syst. Video Technol., vol. 3, pp. 238–247,
June 1993.
[2] W. M. Lam, A. R. Reibman, and B. Liu, “Recovery of lost or erroneously
received motion vectors,” in Proc. Int. Conf. Acoust., Speech, Signal
Processing, vol. V, 1993, pp. 417–420.
[3] Y. Wang, Q. F. Zhu, and L. Shaw, “Maximally smooth image recovery
in transform coding,” IEEE Trans. Commun., vol. 41, pp. 1544–1551,
Oct. 1993.
[4] S. S. Hemami and T. H. Y. Meng, “Transform coded image reconstruc-
tion exploiting interblock correlations,” IEEE Trans. Image Processing,
vol. 4, pp. 1023–1027, July 1995.
[5] H. Sun and W. Kwok, “Concealment of damaged block transform coded
images using projection onto convex sets,” IEEE Trans. Image Pro-
cessing, vol. 4, pp. 470–477, Apr. 1995.
[6] W. Kwok and H. Sun, “Multi-directional interpolation for spatial error
concealment,” IEEE Trans. Consumer Electron., vol. 39, pp. 455–460,
Aug. 1993.
[7] W. Zeng and B. Liu, “Geometric-structure-based error concealment with
novel applications in block-based low-bit-rate coding,” IEEE Trans. Cir-
cuits Syst. Video Technol., vol. 9, pp. 648–665, June 1999.
[8] P. Salama, N. Shroff, E. J. Coyle, and E. J. Delp, “Error concealment
in encoded video streams,” in Signal Recovery Techniques for Image
and Video Compression and Transmission, N. P. Galatsanos and A. K.
Katsaggelos, Eds. Boston, MA: Kluwer Academic, 1998.
[9] R. Talluri, “Error resilient video coding in the MPEG-4 standard,” IEEE
Commun. Mag., vol. 26, pp. 112–119, June 1998.
[10] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distribution, and
the Bayesian restoration of images,” IEEE Trans. Pattern Anal. Machine
Intell., vol. 11, pp. 689–691, 1984.
[11] S. Shirani, F. Kossentini, and R. Ward, “Reconstruction of motion vector missing macroblocks in H.263 encoded video transmission over lossy networks,” in Proc. Int. Conf. Image Processing, vol. III, Oct. 1998, pp. 487–491.
[12] G. Sullivan, “A Simple Video Packet Mux Simulator Program for Video Streams in H.263/M Using AL3 Mux of H.223 Annex B,” ITU-T Study Group 15, Video Coding Expert Group, Doc. Q15-F-16, Nov. 1996.
[13] Ericsson Sweden, “WCDMA Error Patterns at 64 kb/s,” ITU-T Study Group 16, Multimedia Terminals and Systems Expert Group, Cannes, France, June 1998.
Shahram Shirani (S’88–M’89) received the B.Sc. degree in electrical engineering from Esfahan University of Technology, Esfahan, Iran, in 1989 and the M.Sc. degree in biomedical engineering from Amirkabir University of Technology, Tehran, Iran, in 1994. Currently, he is pursuing the Ph.D. degree in the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, B.C., Canada.
From 1994 to 1996, he was with the Department of Electrical Engineering, University of Tehran. His research interests include image and video compression, video communications, signal processing, and ultrasonic imaging.
Faouzi Kossentini (S’89–M’89–SM’98) received the B.S., M.S., and Ph.D. degrees from the Georgia Institute of Technology, Atlanta, in 1989, 1990, and 1994, respectively.
During 1995, he was a Research Scientist at Nichols Research Corporation, Huntsville, AL. Since January 1996, he has been an Assistant Professor and then an Associate Professor in the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, B.C., Canada, where he is involved in research in the areas of signal processing, communications, and multimedia. He has coauthored more than 100 journal papers, conference papers, book chapters, and patents.
Dr. Kossentini has been active as a Voting Member, and recently as Head of Delegation, of the Canadian delegation to ISO/IEC JTC1/SC29, which is responsible for the standardization of coded representation of audiovisual, multimedia, and hypermedia information. In particular, he has participated in most current JBIG/JPEG and MPEG-4 standardization activities. He has also participated in most current ITU-T low bit rate video coding standardization activities. He has served as a technical area coordinator and member of the technical program committee of ICIP-1997, and as a member of the technical program committee of ISCAS-1999. He is also the Vice General Chairman of ICIP-2000. Dr. Kossentini is currently an Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING. He is also an Associate Editor for the IEEE TRANSACTIONS ON MULTIMEDIA.
Rabab Ward (F’98) received the B.Sc. degree in electrical engineering from the University of Cairo, Egypt, in 1966 and the M.Sc. and Ph.D. degrees in electrical and computer engineering from the University of California, Berkeley, in 1969 and 1972, respectively.
She is a Professor in the Electrical and Computer Engineering Department, University of British Columbia, Vancouver, B.C., Canada, and is the Director of the Center for Integrated Computer Systems Research there. Her research interests are mainly in the areas of signal processing and image processing. She has made contributions in the areas of signal detection, image encoding, compression, recognition, restoration and enhancement, and their applications to infant cry signals, cable TV, HDTV, medical images, and astronomical images. She holds five patents related to cable television picture monitoring, measurement, and noise reduction.
Dr. Ward is a fellow of the EIC and the Royal Society of Canada. She is currently serving as the General Chair of ICIP’2000 to be held in Vancouver.
Despite recent progress in reliable, high-bandwidth communication, packet loss is still probable and needs special attention in real-time video streaming applications. Congestion and bit error rates that exceed the protection capability of the channel codes are the sources of packet loss in video communication. One common approach to dealing with video packet loss is to use error concealment techniques, which estimate the non-received data as closely as possible to the actual data. This article reviews the temporal video error concealment methods that have been developed over the past 30 years. The techniques are categorized into 8 groups, and the methods are covered in sufficient detail. The strengths and weaknesses of the 8 groups are also tabulated, and some suggestions for future work and open areas for research are provided.
With the advancement of wireless technology and the processing power of mobile devices, every handheld device supports numerous video streaming applications. Generally, the user datagram protocol (UDP) is used in video transmission, and it does not provide assured quality of service (QoS). Therefore, there is a need for video post-processing modules for error concealment. In this paper we propose one such algorithm to recover multiple lost blocks of data in video. The proposed algorithm is based on a combination of the wavelet transform and spatio-temporal data estimation. We decompose the frame with lost blocks using the wavelet transform into low and high frequency bands. Then the approximate information (low frequency) of the missing block is estimated using spatial smoothing, and the details (high frequency) are added using bidirectional (temporal) prediction of high frequency wavelet coefficients. Finally, the inverse wavelet transform is applied to the modified wavelet coefficients to recover the frame. The proposed algorithm estimates the missing block automatically in a spatio-temporal manner. Experiments are carried out with different YUV and compressed domain streams. The experimental results show enhancement in PSNR as well as visual quality, and are cross-verified by video quality metrics (VQM).
Video communication over error-prone channels requires well-designed loss concealment techniques to improve the quality of service without using extra bandwidth. Modern video coding standards allow further reduction in the bit rate, and in very low bit-rate video transmission the absence of one or more video packets may lead to the loss of one or more consecutive video frames. Although several spatial and temporal error concealment algorithms have been introduced in standard video codecs and the literature, none of them can be applied to whole-frame restoration, because no neighboring macroblock is available for intra- or inter-frame error resilience approaches. In this paper, a real-time adaptive concealment method based on generalized regression neural networks is proposed to restore lost predictively coded pictures in conversational video streams and suppress error propagation through them. Our simulations show an average quality improvement of up to 4.5 dB over the conventional frame copy approach.
We propose a joint source-channel coding scheme, developed for video sequences, which consists of a vector quantization based on lattice constellations and a linear labelling minimizing simultaneously the source and the channel distortion. The linear labelling has already been proved to minimize the channel distortion on binary symmetric channels and the linear transforms based on lattice constellations of maximum diversity to minimize, at the same time, the distortion of Gaussian sources. We study the dependencies between the wavelet coefficients of a $t+2D$ video decomposition in order to efficiently exploit the linear transforms developed for Gaussian sources. As the source distribution of the subbands is not Gaussian, we present the necessary modifications in order to obtain a robust coding scheme. We propose a stochastic model to capture the dependencies between the wavelet coefficients and we use it to build an optimal mean square predictor for missing coefficients. We present two applications of this predictor to transmission over packet networks: a quality enhancement technique for resolution scalable video bitstreams and an error concealment method. We develop a robust joint source-channel coding scheme for transmission of video sequences over a Gaussian channel using uncoded and coded index assignment via Reed-Muller codes. We investigate the conditions requiring the use of a coded index assignment and we prove its superiority compared to an unstructured vector quantizer. For a transmission over a flat-Rayleigh channel, we develop a robust coding scheme using our vector quantizer followed by a rotation matrix.
To overcome the degradation of video quality in intra frames caused by transmission errors, a novel spatial error concealment algorithm using the minimum variance of edge of border pixels is proposed. For every lost block, the neighboring pixel most relevant to each border pixel is calculated and the edge direction is estimated. Then the minimum variance of edge of border pixels is employed to recover the lost pixels, which efficiently exploits the estimated edge direction and adjacent border pixels. The simulation results show that, compared with conventional algorithms, the proposed algorithm improves the quality of the reconstructed video and obtains a PSNR gain of about 0.2 to 2.5 dB for different block loss rates and different video sequences.
Error concealment estimates the pixel values in a block lost during video transmission and so improves the image quality. However, existing error concealment algorithms suffer from over-smoothing and lose some image edge information inside the lost block. To overcome this shortcoming, a new error concealment algorithm is proposed in this paper. The algorithm first uses the error concealment method in H.264 to make an initial concealment. After that, the image is transformed into the Haar wavelet domain to further enhance the edge information. After the inverse wavelet transform, the image is finally enhanced by using anisotropic diffusion. Simulation results demonstrate that the proposed algorithm preserves the image edge information during concealment, avoiding the over-smoothing and edge information loss of conventional error concealment algorithms.
This chapter discusses the error-resilient coding and decoding strategies for video communication. It is important to understand that video can benefit significantly if the transmitter can be sure that the video will be delivered reliably. Typically, the introduction of error-resilience tools in the video coding layer is very costly in terms of compression efficiency. The overhead is in general much better spent in lower layers of the protocol stack. Nevertheless, there exist applications in which errors are inevitable. If the video encoder is not aware of distortions on the transmission link, this in general leads to dramatic quality degradations due to instantaneous errors as well as spatial-temporal error propagation. Whereas the effect of instantaneous errors can be decreased by the use of specific packetization modes, the usually more severe effect of error propagation can be reduced by the application of more frequent intra-information, interactive error control, or a combination of both.
When transmitting compressed video over a data network, one has to deal with how channel errors affect the decoding process. This is particularly a problem with data loss or erasures. In this paper we describe techniques to address this problem in the context of asynchronous transfer mode (ATM) networks. Our techniques can be extended to other types of data networks such as wireless networks. In ATM networks channel errors or congestion cause data to be dropped, which results in the loss of entire macroblocks when MPEG video is transmitted. In order to reconstruct the missing data, the location of these macroblocks must be known. We describe a technique for packing ATM cells with compressed data, whereby the location of missing macroblocks in the encoded video stream can be found. This technique also permits the proper decoding of correctly received macroblocks, and thus prevents the loss of ATM cells from affecting the decoding process. The packing strategy can also be used for wireless or other types of data networks. We also describe spatial and temporal techniques for the recovery of lost macroblocks. In particular, we develop several optimal estimation techniques for the reconstruction of missing macroblocks that contain both spatial and temporal information using a Markov random field model. We further describe a sub-optimal estimation technique that can be implemented in real time
An algorithm for lost signal restoration in block-based still image and video sequence coding is presented. Problems arising from imperfect transmission of block-coded images result in lost blocks. The resulting image is flawed by the absence of square pixel regions that are notably perceived by human vision, even in real-time video sequences. Error concealment is aimed at masking the effect of missing blocks by use of temporal or spatial interpolation to create a subjectively acceptable approximation to the true error-free image. This paper presents a spatial interpolation algorithm that addresses concealment of lost image blocks using only intra-frame information. It attempts to utilize spatially correlated edge information from a large local neighborhood of surrounding pixels to restore missing blocks. The algorithm is a Gerchberg-type spatial domain/spectral domain constraint-satisfying iterative process, and may be viewed as an alternating projections onto convex sets method. In this paper, we introduce a proposed method of error concealment based on projections onto convex sets (POCS) [7]. The POCS method of image restoration attempts to satisfy a priori characteristics typical of most natural video images. The algorithm iterates between satisfying spatial domain and spectral domain constraints, much like Gerchberg's method [9]. We investigate the proposed algorithm by performing simulation on typical video images. The algorithm is judged on how well it performs with varying degrees of error localization (i.e., how large the damaged region is). Typical rates of algorithm convergence are determined. An objective figure of merit using peak-signal-to-noise ratio will assess the improvement gains over simpler methods of concealment. Pictorial results will demonstrate the subjective quality of the restoration.
We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the Gibbs distribution, Markov random field (MRF) equivalence, this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (“annealing”), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel “relaxation” algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
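The annealing idea in the abstract above can be illustrated with a much simpler model than the one it describes. The following is a minimal sketch, assuming a binary (±1) image, an Ising-type prior over the four nearest neighbours, and a geometric cooling schedule; the function name `anneal_map` and the parameters `eta`, `beta`, `t_start`, `t_end`, and `sweeps` are hypothetical choices for illustration and do not come from the cited work.

```python
import math
import random


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def anneal_map(obs, eta=1.0, beta=1.0, t_start=4.0, t_end=0.1,
               sweeps=50, seed=0):
    """Approximate MAP estimate of a +/-1 image by annealed Gibbs sampling.

    Energy (an assumed toy model):
        E(x) = -eta * sum_i x_i * obs_i - beta * sum_<i,j> x_i * x_j
    """
    rng = random.Random(seed)
    h, w = len(obs), len(obs[0])
    x = [row[:] for row in obs]          # start from the observation
    for s in range(sweeps):
        # Geometric cooling from t_start to t_end (a practical assumption).
        t = t_start * (t_end / t_start) ** (s / max(sweeps - 1, 1))
        for r in range(h):
            for c in range(w):
                nb = 0                   # sum of the 4-connected neighbours
                if r > 0:
                    nb += x[r - 1][c]
                if r + 1 < h:
                    nb += x[r + 1][c]
                if c > 0:
                    nb += x[r][c - 1]
                if c + 1 < w:
                    nb += x[r][c + 1]
                field = eta * obs[r][c] + beta * nb
                # Conditional probability that this pixel is +1 at temperature t.
                p_plus = _sigmoid(2.0 * field / t)
                x[r][c] = 1 if rng.random() < p_plus else -1
    return x
```

As the temperature falls, each pixel's conditional distribution concentrates on its lower-energy state, so the sampler settles into a low-energy configuration. Larger `beta` favours smoother results, while larger `eta` keeps the estimate closer to the observation.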
Errors caused by loss of coded data can seriously affect an H.263 decoded image sequence. Several scenarios may occur that include: (1) loss of macroblocks in I or P frames, and (2) loss of motion vectors of macroblocks in P frames. The missing macroblocks in I and P frames can be reasonably reconstructed by exploiting the correlation between adjacent macroblocks. Existing methods which reconstruct the motion vector of a macroblock rely on existing motion vectors of surrounding macroblocks, and the results are not always satisfactory. A novel reconstruction technique for restoration of macroblocks with missing motion vectors is proposed. This method exploits the image continuity inside and across the borders of the macroblocks. Simulation results indicate that the performance of the proposed algorithm is good, both subjectively and objectively
A technique using boundary matching to compensate for lost or erroneously received motion vectors in motion-compensated video coding is proposed. This technique, called the boundary matching algorithm, produces noticeably better results than those reported previously. It is first assumed that the displaced frame differences have no error. Then, this assumption is relaxed by proposing an algorithm (the extended boundary matching algorithm or EBMA) which can recover both the missing displaced frame differences and the missing motion vectors. The resulting images obtained using these methods and some other methods are compared. The images obtained clearly show that better image quality can be obtained by EBMA
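The boundary matching idea described above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: it assumes grayscale frames stored as lists of rows, a small exhaustive search window rather than a specific candidate set, and a sum of absolute differences across the block border as the mismatch measure. The names `boundary_mismatch`, `conceal_block`, and `search` are hypothetical.

```python
def boundary_mismatch(cur, cand, top, left, size):
    """Sum of absolute differences between the border pixels of a candidate
    block and the received pixels just outside the missing block."""
    h, w = len(cur), len(cur[0])
    total = 0
    for k in range(size):
        if top > 0:                       # row above the block
            total += abs(cand[0][k] - cur[top - 1][left + k])
        if top + size < h:                # row below the block
            total += abs(cand[size - 1][k] - cur[top + size][left + k])
        if left > 0:                      # column left of the block
            total += abs(cand[k][0] - cur[top + k][left - 1])
        if left + size < w:               # column right of the block
            total += abs(cand[k][size - 1] - cur[top + k][left + size])
    return total


def conceal_block(cur, prev, top, left, size, search=2):
    """Fill the missing size x size block of `cur` at (top, left) with the
    block of `prev` whose border best matches the surrounding pixels.

    Returns the chosen displacement (dy, dx) into the previous frame."""
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + size > h or c + size > w:
                continue                  # candidate falls outside the frame
            cand = [row[c:c + size] for row in prev[r:r + size]]
            cost = boundary_mismatch(cur, cand, top, left, size)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), cand)
    _, mv, cand = best
    for i in range(size):
        cur[top + i][left:left + size] = cand[i]
    return mv
```

The pixels inside the missing block are never read, only overwritten, so any placeholder value can be stored there before the call. Because the criterion rewards continuity across the border, it works best when the image is smooth across the block edges.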
Transmission of still images and video over lossy packet networks presents a reconstruction problem at the decoder. Specifically, in the case of block-based transform coded images, loss of one or more packets due to network congestion or transmission errors can result in errant or entirely lost blocks in the decoded image. This article proposes a computationally efficient technique for reconstruction of lost transform coefficients at the decoder that takes advantage of the correlation between transformed blocks of the image. Lost coefficients are linearly interpolated from the same coefficients in adjacent blocks subject to a squared edge error criterion, and the resulting reconstructed coefficients minimize blocking artifacts in the image while providing visually pleasing reconstructions. The required computational expense at the decoder per reconstructed block is less than 1.2 times a non-recursive DCT, and as such this technique is useful for low power, low complexity applications that require good visual performance
This paper first proposes a computationally efficient spatial directional interpolation scheme, which makes use of the local geometric information extracted from the surrounding blocks. The proposed error-concealment scheme produces results that are superior to those of other approaches, in terms of both peak signal-to-noise ratio and visual quality. Then a novel approach that incorporates this directional spatial interpolation at the receiver is proposed for block-based low-bit-rate coding. The key observation is that the directional spatial interpolation at the receiver can reconstruct faithfully a large percentage of the blocks that are intentionally not sent. A rate-distortion optimal way to drop the blocks is shown. The new approach can be made compatible with standard JPEG and MPEG decoders. The block-dropping approach also has an important application for dynamic rate shaping in transmitting precompressed videos over channels of dynamic bandwidth. Experimental results show that the proposed coding and rate-shaping systems can provide significant subjective and objective gains over conventional approaches
Methods for the interpolation of lost cells in asynchronous-transfer-mode (ATM) networks are studied. It is shown that use of motion-compensated previous frames gives the best results. The quality of the interpolated pictures improves if the motion vectors truly represent the actual motion in the scene. This is only possible with a two-layer coding scheme, where the motion vectors can be delivered to the decoder through the base-layer guaranteed channel. In derivation of the motion vectors at the encoder, use of uncoded input picture frames outperforms the conventional method of motion extraction from the previous coded pictures, even though the conventional method yields a lower bit rate. Depending on the quality of the base layer and the scene activity, the signal-to-noise ratio (SNR) in the cell-loss-interpolated areas can be improved by up to 10 dB.
This article describes error resilience aspects of the video coding techniques that are standardized in the ISO MPEG-4 standard. It begins with a description of the general problems in robust wireless video transmission. The specific tools adopted into the ISO MPEG-4 standard to enable the communication of compressed video data over noisy wireless channels are presented in detail. These techniques include resynchronization strategies, data partitioning, reversible variable length codes (VLCs), and header extension codes. An overview of the evolving ISO MPEG-4 standard and its current status are described.