Implications of Smoothing on Statistical Multiplexing
of H.264/AVC and SVC Video Streams
Geert Van der Auwera and Martin Reisslein
Abstract—While the hierarchical B frames based Scalable Video
Coding (SVC) extension of the H.264/AVC standard achieves sig-
nificantly improved compression over the initial H.264/AVC
codec, the SVC video traffic is significantly more variable than
H.264/AVC traffic. The higher traffic variability of the SVC
encoder can lead to smaller numbers of streams supported
with bufferless statistical multiplexing than with the H.264/AVC
encoder (and even fewer streams than with the MPEG-4 Part 2
encoder) for prescribed link capacities and loss constraints. In this
paper we examine the implications of video traffic smoothing on
the numbers of statistically multiplexed H.264 SVC, H.264/AVC,
and MPEG-4 Part 2 streams, the bandwidth requirements for
streaming, and the introduced delay. We identify the levels
of smoothing that ensure that more H.264 SVC streams than
H.264/AVC streams can be supported. For a basic low-complexity
smoothing technique that is readily applicable to both live and
prerecorded streams, we identify the levels of smoothing that
give (bufferless) statistical multiplexing performance close to
an optimal off-line smoothing technique. We thus characterize
the trade-offs between increased smoothing delay and increased
statistical multiplexing performance for both H.264/AVC, which
employs classical B frames, and H.264 SVC, which employs hier-
archical B frames. We similarly identify the buffer sizes for the
buffered multiplexing of unsmoothed H.264 SVC, H.264/AVC, and
MPEG-4 Part 2 streams that give close to optimal performance.
Index Terms—Delay, H.264/AVC, hierarchical B frames,
smoothing, statistical multiplexing, SVC, video traffic.
I. INTRODUCTION
THE recently standardized Scalable Video Coding ex-
tension (SVC) of the H.264/AVC standard [1]–[3]
with its hierarchical B-frames compresses single-layer
(non-scalable) video significantly more efficiently than the
underlying H.264/MPEG-4 Advanced Video Coding stan-
dard [4] (H.264/AVC for brevity), which is also known as
H.264/MPEG-4 Part 10. H.264/AVC in turn compresses video
significantly more efficiently than MPEG-4 Part 2 (typically
only half the average bit rate with H.264/AVC for the same video
quality). H.264/AVC and H.264 SVC video encoding are
expected to be widely adopted for wired and wireless network
Manuscript received May 28, 2008; revised April 20, 2009. First published
August 11, 2009; current version published August 21, 2009. This work was
supported in part by the National Science Foundation through Grants No. CAREER
ANI-0133252, ANI-0136774, and CRI-0750927.
G. Van der Auwera was with the Department of Electrical Engineering, Ari-
zona State University, Tempe, AZ 85287-5706 USA. He is now with Samsung
Information Systems America, Digital Media Solutions Lab, Irvine, CA 92612
(e-mail: geert.vanderauwera@asu.edu).
M. Reisslein is with the Department of Electrical Engineering, Arizona
State University, Tempe, AZ 85287-5706 USA (e-mail: reisslein@asu.edu;
http://www.fulton.asu.edu/mre).
Digital Object Identifier 10.1109/TBC.2009.2027399
video transport due to their increased compression efficiency
compared to MPEG-4 Part 2 and their widespread inclusion
in application standards and industry consortia specifications,
e.g., DVB, 3GPP2, and MediaFLO.
The compression efficiency of a video codec is generally
characterized with a so-called rate-distortion (RD) curve that
shows the bit rate of the compressed video stream as a function
of the video quality (distortion), which is typically measured in
terms of the Peak Signal to Noise Ratio (PSNR). For a given
video quality, the lower the compressed bit rate, the more effi-
cient is the compression. The improvements in rate-distortion
(RD) compression efficiency with H.264 SVC and H.264/AVC
come at the expense of significantly increased variabilities of
the encoded frame sizes (in bits) [5]. Highly variable video
frame sizes, i.e., highly variable video traffic, generally pose
a challenge for efficient network transport [6]–[8]. When the
video frame sizes are highly variable, i.e., when the largest
frames are much larger than the average frame size, then pro-
visioning network bandwidth according to the largest frames
results in inefficient bandwidth usage. The basic idea of sta-
tistical multiplexing is that the largest frames of some video
streams coincide with average (or smaller than average sized)
frames of other streams during network transport. With this
statistical multiplexing, the bandwidth requirement is typically
dramatically less than the sum of the peak bit rates of the sup-
ported streams, and may approach the sum of the mean bit rates
of the supported streams. Consequently, statistical multiplexing
is of great interest for network systems transporting video with
variable frame sizes.
However, it was found in [9] that the H.264/AVC encoder can
outperform the H.264 SVC encoder and that even the MPEG-4
Part 2 encoder can outperform both the H.264/AVC and H.264
SVC encoders when multiplexing a small number of video
streams in an elementary bufferless statistical multiplexing
setting. This is due to significantly higher traffic variabilities of
H.264 SVC encoded video streams compared to H.264/AVC
encoded streams, as well as the significantly higher traffic
variabilities of both H.264 SVC and H.264/AVC encoded video
streams compared to MPEG-4 Part 2 encoded streams. The
higher traffic variabilities can outweigh the lower average
bit rates achieved with H.264 SVC encoding compared to
H.264/AVC encoding, as well as the lower average bit rates
achieved by both H.264 SVC and H.264/AVC compared to
MPEG-4 Part 2.
In this paper we examine the effectiveness of two elementary
techniques for mitigating high traffic variability, namely (i)
video traffic smoothing, i.e., the averaging of several successive
frame sizes before sending them into the bufferless multiplexer,
and (ii) buffered multiplexing of unsmoothed video streams.
From the wide spectrum of video traffic smoothing techniques
we consider two extreme approaches: optimal smoothing
[10], [11], which minimizes the traffic variabilities, and basic
smoothing, which simply averages (aggregates) the sizes of a
prescribed number of successive video frames, whereby the
number of averaged video frames is denoted by the aggrega-
tion level a. Optimal smoothing achieves the minimal traffic
variability subject to given smoothing (receiver) buffer and
start-up delays by computing offline the transmission schedule
that delivers each video frame by its playout deadline while
avoiding overflows of the smoothing buffer and minimizing
transmission rate changes. Optimal smoothing has a computa-
tional complexity that grows with the number of frames N in
the sequence and cannot be directly applied to
live streams. In contrast, basic smoothing is computationally
very simple (requiring only constant work per frame) and can directly be ap-
plied to live streams. For a range of numbers of statistically
multiplexed streams and video (texture/motion) complexities,
we provide guidelines for (i) setting the aggregation levels
of basic smoothing that ensure that more H.264 SVC streams
than H.264/AVC streams are supported, and (ii) setting the
aggregation levels that provide similar statistical multiplexing
performance with basic smoothing as with optimal smoothing.
We find that generally SVC requires larger aggregation levels
to overcome its higher traffic variabilities. We also examine
the delay introduced by the hierarchical B frame predictions in
H.264 SVC in conjunction with the aggregation levels for the
traffic smoothing and compare with the corresponding delays
for H.264/AVC.
We also examine elementary taildrop buffered statistical mul-
tiplexing of unsmoothed video streams. We identify the multi-
plexer buffer sizes required to support close to the maximum
number of streams (given by the link capacity divided by the av-
erage stream bit rate). We find that H.264 SVC streams require
roughly twice the buffer size of H.264/AVC streams, while in
turn H.264/AVC streams require approximately twice the buffer
size of MPEG-4 Part 2 streams.
This paper is structured as follows. In Section II, we review
related work. In Section III, we present our evaluation set-up,
including the examined H.264 SVC, H.264/AVC, and MPEG-4
Part 2 encoders and their settings, as well as the video sequences
used for the evaluations. In Section IV, we first describe the em-
ployed basic and optimal smoothing techniques and the consid-
ered bufferless statistical multiplexing setting. We then present
simulation results for optimal smoothing, followed by simula-
tion results for basic smoothing. In Section V, we first describe
the examined elementary buffered statistical multiplexing sce-
nario, and then present simulation results. We summarize our
conclusions in Section VI and analyze the delays for smoothed
transmission of video encoded with classical and hierarchical B
frames in the Appendix.
II. RELATED WORK
For MPEG-4 Part 2, H.263, and preceding codecs, the bit
rate-distortion characteristics and rate variability characteristics
have been extensively studied, see for instance [12]–[14] and
references therein. Similarly, the video traffic of these codecs
has been extensively studied, see for instance [15]–[19], and
they have been used as a basis for the existing studies on video
traffic smoothing, as reviewed in Section IV-A, and buffer man-
agement, as reviewed in Section V.
The bit rate-distortion characteristics of H.264/AVC and
H.264 SVC have been examined in a few studies [3], [4], [20]
and the rate variability characteristics of H.264/AVC and H.264
SVC have been investigated in [5], [9], [21]. The study of
network transport mechanisms for H.264/AVC and H.264 SVC
has just begun to attract interest, see for instance the studies
[22]–[27], all of which are complementary to our study exam-
ining the fundamental statistical multiplexing characteristics.
We note that the traffic characteristics of individual smoothed
H.264/AVC and H.264 SVC streams have been studied in [9];
furthermore, the bufferless statistical multiplexing of un-
smoothed H.264/AVC and H.264 SVC streams has been
examined in [9]. To the best of our knowledge, the fundamental
bufferless statistical multiplexing characteristics of smoothed
H.264/AVC and H.264 SVC video and buffered multiplexing
characteristics of unsmoothed H.264/AVC and H.264 SVC
video are for the first time examined in this paper.
III. EVALUATION SET-UP
A. Video Encoding Set-Up
We employ the H.264/AVC encoder [4], [20], [28]–[30] in
the Main profile with all compression tools enabled, including
spatial intra frame prediction, variable block sizes, three refer-
ence frames for the past and the future, referenced B frames,
P and B frame weighted prediction, Context Adaptive Binary
Arithmetic Coding (CABAC), and Lagrangian based rate-dis-
tortion optimization (RDO). In particular, we employ the JM
reference software (version 10.2), which is the official MPEG
and ITU reference implementation for the H.264/AVC Main
profile. For the H.264 SVC encodings, we used the SVC refer-
ence software named JSVM (version 5.9), and similar settings
as for H.264/AVC.
Throughout, we employ H.264/AVC with classical B frame
prediction, where a B frame is predicted only from the preceding
I or P frame and from the subsequent I or P frame; other B
frames are not referenced. In contrast, H.264 SVC [1]–[3] em-
ploys the hierarchical B frame structure which uses B frames
for the prediction of B frames, as illustrated in the Appendix.
More specifically, with the employed dyadic B frame hierarchy,
the number of B frames between successive key pictures (I or
P frames) is
β = 2^τ − 1     (1)
for τ so-called temporal layers of B frames.
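As a quick illustration (not part of the original paper), the following Python snippet evaluates relation (1) and confirms that the G16-B15 GoP structure, with 15 B frames between key pictures, corresponds to four temporal layers:

```python
def num_b_frames(num_temporal_layers):
    """Number of B frames between successive key pictures in a dyadic hierarchy, eq. (1)."""
    return 2 ** num_temporal_layers - 1

# tau = 1..4 temporal layers -> 1, 3, 7, 15 B frames; G16-B15 thus uses 4 temporal layers.
print([num_b_frames(tau) for tau in range(1, 5)])  # [1, 3, 7, 15]
```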
We use the MPEG-4 Part 2 encoder [31], specifically the
MPEG-4 Part 2 Microsoft v2.3.0 software, in the Advanced
Simple profile (ASP), which includes B frames. We employ half
pixel motion compensated prediction; RDO is not supported by
the reference encoder implementation. The MPEG-4 Part 2 en-
coder uses one reference frame for the past and one for the fu-
ture, and 16×16 blocks for motion estimation that can be split
into 8×8 blocks.
For the H.264/AVC encodings and the MPEG-4 Part 2 en-
codings, which are both based on classical B frames, we em-
ploy GoP structure IBBBPBBBPBBBPBBB (16 frames, with 3
B frames per I/P frame) denoted by G16-B3. For the H.264
SVC encodings (hierarchical B frames), we employ GoP struc-
ture IBBBBBBBBBBBBBBB (16 frames, with 15 B frames per
I frame) denoted by G16-B15. The statistical video traffic anal-
ysis in [5], [9] demonstrated that these encoding parameter set-
tings and GoP structures result overall in very good rate-distor-
tion (RD) efficiencies for the respective encoders. The analysis
in [5] also indicated that encoding parameter settings that result
in lower RD efficiency generally reduce the traffic variability;
conversely, settings that further increase the RD efficiency gen-
erally increase the traffic variability (further increasing the need
for traffic smoothing). In addition, the considered GoP struc-
tures provide identical random access functionalities (I frame
period). We consider quantization parameters that correspond
to the range of average PSNR qualities from either 30/32 dB
(acceptable quality) or 35 dB (good quality) to at least 40 dB
(high quality).
Throughout this study, we consider single-layer (non-scal-
able) encoding and encode the video with fixed quantization
scales, which results in nearly constant video quality and vari-
able video traffic bit rates. By considering variable bit rate en-
coding without the use of rate control mechanisms we are able
to examine the fundamental traffic characteristics of the H.264
SVC and H.264/AVC video coding standards, which do not
specify a normative rate control mechanism. An additional mo-
tivation for the focus on variable bit rate video encoded with
fixed quantization scales is that the variable bit rate streams
allow for statistical multiplexing gains that have the potential
to improve the efficiency of video transport over communica-
tion networks [6].
B. Video Sequences
The five CIF (352×288 pixels) resolution video sequences
employed in the statistical multiplexing simulations presented
in this study are the ten minute Sony Digital Video Camera
Recorder demo sequence (17,682 frames at 30 frames/sec),
which we refer to as Sony Demo sequence, the first half
hour of the Silence of the Lambs movie (54,000 frames at
30 frames/sec), the first half hour of the Star Wars IV movie
(54,000 frames at 30 frames/sec), and the first hour of the Tokyo
Olympics video (133,128 frames at 30 frames/sec). We also
use about 30 minutes of the NBC 12 News (49,523 frames at
30 frames/sec), including the commercials. These sequences
were obtained with the MEncoder tool through decoding the
original DVD sequences into the uncompressed YUV format
and subsampling to CIF resolution. The video sequences Si-
lence of the Lambs, Star Wars IV, Tokyo Olympics, and NBC 12
News can respectively be described as drama/thriller, science
fiction/action, sports, and news. The Sony Demo sequence
is documentary style, and is a mixture of detailed scenes
(textures) and various motion activities. The NBC 12 News
and Sony Demo videos have relatively higher motion and
texture complexity than the other three videos and pose more
challenges for statistical multiplexing as we demonstrate in
Section IV-C-1.
In order to facilitate further research on network transport of
H.264 SVC, H.264/AVC, and MPEG-4 Part 2 encoded video, all
encodings presented in this study are publicly available as video
traces from the video trace library at: http://trace.eas.asu.edu.
Frame size video traces [32] are files mainly containing video
frame time stamps, frame types (e.g., I, P, or B), encoded frame
sizes (in bits), and frame qualities (PSNR). Video traces are
employed in simulation studies of the transport of video over
communication networks, see e.g., [33]–[37], and as a basis for
video traffic models, as for instance in [12], [15], [16], [19],
[38]–[41]. Traffic modeling of H.264/AVC and H.264 SVC
video traffic is a nascent research area, see e.g., [21], [42]–[44],
and we directly employ the video traces for a realistic repre-
sentation of H.264 video traffic in our simulations. Generally,
advantages of using video traces over using regular encoded
bit streams in simulations are the availability of a large number
of traces of long and real video sequences, the fact that video
traces are not copyrighted, and that only knowledge of basic
concepts of video encoding is required.
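To illustrate how such traces are consumed in simulations, the following Python sketch parses a simple whitespace-separated frame-size trace; the assumed column layout (frame index, frame type, frame size in bits, PSNR) is hypothetical and may differ from the actual formats in the trace library:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FrameRecord:
    index: int       # display or capture index of the frame
    frame_type: str  # 'I', 'P', or 'B'
    size_bits: int   # encoded frame size in bits
    psnr_db: float   # frame quality (PSNR) in dB

def load_trace(path: str) -> List[FrameRecord]:
    """Parse a whitespace-separated frame-size trace with a hypothetical column layout."""
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):  # skip comment/header lines
                continue
            idx, ftype, size, psnr = line.split()[:4]
            records.append(FrameRecord(int(idx), ftype, int(size), float(psnr)))
    return records
```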
IV. BUFFERLESS STATISTICAL MULTIPLEXING OF SMOOTHED
VIDEO TRAFFIC
A. Frame Size Smoothing
A wide variety of frame size smoothing mechanisms have
been developed and studied in the context of the MPEG-4
Part 2, H.263, and preceding video standards. Broadly, these
smoothing mechanisms can be classified into non-collabora-
tive mechanisms that smooth a single video stream, see for
instance [10], [45]–[56], and collaborative mechanisms that
jointly smooth several streams sharing networking resources,
see for instance [33], [57]–[64]. We focus on non-collaborative
smoothing in this study and leave evaluations of collaborative
smoothing for H.264 SVC and H.264/AVC for future work.
Among the non-collaborative smoothing mechanisms, we
first consider basic smoothing of the sizes (in bit) of the video
frames over non-overlapping blocks of a frames each. More
specifically, for the aggregation level a, the sizes of a consec-
utive frames are averaged and transmitted at the corresponding
average bit rate. Given the original (unsmoothed) frame size
sequence X_n, n = 1, ..., N, we obtain the smoothed frame
sizes
Y_n = \frac{1}{a} \sum_{i = a\lceil n/a \rceil - a + 1}^{a\lceil n/a \rceil} X_i     (2)
for n = 1, ..., N. The aggregation level a can be varied, with
larger values resulting in lower video traffic variabilities at the
expense of increased delay, which is analyzed in the Appendix.
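A minimal Python sketch of this basic smoothing step (our own illustration; a trailing partial block is simply averaged over the frames it contains):

```python
def basic_smooth(frame_sizes, a):
    """Replace each frame size by the average over its non-overlapping block of a frames, cf. (2)."""
    smoothed = []
    for start in range(0, len(frame_sizes), a):
        block = frame_sizes[start:start + a]
        smoothed.extend([sum(block) / len(block)] * len(block))
    return smoothed

# Example with aggregation level a = 4: each block of 4 frames is sent at its average rate.
sizes = [120_000, 30_000, 28_000, 26_000, 90_000, 24_000, 22_000, 20_000]
print(basic_smooth(sizes, 4))
```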
We also consider optimal smoothing [10], [11], which is op-
timal in the sense that it minimizes the bit rate variability and the
peak bit rate of the video traffic subject to prescribed smoothing
(receiver) buffer and start-up delays. Optimal smoothing en-
sures that the given receiver buffer does not underflow nor
overflow, while sending video frame bits ahead of the decoding
times of the corresponding video frames. The optimization al-
gorithm computes the transmission schedule of the video frame
bits in piecewise constant bit rate segments that are as long as
possible and have the smallest rate changes possible, without
overflowing the client buffer, and while delivering the video
frames by their playout deadlines. Optimal smoothing takes
as input the frame sizes of the pre-encoded video stream and
computes the transmission schedule off-line. With N denoting
the number of video frames in a pre-encoded video sequence,
the computational complexity of a basic implementation of
optimal smoothing grows with N (a complexity reduction is
possible with a more involved implementation) [10]. For our
simulations we set the client buffer size to 48 KB; the (addi-
tional) start-up delay is discussed in the Appendix.
The 48 KB buffer ensures that for the highest quality streams
in our experiments (approximately 40 dB), the largest frames
can fit into the client buffer.
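The following sketch illustrates only the feasibility constraints that optimal smoothing works within, under a simplified per-frame-slot timing model of our own (it is not the optimal smoothing algorithm of [10], [11]): any cumulative transmission schedule must stay between a deadline (underflow) bound and a client-buffer (overflow) bound.

```python
from itertools import accumulate

def feasibility_corridor(frame_sizes, buffer_bits):
    """Cumulative per-slot bounds that any feasible smoothed transmission must respect.

    lower[t]: bits that must have arrived by the end of slot t so that frames 0..t decode on time.
    upper[t]: bits that may have arrived by the end of slot t without overflowing the client
              buffer (frames 0..t-1 are assumed already consumed by the decoder).
    """
    cum = list(accumulate(frame_sizes))
    lower = cum[:]
    upper = [buffer_bits] + [c + buffer_bits for c in cum[:-1]]
    return lower, upper

def is_feasible(cumulative_schedule, lower, upper):
    """True if a candidate cumulative transmission schedule stays inside the corridor."""
    return all(lo <= s <= hi for s, lo, hi in zip(cumulative_schedule, lower, upper))
```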
Although many more video traffic smoothing techniques are
available, we focus on basic smoothing and optimal smoothing,
because these two techniques represent the two extremes, i.e.,
the lowest computational complexity (with basic smoothing)
and the lowest achievable rate variability (with optimal smoothing).
B. Bufferless Statistical Multiplexing
In the real-time frame-based video streaming scenario based
on a bufferless statistical multiplexer [56], [65]–[67], a channel
with bandwidth capacity C [bit/s] connects a streaming video
server with a bufferless statistical multiplexer to J receivers.
Each video frame is transmitted during one frame period T (e.g.,
T = 33 ms for a frame rate of 30 frames/s). Let X_j(n) [bit] de-
note the frame size of frame n, n = 1, ..., N, of stream j,
j = 1, ..., J. Then, the bit rate required to transmit frame n
of stream j during one frame period of length T is given by
X_j(n)/T. Let φ_j(t) be a random variable denoting the index
of the frame of stream j transmitted during frame period t.
Then, the aggregated bit rate in frame period t when statisti-
cally multiplexing all J streams is given by
R(t) = \frac{1}{T} \sum_{j=1}^{J} X_j(φ_j(t))     (3)
If the aggregate bit rate R(t) exceeds the link capacity C, then
loss occurs, which we measure as the information loss proba-
bility [66], [67], i.e., the long-run fraction of lost video bits:
P_loss = \frac{E[(R(t) - C)^+]}{E[R(t)]}     (4)
where (x)^+ := max(x, 0). For a given experiment, we stream J
identical video sequences, whereby the starting phase for each
stream is randomly selected according to a uniform distribution
over all N frames of the sequence [32], [66]. The streams are
wrapped around to obtain streams of equal lengths.
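A minimal simulation sketch of this bufferless setting, with random starting phases and the long-run fraction of lost bits as in (3) and (4) (variable names are ours, not the paper's):

```python
import random

def bufferless_loss_fraction(frame_sizes, num_streams, capacity_bps,
                             frame_period_s=1/30, seed=0):
    """Estimate the information loss probability for J identical streams with random phases."""
    rng = random.Random(seed)
    n = len(frame_sizes)
    phases = [rng.randrange(n) for _ in range(num_streams)]   # uniform starting phase per stream
    capacity_per_slot = capacity_bps * frame_period_s         # bits the link carries per frame period

    lost_bits = total_bits = 0.0
    for t in range(n):  # streams wrap around, so n slots cover the whole phase alignment
        aggregate = sum(frame_sizes[(phase + t) % n] for phase in phases)  # bits offered in slot t
        total_bits += aggregate
        lost_bits += max(0.0, aggregate - capacity_per_slot)               # bits exceeding C*T are lost
    return lost_bits / total_bits
```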
Aside from providing an appropriate model for low-delay,
low-buffer transmission systems [65], [67], bufferless statistical
multiplexing provides a “ground truth” for studying the funda-
mental implications of the bit rate variabilities associated with
the H.264 SVC, H.264/AVC, and MPEG-4 Part 2 video en-
coders and with the video content. By considering the outlined
elementary bufferless statistical multiplexing scenario, we avoid
introducing confounding parameters, such as network buffers,
cross traffic, and network topology. Only the video encoder (and
its encoding settings), the video content, and the link capacity
C (along with the number of streams J) influence the outcome
of the experiment and we are thus able to uncover the funda-
mental statistical multiplexing characteristics of the smoothed
H.264/AVC and H.264 SVC streams.
We note that predicting the loss probability of statistical
multiplexing from statistical descriptors of the video traffic
has been extensively studied for MPEG encoded videos and
verified through simulations with traces of MPEG encoded
videos, see e.g., [56], [65]–[70]. Generally, such prediction
works relatively well when the number of multiplexed streams
is high and the streams are relatively smooth. Predicting the
loss probability when multiplexing few streams as well as for
the new H.264/AVC and H.264 SVC encodings with their high
variability is a largely open research area. In this study we
conduct extensive simulations with traces of H.264/AVC and
H.264 SVC videos for a wide range of numbers of multiplexed
streams, which can be used as a baseline for assessing the
accuracy of novel prediction mechanisms.
C. Simulation Results
In the first set of simulations we estimate the maximum
number N_max of video streams that can be accommodated by
the given link capacity C, while constraining the information
loss probability to a value smaller than a prescribed small
constant ε. Many independent replications of each simulation
were run until the 90% confidence interval of the information
loss probability estimate was less than 10% of the corre-
sponding sample mean. In the second set of simulations, we
estimate the minimum link capacity C_min that accommodates a
prescribed number of streams J subject to P_loss < ε. For each
C_min estimate we perform 500 runs, each consisting of 1000
independent video streaming simulations. We do not include
the 90% confidence intervals in the C_min plots, because the
confidence intervals are very small (less than 1% of the sample
mean) and would clutter the figures.
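For one replication, N_max can be estimated by increasing the number of streams until the loss constraint is violated; a simple (non-optimized) sketch, reusing the loss routine outlined in Section IV-B:

```python
def estimate_n_max(frame_sizes, capacity_bps, epsilon, loss_fn, max_streams=1024):
    """Largest number of multiplexed streams whose estimated loss fraction stays below epsilon.

    loss_fn(frame_sizes, J, capacity_bps) returns the information loss fraction for J streams,
    e.g., the bufferless_loss_fraction sketch above. This is a single replication; the study
    averages many independent replications until the confidence intervals are tight.
    """
    n_max = 0
    for j in range(1, max_streams + 1):
        if loss_fn(frame_sizes, j, capacity_bps) < epsilon:
            n_max = j
        else:
            break  # the loss fraction grows with J, so the first violation ends the search
    return n_max
```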
1) N_max Simulations With Optimal Smoothing: Fig. 1
gives the N_max simulation curves, obtained for the prescribed
link capacity C and loss constraint ε, for the five
video sequences. The N_max simulation curves are,
respectively, named SIM-G16B3-H.264-unsm for un-
smoothed H.264/AVC streams with GoP structure G16-B3,
SIM-G16B15-SVC-unsm for unsmoothed H.264 SVC streams
with GoP structure G16-B15, SIM-G16B3-MP4-unsm for un-
smoothed MPEG-4 Part 2 streams with GoP structure G16-B3,
SIM-G16B3-H.264-48KB for optimally smoothed H.264/AVC
streams, SIM-G16B15-SVC-48KB for optimally smoothed
H.264 SVC streams, and finally SIM-G16B3-MP4-48KB for
optimally smoothed MPEG-4 Part 2 streams. For reference, we
plot the N_max curves corresponding to the multiplexing of
Fig. 1. N_max simulation curves for five long CIF sequences encoded with H.264/AVC (G16-B3), H.264 SVC (G16-B15), and MPEG-4 Part 2
(G16-B3), for the prescribed channel capacity C and bit loss probability constraint. N_max curves are provided for unsmoothed and optimally
smoothed traffic with client buffer size 48 KB. Perfect CBR curves are included for comparison: (a) Silence of the Lambs; (b) Star Wars IV;
(c) Sony Demo; (d) NBC 12 News; (e) Tokyo Olympics.
perfect constant bit rate traffic, denoted by PCBR. We define
PCBR video traffic as the sequence of frame size values that are
equal to the average frame size of the video stream. Hence, the
rate variability of a PCBR video stream is zero and N_max is com-
puted by dividing the link capacity C by the stream's average bit
rate, resulting in the theoretical maximum value for N_max.
The N_max values for the unsmoothed streams are strongly af-
fected by the rate variability of the video traffic. To illustrate this
effect, we compare the N_max curves of the unsmoothed traffic
with those of the PCBR video traffic. The unsmoothed traffic
clearly results in fewer supported streams than the PCBR video
traffic, which is only attributable to the rate variability. In ad-
dition, the gap between the PCBR N_max curves of the H.264
SVC and the H.264/AVC encodings is much wider than the gap
between the corresponding unsmoothed traffic curves, e.g., see
Fig. 1(a) and (b). This is also evidence of the profound impact of
the rate variability increase of H.264 SVC traffic on N_max com-
pared to H.264/AVC traffic.
Very interesting is that for the Sony Demo (Fig. 1(c)) and NBC
12 News (Fig. 1(d)) sequences, which have relatively high tex-
ture and motion complexity, the N_max curve of the unsmoothed
H.264 SVC traffic is below the curve of the H.264/AVC traffic.
This is a very important observation, since it means that the
RD efficiency gain of H.264 SVC is completely canceled out by
the associated increased rate variability. For very high quality
(above 38 dB), the H.264 SVC N_max curve for the Sony Demo
Fig. 2. N_max simulation curves for the Sony Demo and NBC 12 News sequences encoded with H.264 SVC (G16-B15), H.264/AVC (G16-B3),
and MPEG-4 Part 2 (G16-B3) for unsmoothed traffic and for optimally smoothed traffic (48 KB client buffer), for a larger channel capacity and
the prescribed bit loss probability. Perfect CBR curves are included for comparison: (a) Sony Demo; (b) NBC 12 News.
Fig. 3. Minimum channel capacity C_min simulation results for the Silence of the Lambs and NBC 12 News sequences encoded with H.264/AVC (G16-B3), H.264
SVC (G16-B15), and MPEG-4 Part 2 (G16-B3) for unsmoothed video traffic, for the prescribed bit loss probability and the numbers of streams J = 4, 16, and
64: (a) Silence of the Lambs; (b) NBC 12 News.
sequence even approaches the MPEG-4 Part 2 N_max curve, and
surprisingly, for the NBC 12 News sequence the H.264 SVC
N_max curve is below the MPEG-4 Part 2 curve. The reason
is that for these two relatively complex sequences, the number of
streams that can be supported by the link is small (fewer than 20
streams) and as a result the statistical multiplexing effect that
copes with the rate variability of the streams is reduced.
Next, we study whether traffic smoothing would bring out
the gains in the number N_max of supported streams that one
would expect from the RD efficiency gains of H.264 SVC over
H.264/AVC. We initially employ optimal smoothing with a
client buffer size of 48 KB. We observe that all N_max curves
for the optimally smoothed traffic in Fig. 1 have significantly
increased values compared to the values for the unsmoothed
traffic, and that they are much closer to the theoretical maximum
values given by the PCBR curves. (In additional experiments
with the Sony sequence, we found that optimal smoothing
with a larger, 128 KB buffer increases N_max by one to five
streams; generally, for very large smoothing buffers the PCBR
N_max is approached [71].) When examining the gaps between the
N_max curves of H.264 SVC and H.264/AVC, we notice that
the gaps have increased and approach the theoretical max-
imum gaps of the PCBR curves or, equivalently, the maximum
gain in number of supported streams. We conclude from this
initial analysis that optimal smoothing effectively mitigates
the effects of the increased variability of H.264 SVC traffic
on the maximum number of streams supported in a bufferless
statistical multiplexer. Interesting is that for the relatively lower
complexity (texture, motion) Silence of the Lambs, Star Wars
IV, and Tokyo Olympics sequences, the N_max curves of the
smoothed MPEG-4 Part 2 traffic approach the curves of
the unsmoothed H.264 SVC and H.264/AVC traffic in the very
high quality region. For the relatively higher complexity Sony
Demo and NBC 12 News sequences, the N_max curves of the un-
smoothed H.264 SVC and H.264/AVC traffic are considerably
below the curves of the smoothed MPEG-4 Part 2 traffic.
The above observations are clearly dependent on the video
content, but also on the chosen link capacity C. Clearly, if the
link can only support a small number of streams, then the statis-
tical multiplexing effect is small, resulting in a strong impact of
the rate variability on the number of multiplexed streams. The
impact is particularly significant when multiplexing high quality
H.264 SVC encodings of the relatively complex Sony Demo and
NBC 12 News sequences in the scenario with the link capacity
considered in Fig. 1. In order to examine the statistical multi-
plexing of these two sequences with a higher link capacity, we
plot in Fig. 2 the N_max curves for a larger link capacity C.
First, we observe that the N_max values are much larger than for
Fig. 4. Minimum channel capacity C_min simulation results for the Sony Demo sequence encoded with H.264/AVC (G16-B3), H.264 SVC (G16-B15), and MPEG-4
Part 2 (G16-B3) for unsmoothed video traffic, for three examined bit loss probabilities and the numbers of streams J = 4, 16, and 64: (a)–(c)
Sony Demo, one subplot per bit loss probability.
the experiments in Fig. 1, as we expected. Second, the N_max
curves for the unsmoothed traffic are closer to the theoret-
ical upper boundary given by the PCBR curves. The optimally
smoothed traffic is particularly close to this theoretical upper
limit, again illustrating that even for large link capacities there is still a sig-
nificant impact of smoothing on the N_max values. Nevertheless,
in both cases, unsmoothed and smoothed, H.264 SVC clearly al-
lows for more statistically multiplexed streams than H.264/AVC
and MPEG-4 Part 2.
Although the N_max simulations provide insight into the sig-
nificant effects of the increased rate variability of H.264 SVC,
they are dependent on the prescribed link capacity C and re-
sult in varying numbers of multiplexed streams (i.e., varying
levels of statistical multiplexing) across the range of average
PSNR video qualities. Therefore, in the next section we per-
form a second set of simulations that estimate the minimum link
capacity C_min required for supporting a prescribed number of
streams J. These simulations allow us to study the effects
of the rate variability for a fixed number of multiplexed streams
across the range of PSNR video qualities.
2) C_min Simulations With Optimal Smoothing: Fig. 3 depicts
the C_min curves for unsmoothed traffic of the sequences Silence
of the Lambs and NBC 12 News for J = 4, 16, and 64 multi-
plexed streams. In general, for J = 64, we ob-
serve that the C_min values are somewhat lower for the H.264
SVC streams than for H.264/AVC streams. This link capacity
difference is particularly significant for Silence of the Lambs
in the high quality range (above 35 dB); otherwise the differ-
ences become relatively small. However, both encoders have a
clear advantage over MPEG-4 Part 2. For J = 16, the statis-
tical multiplexing effect is less able to compensate for the bit
rate variabilities. Overall, the H.264/AVC streams are accom-
modated by C_min values that are smaller than or nearly equal
to the values for the H.264 SVC streams, despite the higher av-
erage bit rates of the H.264/AVC streams. H.264 SVC still out-
performs MPEG-4 Part 2 over the entire quality range.
For J = 4, the increased rate variability of H.264 SVC results
in C_min values that are overall comparable to those of multi-
plexed MPEG-4 Part 2 streams. For the Silence of the Lambs
sequence, we observe the surprising result that H.264 SVC re-
quires the highest C_min values over the entire quality range and
MPEG-4 Part 2 even outperforms H.264/AVC below 38 dB. For
the NBC 12 News sequence, H.264 SVC has the worst performance
in the quality range above 35 dB. The conclusion is that for a rel-
atively small number of multiplexed streams (J of 16 or fewer),
H.264/AVC generally results in lower C_min requirements, while
depending on the video sequence, H.264 SVC can even be out-
performed by MPEG-4 Part 2 streams.
Next, we examine the impact of the information loss
probability constraint on C_min in Fig. 4. The unsmoothed Sony
Demo streams are multiplexed subject to three different max-
imum loss probabilities (one per subplot), for J = 4, 16,
Fig. 5. C_min simulation results for five long CIF sequences encoded with H.264/AVC (G16-B3), H.264 SVC (G16-B15), and MPEG-4 Part 2 (G16-B3) for
optimally smoothed video traffic with a 48 KB client buffer, for the prescribed bit loss probability and the numbers of streams J = 4, 16, and 64: (a)
Silence of the Lambs; (b) Star Wars IV; (c) Sony Demo; (d) NBC 12 News; (e) Tokyo Olympics.
and 64 streams. The C_min values are significantly lower when
the allowable losses are larger, as we expected, and this is the
case for all encoders and numbers of streams J. Interesting is
that overall the relative order of the C_min curves, corresponding
to the different encoders for each value of J, is preserved.
In Fig. 5, we examine the C_min values for optimally smoothed
streams (client buffer size 48 KB). Overall, optimally smoothed
H.264 SVC traffic has lower C_min values for J = 4, 16, and 64
over the entire quality range. The quality range above 35 dB is
particularly favorable for optimally smoothed H.264 SVC over
H.264/AVC. Optimally smoothed MPEG-4 Part 2 traffic clearly
requires substantially more network bandwidth resources.
In summary, we conclude from the N_max and C_min simula-
tions with optimally smoothed traffic that optimally smoothed
H.264 SVC streams clearly have an advantage over optimally
smoothed H.264/AVC and MPEG-4 Part 2 streams. In partic-
ular, the simulations indicate that close to optimal results
(PCBR) are achievable with optimally smoothed traffic. Optimal
smoothing [10], [11] is an off-line technique designed for prere-
corded video streams. Optimal smoothing can be adapted for
live video through appropriate traffic descriptors and predictors,
which have so far only been examined for MPEG-4 Part 2 and
preceding MPEG codecs [72]. Researching appropriate traffic
descriptors and predictors for the new H.264/AVC and H.264
SVC encoders with their more bursty traffic is an open problem.
On the other hand, basic smoothing, which is computationally
significantly less complex than optimal smoothing, can easily be
implemented for live video. We are therefore motivated to com-
Fig. 6. C_min simulation results for the Silence of the Lambs and Sony Demo sequences encoded with H.264 SVC (G16-B15) and H.264/AVC (G16-B3) for basic
smoothed traffic with aggregation level a = 16 and for optimally smoothed traffic (48 KB client buffer), for the prescribed bit loss probability and the numbers
of streams J = 4, 16, and 64: (a) Silence of the Lambs, H.264 SVC (G16-B15); (b) Silence of the Lambs, H.264/AVC (G16-B3); (c) Sony Demo, H.264 SVC (G16-B15);
(d) Sony Demo, H.264/AVC (G16-B3).
pare the bufferless statistical multiplexing performance
of basic smoothing with optimal smoothing.
3) C_min Simulations With Basic Smoothing: Fig. 6 depicts
the C_min curves for the Silence of the Lambs and Sony Demo
H.264 SVC video traffic (G16-B15) that is smoothed with ag-
gregation level a = 16 (the GoP size), and for H.264/AVC video
traffic (G16-B3) smoothed with a = 16. We also include the
results obtained for optimal smoothing. The basic smoothing
C_min curves are only very slightly above the C_min curves for
optimally smoothed traffic. This indicates that basic smoothing
with a = 16 is almost as effective as optimal smoothing in re-
ducing the rate variability for efficient bufferless statistical mul-
tiplexing.
4) Basic Smoothing Delay Implications: The simulation re-
sults in the preceding sections together with the delay analysis in
the Appendix establish a reference framework for evaluating the
traffic smoothing versus delay trade-off. In this section, we in-
vestigate the choice of the basic smoothing parameters that en-
sure that (i) the link capacity requirements for H.264 SVC traffic
(hierarchical B frames) are reduced compared to H.264/AVC
traffic (classical B frames), and (ii) the link capacity required
with basic smoothing closely approaches the link capacity re-
quired with optimally smoothed traffic.
Fig. 7 depicts C_min simulation curves for unsmoothed and
smoothed (basic) traffic with aggregation levels a = 2, 4, 8,
and 16. The experiments cover the five sequences that are en-
coded with H.264 SVC (G16-B15) and H.264/AVC (G16-B3).
We present illustrative results for J = 4, 16, and 64 streams,
with the bit loss probability restricted to the prescribed con-
straint; we have also analyzed identical experiments with the
second examined loss constraint, which we cannot include due
to space constraints. Each pair of panels in Fig. 7 corresponds
to one of the three numbers of multiplexed streams J. We
present illustrative results for videos with relatively low texture
and motion complexity in Fig. 7(a), (c), and (e), while illustra-
tive results for videos with relatively high texture and motion
complexity are presented in Fig. 7(b), (d), and (f).
For the largest number of multiplexed streams, J = 64, the
unsmoothed H.264 SVC streams generally require smaller C_min
values than the H.264/AVC streams. This is explained by the
relatively large number of streams that are statistically multi-
plexed. Ideally, the C_min values should be close to the C_min
values for optimally smoothed streams or, equivalently, close
to the C_min values for basic smoothed streams with aggregation
a = 16, which is the GoP size, as we illustrated in Fig. 6. In
Fig. 7(a), for example, the simulation curves for increasing ag-
gregation levels a approach the curves of the traffic smoothed
with a = 16 (which gives very close to optimal smoothing re-
sults). This observation holds for all test sequences and numbers
of multiplexed streams J, although for the Sony Demo sequence
the convergence is slower than for the other four sequences.
Overall, for J = 64 the aggregation level should be set
Fig. 7. C_min simulation results for unsmoothed and basic smoothed traffic with aggregation levels a = 2, 4, 8, and 16. The five sequences are encoded with
H.264 SVC (G16-B15) and H.264/AVC (G16-B3), for the prescribed bit loss probability and the numbers of streams J = 4, 16, and 64: (a) Silence of the Lambs;
(b) Sony Demo; (c) Star Wars IV; (d) NBC 12 News; (e) Tokyo Olympics; (f) NBC 12 News.
to one of two recommended aggregation levels for H.264/AVC
stream multiplexing to approach the optimal performance, and
to correspondingly larger levels for H.264 SVC streams. The
choice between the two values for each encoder depends on the
content type, with the larger value meant for the most complex
sequences.
Analogously, we analyzed the cases with J = 4, J = 16,
and J = 32 multiplexed streams. Table I enumerates aggregation
levels a that, when applied to both H.264 SVC and H.264/AVC
video streams, result in lower C_min requirements for H.264 SVC
streams (G16-B15) than for H.264/AVC streams (G16-B3) for
both examined loss probabilities. Table II gives
basic smoothing aggregation levels that achieve close to op-
timal smoothing C_min values for H.264/AVC and H.264 SVC,
respectively. For the cases with two values, we recommend
the higher value for sequences with relatively high texture and
motion complexity. The corresponding end-to-end delays, cal-
culated based on the delay analysis in the Appendix, are pro-
vided in Table II for live video streaming (middle two columns)
and for prerecorded video streaming (right two columns).
From this analysis we conclude that the H.264 SVC streams
generally require aggregation levels twice as large as the
H.264/AVC streams to obtain close to optimal statistical multi-
plexing performance. The corresponding end-to-end delays are
approximately two to three times larger for H.264 SVC than
for H.264/AVC.
The preceding analysis considers one video sequence (out of
the five sequences) in a given multiplexing experiment. Next,
we examine whether the recommendations for the choice of
TABLE II
AGGREGATION LEVELS a [(VALUE FOR LOW COMPLEXITY SEQUENCES) (VALUE FOR HIGH COMPLEXITY SEQUENCES)] FOR BASIC SMOOTHING SUCH THAT C_min FOR BASIC SMOOTHING VERY
CLOSELY APPROACHES C_min FOR OPTIMAL SMOOTHING FOR H.264/AVC AND H.264 SVC, RESPECTIVELY; CORRESPONDING DELAYS [IN FRAME PERIODS]
FOR PRERECORDED AND LIVE VIDEO ARE ALSO PROVIDED. THESE RESULTS APPLY FOR J = 4, 16, 32, 64 MULTIPLEXED STREAMS FOR BOTH
EXAMINED LOSS PROBABILITIES
TABLE I
AGGREGATION LEVELS a FOR BASIC SMOOTHING SUCH THAT C_min FOR
H.264 SVC (G16-B15) IS LESS THAN C_min FOR H.264/AVC (G16-B3) FOR BOTH
EXAMINED LOSS PROBABILITIES. WE PROVIDE (VALUE FOR LOW COMPLEXITY
SEQUENCES) (VALUE FOR HIGH COMPLEXITY SEQUENCES)
the aggregation level a also hold for a heterogeneous mix of
the five video sequences. We organized the H.264/AVC video
streams and the H.264 SVC video streams each into three
quality groups based on average PSNR values: low quality
(32–34 dB), medium quality (35–37 dB), and high quality
(38–40 dB). We conducted multiplexing simulations for each
quality group to determine the minimum link capacities C_min
required to achieve loss probabilities below the two examined
loss constraints, respectively. In each simulation, we multiplex J
streams drawn randomly from the five video sequences
(while equalizing for the different stream lengths so that each
video sequence is selected with approximately equal proba-
bility). The respective estimated C_min values are reported in
Table III and in Table IV for the two loss constraints, respectively.
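One possible reading of this stream selection, sketched in Python (the equalization is implemented here simply by picking the video title uniformly at random, so that trace length does not bias the draw):

```python
import random

def draw_heterogeneous_mix(traces, num_streams, seed=0):
    """Draw num_streams (video name, starting phase) pairs from traces of different lengths."""
    rng = random.Random(seed)
    names = list(traces)               # traces: dict mapping video name -> list of frame sizes
    mix = []
    for _ in range(num_streams):
        name = rng.choice(names)                   # each video equally likely, regardless of length
        phase = rng.randrange(len(traces[name]))   # uniform starting phase within that trace
        mix.append((name, phase))
    return mix
```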
From the data in Tables III and IV, we conclude that the
above recommendations for the aggregation levels a also hold
for the heterogeneous mix of the video streams; furthermore, the
recommendations hold across quality groups and for both exam-
ined loss constraints. The recommended aggregation levels for
approaching the optimal smoothing C_min within 15% are again
smaller for the H.264/AVC streams than for the H.264 SVC
streams (see Table II). This observation confirms that H.264 SVC
streams require higher aggregation levels to approximate the op-
timal smoothing C_min. We also reconfirm that the aggregation
level a at which H.264 SVC streams achieve link capacities C_min
below the H.264/AVC capacities is substantially larger than one
and is given in Table I. Since these multiplexing experiments with
heterogeneous video sequences reconfirm the aggregation level rec-
ommendations, we conclude that the different encoder config-
urations, i.e., hierarchical B frames for H.264 SVC (G16-B15)
versus classical B frames for H.264/AVC (G16-B3), are the de-
termining factors in the statistical multiplexing behavior of the
respective video streams.
V. B UFFERED STATISTICAL MULTIPLEXING
Next, we study the buffered statistical multiplexing of video
streams encoded with H.264/AVC (G16-B3), H.264 SVC (G16-
TABLE III
C_min BIT RATES FOR MIXES OF VIDEO STREAMS DRAWN FROM ALL
FIVE VIDEOS FOR DIFFERENT BASIC SMOOTHING LEVELS a AND OPTIMAL
SMOOTHING (OPT. SM.), FOR THE FIRST EXAMINED LOSS PROBABILITY
TABLE IV
C_min BIT RATES FOR MIXES OF VIDEO STREAMS DRAWN FROM ALL
FIVE VIDEOS FOR DIFFERENT BASIC SMOOTHING LEVELS a AND OPTIMAL
SMOOTHING (OPT. SM.), FOR THE SECOND EXAMINED LOSS PROBABILITY
B15), and MPEG-4 Part 2 (G16-B3). The video traffic is not
smoothed in order to assess the direct impact of the multiplexer
buffer size. The buffer serves the purpose of absorbing some of
the rate variability of the video streams that are multiplexed on
the link. From among the wide range of buffer management and
scheduling policies, see e.g. [73]–[76], we consider the elemen-
tary taildrop policy with first-come-first-served scheduling, to
assess the fundamental impact of the multiplexer buffer. Specif-
ically, with R(t) given in (3) denoting the aggregate bit rate [in
bit/s] of the J ongoing video streams in frame period t, Q(t−1)
denoting the buffered video traffic [in bit] at the end of the pre-
ceding frame period t−1 (i.e., at the beginning of frame period
t), and noting that traffic is served at bit rate C, the amount of
buffered video traffic at the end of frame period t is obtained
as
Q(t) = \min\{B, \max[0, Q(t-1) + (R(t) - C) T]\}     (5)
where B denotes the buffer capacity [in bit]. The amount of lost
video bits during frame period t is given by
max[0, Q(t−1) + (R(t) − C) T − B], and the expected long-run
fraction of lost bits gives the information loss probability, which
is required to be less than the prescribed constant ε.
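A direct Python transcription of recursion (5) for the taildrop multiplexer (our variable names):

```python
def buffered_loss_fraction(aggregate_rates_bps, capacity_bps, buffer_bits, frame_period_s=1/30):
    """Evolve the taildrop FIFO buffer per frame period and tally the long-run fraction of lost bits.

    aggregate_rates_bps: the per-slot aggregate bit rates R(t) of the multiplexed streams, cf. (3).
    """
    served_per_slot = capacity_bps * frame_period_s      # bits drained by the link per frame period
    q = 0.0                                              # buffered bits at the start of the slot
    lost_bits = total_bits = 0.0
    for r in aggregate_rates_bps:
        arriving = r * frame_period_s
        total_bits += arriving
        backlog = q + arriving - served_per_slot         # backlog with an infinite buffer
        lost_bits += max(0.0, backlog - buffer_bits)     # bits dropped by the taildrop policy
        q = min(buffer_bits, max(0.0, backlog))          # buffer occupancy per recursion (5)
    return lost_bits / total_bits
```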
Fig. 8 depicts N_max simulation results for the five CIF se-
quences under the prescribed channel capacity C and loss con-
straint. Curves are presented for buffer sizes of 24, 192, and
3840 KB. (We also examined the buffer sizes 48 and 96 KB, which
Fig. 8. N_max buffered multiplexing simulation results for five long CIF sequences encoded with H.264/AVC (G16-B3), H.264 SVC (G16-B15), and MPEG-4
Part 2 (G16-B3) with unsmoothed traffic, under the prescribed channel capacity and bit loss probability. Curves are presented for buffer sizes of 24, 192, and
3840 KB. The bufferless multiplexing results are included for reference: (a) Silence of the Lambs; (b) Star Wars
IV; (c) Sony Demo; (d) NBC 12 News; (e) Tokyo Olympics.
are not included to avoid clutter in the plots.) The bufferless
multiplexing results are depicted for comparison. Analogous to
the minimum channel capacity experiments, we determine the
buffer size that gives near optimal statistical multiplexing re-
sults for H.264 SVC, H.264/AVC, and MPEG-4 Part 2 streams,
whereby we adopt as benchmark for optimal results the N_max
curve for the largest buffer size 3840 KB. Comparisons of
the results in Figs. 1 and 8 indicate that the N_max curve for
3840 KB is very close to the PCBR curve, which gives the
maximum number of streams that can be supported on the
link. We identify the buffer sizes that result in N_max values
that are relatively close to the N_max values for buffer size 3840
KB. The recommended buffer size ranges for each encoder
are summarized in Table V. We determine the buffer ranges
across the five video sequences, with the largest buffer sizes
corresponding to complex sequences. The H.264 SVC streams
require approximately twice the buffer size compared to the
H.264/AVC streams, which in turn require about double the
buffer size required for MPEG-4 Part 2 streams. With the delay
analysis presented in the Appendix, we obtain a delay of 25
frame periods for transmitting unsmoothed live H.264 SVC
video over a transmit path with a single buffer stage with 192
KB compared to 9 frame periods for transmitting H.264/AVC
video over a transmit path with a single 96 KB buffer stage.
We similarly studied the case with the other examined loss
constraint; the corresponding plots are not included due to
space constraints. For that case, the recommended buffer size
ranges are significantly smaller (approximately half) for each encoder.
Fig. 9. Delay analysis of classical B frame H.264/AVC encoding with GoP structure G16-B3 for no smoothing and for basic smoothing: (a) no
smoothing (a = 1); (b) basic smoothing.
Fig. 10. Delay analysis of hierarchical B frame H.264 SVC encoding with GoP structure G16-B15 for no smoothing and for basic smoothing: (a) no
smoothing (a = 1); (b) basic smoothing.
TABLE V
OVERVIEW OF RECOMMENDED BUFFER SIZE RANGES FOR BUFFERED
STATISTICAL MULTIPLEXING
However, the double buffer size relationship between en-
coders remains, as well as the corresponding delay differences.
We conclude that the RD efficiency improvements between
TABLE VI
DELAYS [IN FRAME PERIODS] FOR LIVE H.264/AVC (G16-B3) AND H.264
SVC (G16-B15) STREAMS
the encoders come at the price of increased buffer sizes and
corresponding delays in the buffered statistical multiplexing
scenario.
TABLE VII
DELAYS [IN FRAME PERIODS] FOR PRERECORDED H.264/AVC (G16-B3) AND
H.264 SVC (G16-B15) STREAMS. THE DELAY WITH OPTIMAL SMOOTHING
WITH THE (ADDITIONAL) START-UP DELAY IS IDENTICAL TO THE
DELAY FOR UNSMOOTHED TRAFFIC
VI. CONCLUSIONS
We have examined the statistical multiplexing behavior of
H.264 SVC, H.264/AVC, and MPEG-4 Part 2 encoded video
with long video sequences. In particular, we have considered the
bufferless statistical multiplexing of smoothed video streams
and the buffered statistical multiplexing of unsmoothed video
streams. We have found that off-line optimal smoothing ensures
that the RD efficiency gains of H.264 SVC with hierarchical
B frames over H.264/AVC with classical B frames translate
into commensurate gains in the number of streams supported with
statistical multiplexing. (Without smoothing, the higher rate
variability of H.264 SVC may actually result in fewer supported
streams than with the less RD efficient H.264/AVC and in some
scenarios even fewer SVC streams than with the even less RD
efficient MPEG-4 Part 2.) We further examined basic smoothing
which averages the sizes of blocks of successive video frames
and is thus simple to implement in on-line fashion and readily
applicable to live video. We characterized the trade-off between
increased delay with increased levels of smoothing (for larger
aggregation levels a) and the resulting reduced rate variability and corresponding
increased number of supported streams with statistical multi-
plexing. Specifically, we identified the basic smoothing levels
that ensure that (i) more H.264 SVC than H.264/AVC streams
are supported with statistical multiplexing, and that (ii) the
number of H.264 SVC streams and H.264/AVC streams sup-
ported with basic smoothing closely approaches the number
of streams supported with optimal smoothing. Moreover, we
identified the sizes of the multiplexer buffers that ensure that
the numbers of supported H.264 SVC streams and H.264/AVC
streams approach the theoretical maximum given by the link
capacity divided by the average stream bit rate; we found that
H.264 SVC requires roughly twice the multiplexer buffer of
H.264/AVC, which in turn requires twice the buffer of MPEG-4
Part 2.
There are numerous directions for future research on the
statistical multiplexing of H.264 SVC and H.264/AVC encoded
video. One important direction is examining collaborative
smoothing strategies and active buffer management strategies
considering the frame playout deadlines for H.264/AVC and
H.264 SVC encoded video.
APPENDIX
DELAY ANALYSIS OF SMOOTHED TRANSMISSION OF H.264
SVC AND H.264/AVC VIDEO
In this Appendix we analyze the end-to-end delay introduced
by the video encoding and decoding in conjunction with the
smoothing of the video frame sizes for network transport. We
initially consider live video and evaluate the time shift between
the capture of a frame at the sender and the display of the frame
at the receiver; we subsequently examine prerecorded video.
Throughout, we normalize time by the frame period (33 ms for
NTSC video). (For all delays reported in units of frame periods,
the corresponding delays in units of seconds are obtained by di-
viding the delay in units of frame periods by the frame rate F
in units of frames/second, which is F = 30 frames/second for
NTSC video.) In general, the time shift between frame cap-
ture and display can be decomposed into the following compo-
nents:
d_dep: Delay introduced due to the dependencies of the
encoded frames, i.e., maximum delay a given captured
frame experiences due to waiting for the capture of subse-
quent frames that are needed for the encoding of the given
captured frame.
d_enc: Delay introduced by the computations needed
for the encoding.
d_trans: Delay introduced by the smoothed transmission.
d_dec: Delay introduced by the computations needed
for the decoding of a frame.
d_reord: Delay introduced by reordering of frames to ensure
an uninterrupted display sequence.
The total end-to-end delay D is obtained by summing the delay
components
D = d_dep + d_enc + d_trans + d_dec + d_reord     (6)
For each of the following delay analyses we initially suppose
that the encoding computations and the decoding computations
each take one frame period per frame; we sub-
sequently consider the cases when computation times become
negligible. Throughout, we suppose that it takes one frame pe-
riod to transmit one (unsmoothed) frame, and a frame periods to
transmit a block of a smoothed frames, as is consistent with the
evaluation of the aggregate bit rate in (3). We note that the trans-
mission of unsmoothed video is equivalent to basic smoothing
with the aggregation level of one frame, i.e., a = 1. We let β,
β ≥ 1, denote the number of B frames between successive key
pictures (I or P frames).
A. Live Video With Classical B Frames
Fig. 9 illustrates the delay structure for live streaming of
H.264/AVC video encoded with classical B frames for GoP
structure G16-B3, i.e., $\beta = 3$. The capture time index axis represents the frame type (I, P, or B) that is used to encode each captured frame. Each frame is designated by its frame type and its capture time, e.g., $P_4$ is the frame captured at time index four and is encoded as a P frame. We suppose that the capture
time itself is infinitesimally short and negligible. On the encode
time axis, the frames are put in encoding order according to the motion compensated prediction frame dependencies, which are indicated by the arrows above the capture time axis. The time shift between the capture and encode axes represents the delay due to the encoding dependencies $T_{\mathrm{dep}}$. Specifically, we observe that $T_{\mathrm{dep}} = \beta = 3$, since frame $B_1$ needs to wait for the capture of frame $P_4$ before frame $B_1$ can be encoded.
The time shift between the encode and transport time axes represents the delay due to encoding computations $T_{\mathrm{enc}}$.
For example, the frame $P_4$ is encoded in between time indices 4 and 5, followed by the encoding of the frames $B_1$, $B_2$, and $B_3$ that depend on $I_0$ and $P_4$. In general, with smoothed transmission, all frames of a smoothing block need to be encoded before transmission of the block can commence, hence $T_{\mathrm{enc}} = a$.
Subsequently, the encoded frames are transmitted in encoding
order since the decoder needs the frames in encoding order for
the decoding process to run without introducing unnecessary re-
ordering delays. In the illustrated unsmoothed example, frame $P_4$ is transmitted between time indices 5 and 6, while for the illustrated smoothed transmission example, the first block of frames is transmitted over $a$ frame periods once all frames of the block have been encoded. Generally, noting that the decoding can only start when the entire block of frames is received, we obtain $T_{\mathrm{trans}} = a$, which is represented by the time shift between
the transport and decode axes in the illustration in Fig. 9. We do
not consider store-and-forward transmission delays nor propa-
gation, queueing, or processing delays in the transport network;
these delays could be subsumed in $T_{\mathrm{trans}}$ in straightforward fashion. In particular, for buffered multiplexing of unsmoothed video ($a = 1$), as considered in Section V, the transmission delay in frame periods with a single buffered multiplexing stage on the transmission path is bounded by one frame period (for the transmission by the sending host) plus the maximum buffer delay, namely the buffer capacity $B$ [in bit] divided by the link bit rate $C$ [in bit/s] and normalized with the frame rate $F$ [in frames/s], i.e., $T_{\mathrm{trans}} \leq 1 + B F / C$.
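The following sketch evaluates this bound numerically; the buffer size and link rate used in the example are hypothetical and are not taken from the evaluation in this paper.

```python
def buffered_mux_transmission_delay_bound(buffer_bits: float,
                                          link_bit_rate_bps: float,
                                          frame_rate_fps: float = 30.0) -> float:
    """Upper bound on the transmission delay, in frame periods, for unsmoothed
    video (a = 1) passing through a single buffered multiplexing stage:
    one frame period for the sending host plus the maximum buffer delay B*F/C."""
    return 1.0 + buffer_bits * frame_rate_fps / link_bit_rate_bps


# Hypothetical example: a 4 Mbit multiplexer buffer drained at 45 Mbit/s.
print(buffered_mux_transmission_delay_bound(4e6, 45e6))  # roughly 3.67 frame periods
```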
Next, the decoder processes frame $P_4$ in between time indices 6 and 7 in Fig. 9(a); generally, $T_{\mathrm{dec}} = 1$. In addition, the receiver needs to reorder the decoded frames into display order to ensure uninterrupted playback. This reordering introduces one frame period of delay, i.e., $T_{\mathrm{reord}} = 1$, since frame $B_1$ in Fig. 9(a) is not available for display until time instant 8.
In summary, we obtain for live video with classical B frames
$T = T_{\mathrm{dep}} + T_{\mathrm{enc}} + T_{\mathrm{trans}} + T_{\mathrm{dec}} + T_{\mathrm{reord}} = \beta + a + a + 1 + 1 = \beta + 2a + 2$ frame periods.   (7)
We remark that we have not included the first I frame in
the data blocks for basic smoothing. Alternatively, this I frame
can be included and the non-overlapping blocks would shift
one frame index to the left without any implications for the
end-to-end delay. The advantage of not including the first I
frame is that the first block already contains a large P frame.
Singling out the first I frame allows for spreading its trans-
mission over multiple frame periods if the I frame is encoded
immediately when it is captured. For example, in Fig. 9(a) the
first I frame can be transmitted over four frame periods, if it is
immediately encoded after time index zero.
We briefly adapt the above delay analysis to scenarios with negligible encoding and/or decoding computation times as follows. We focus on scenarios where either $a/(\beta+1)$ or $(\beta+1)/a$ is an integer. If an arbitrary number of video frames can be encoded in negligible time, $T_{\mathrm{enc}} = 0$, then the delay due to frame encoding dependencies becomes $T_{\mathrm{dep}} = \max(\beta, a-1)$. To see this, note that two conditions need to be met before transmitting the first block of encoded frames: (i) the first B frame needs to await the capture of the successive P frame, i.e., for $\beta$ frame periods, and (ii) the first frame to be transmitted in a smoothing block needs to await the capture of the remaining $a-1$ frames for the block. If an arbitrary number of video frames can be decoded in negligible time, $T_{\mathrm{dec}} = 0$, then the display reordering delay becomes $T_{\mathrm{reord}} = \mathbf{1}(a \leq \beta)$, where $\mathbf{1}(\cdot)$ denotes the indicator function, which is one if its argument is true, and zero otherwise.
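A minimal sketch of (7) and its negligible-computation variants, using the notation $a$ and $\beta$ from above; the function name is ours and the smoothing level in the example is illustrative.

```python
def live_delay_classical_b(beta: int, a: int,
                           negligible_encoding: bool = False,
                           negligible_decoding: bool = False) -> int:
    """End-to-end delay, in frame periods, for live video with classical B frames
    (beta B frames between key pictures) and basic smoothing over blocks of a frames."""
    # Dependency delay: the first B frame waits beta frame periods for its P frame;
    # with negligible encoding time the wait for the rest of the block also matters.
    t_dep = max(beta, a - 1) if negligible_encoding else beta
    t_enc = 0 if negligible_encoding else a        # all a frames encoded before sending
    t_trans = a                                    # the block is transmitted over a periods
    t_dec = 0 if negligible_decoding else 1
    # Reordering: one frame period, except that it vanishes when decoding is
    # negligible and a smoothing block spans a whole key-picture-to-key-picture group.
    t_reord = (1 if a <= beta else 0) if negligible_decoding else 1
    return t_dep + t_enc + t_trans + t_dec + t_reord


# G16-B3 (beta = 3): unsmoothed (a = 1) and, illustratively, smoothing over a = 16 frames.
print(live_delay_classical_b(3, 1))    # 7 frame periods
print(live_delay_classical_b(3, 16))   # 37 frame periods
```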
B. Live Video With Hierarchical B Frames
We consider hierarchical B frames with a dyadic structure, i.e., $\beta = 2^k - 1$ B frames between key pictures for some integer $k \geq 1$. We do not consider low-delay or constrained delay B
frame prediction structures [2].
Reasoning as above, along the illustration in Fig. 10, we find that the delay components $T_{\mathrm{dep}}$, $T_{\mathrm{enc}}$, $T_{\mathrm{trans}}$, and $T_{\mathrm{dec}}$ are identical to the above case of classical B frames.
Note however the hierarchical B frame dependency structure,
which is indicated with arrows above the capture time axis, and
the encoding order of the frames on the encode time axis, which
results in minimal reordering delay for the display process [77].
Importantly, we note that due to the hierarchical dependencies between B frames, the reordering delay for achieving the display sequence depends on the number of temporal levels, i.e., $T_{\mathrm{reord}} = \log_2(\beta + 1)$. In summary,
$T = \beta + a + a + 1 + \log_2(\beta + 1) = \beta + 2a + 1 + \log_2(\beta + 1)$ frame periods.   (8)
In Table VI, we summarize the delays for the H.264 SVC
(G16-B15) and H.264/AVC (G16-B3) streams considered in this
study. The end-to-end delays for the H.264 SVC traffic are 15
frame periods larger than for H.264/AVC, which is attributable
to the hierarchical B frame prediction structure. In particular,
with the G16-B15 hierarchical B prediction structure, which re-
sults in improved RD performance, the encoder has to wait until
the frame with time index 16 is captured before it can encode
this frame as an I frame and start encoding all 15 preceding hi-
erarchical B frames. In addition, the reordering delay increases
to four frame periods with the considered RD efficient hierar-
chical B frame structure.
For scenarios with negligible encoding and/or decoding times, as well as either $a/(\beta+1)$ or $(\beta+1)/a$ an integer, we adapt the preceding analysis as follows. With negligible encoding time, $T_{\mathrm{enc}} = 0$, the encoding dependency delay becomes $T_{\mathrm{dep}} = \max(\beta, a-1)$, similar to the case of classical B frames. For negligible decoding time, $T_{\mathrm{dec}} = 0$, the smoothed transmission and display reordering delay become together
$T_{\mathrm{trans}} + T_{\mathrm{reord}} = a + \log_2(\beta + 1) \cdot \mathbf{1}(a \leq \beta)$.   (9)
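A companion sketch of (8) under the same assumptions; it also reproduces the 15-frame-period gap between the G16-B15 and G16-B3 live streams noted above for Table VI. The function names and the smoothing level shown are illustrative.

```python
import math


def live_delay_hierarchical_b(beta: int, a: int) -> int:
    """End-to-end delay, in frame periods, for live video with dyadic hierarchical
    B frames (beta = 2**k - 1 B frames between key pictures), one-frame-period
    encoding/decoding computations, and basic smoothing over blocks of a frames."""
    levels = math.log2(beta + 1)                 # number of temporal levels
    assert levels.is_integer(), "dyadic structure requires beta = 2**k - 1"
    return beta + 2 * a + 1 + int(levels)        # equation (8)


def live_delay_classical_b(beta: int, a: int) -> int:
    return beta + 2 * a + 2                      # equation (7)


# Delay gap between H.264 SVC (G16-B15) and H.264/AVC (G16-B3) live streams,
# e.g., for smoothing over a = 16 frames: 15 frame periods, independent of a.
print(live_delay_hierarchical_b(15, 16) - live_delay_classical_b(3, 16))  # 15
```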
C. Prerecorded Video
For prerecorded video, all frames are preencoded, leaving only the smoothed transport, decoding, and display reordering delays, i.e., $T_{\mathrm{dep}} = T_{\mathrm{enc}} = 0$ and
$T = T_{\mathrm{trans}} + T_{\mathrm{dec}} + T_{\mathrm{reord}}$.   (10)
Effectively, for prerecorded video, the transmission of the first smoothing block is shifted to the beginning of the frame sequence on the transport axis in the illustrations in Figs. 9 and 10. Specifically, we obtain for classical B frames
$T = a + 1 + 1 = a + 2$   (11)
and for hierarchical B frames
$T = a + 1 + \log_2(\beta + 1)$.   (12)
Table VII gives the delays for prerecorded H.264/AVC
(G16-B3) and H.264 SVC (G16-B15) streams. The end-to-end
delays for the H.264 SVC traffic are three frame periods larger
than for H.264/AVC, which is a smaller difference than for live
video in Table VI.
The delays for optimal smoothing of prerecorded video with the (additional) startup delay of $w$ frame periods (defined in [10]) are obtained by replacing $a$ by $w$ in (11) and (12). This is because optimal smoothing is designed to deliver the first frame within $w$ frame periods to the decoder, and to then ensure that for each subsequent frame period the next frame is available for decoding. For the examples in Table VII, the delay for optimally smoothed prerecorded traffic is one to fifteen frame periods smaller than for basic smoothed prerecorded traffic; optimal smoothing is, however, much more computationally demanding than basic smoothing.
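A final sketch evaluates the prerecorded-video delays (11) and (12) and the substitution of the aggregation level $a$ by the optimal-smoothing startup delay $w$; the values of $a$ and $w$ in the example are illustrative.

```python
import math


def prerecorded_delay(beta: int, a: int, hierarchical: bool) -> int:
    """End-to-end delay, in frame periods, for prerecorded video with basic
    smoothing over blocks of a frames: transmission (a) plus decoding (1) plus
    display reordering (1 for classical B frames, log2(beta + 1) temporal levels
    for dyadic hierarchical B frames), per equations (11) and (12)."""
    reordering = int(math.log2(beta + 1)) if hierarchical else 1
    return a + 1 + reordering


def optimally_smoothed_delay(beta: int, startup_w: int, hierarchical: bool) -> int:
    """Delay with optimal smoothing: replace the aggregation level a by the
    startup delay w of [10] in (11)/(12)."""
    return prerecorded_delay(beta, startup_w, hierarchical)


# Prerecorded H.264/AVC (G16-B3) vs. H.264 SVC (G16-B15), here with a = 16:
avc = prerecorded_delay(3, 16, hierarchical=False)    # 18 frame periods
svc = prerecorded_delay(15, 16, hierarchical=True)    # 21 frame periods
print(svc - avc)                                      # 3 frame periods
print(optimally_smoothed_delay(15, 8, hierarchical=True))  # w = 8 is hypothetical
```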
ACKNOWLEDGMENT
We are grateful to Prof. Lina Karam, Arizona State Univer-
sity, for insightful discussions on video coding, and to Pras-
anth T. David for developing the automated scheduler of the
encoding jobs.
REFERENCES
[1] H.-C. Huang, W.-H. Peng, T. Chiang, and H.-M. Hang, “Advances in
the scalable amendment of H.264/AVC,” IEEE Communications Mag-
azine, vol. 45, no. 1, pp. 68–76, Jan. 2007.
[2] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable
video coding extension of the H.264/AVC standard,” IEEE Trans. Cir-
cuits and Systems for Video Technology, vol. 17, no. 9, pp. 1103–1120,
Sep. 2007.
[3] M. Wien, H. Schwarz, and T. Oelbaum, “Performance analysis of
SVC,” IEEE Trans. Circuits and Systems for Video Technology, vol.
17, no. 9, pp. 1194–1203, Sep. 2007.
[4] D. Marpe, T. Wiegand, and G. Sullivan, “The H.264/MPEG–4 ad-
vanced video coding standard and its applications,” IEEE Communi-
cations Magazine, vol. 44, no. 8, pp. 134–143, Aug. 2006.
[5] G. Van der Auwera, P. T. David, and M. Reisslein, “Traffic charac-
teristics of H.264/AVC variable bit rate video,” IEEE Communications
Magazine, vol. 46, no. 11, pp. 164–174, Nov. 2008.
[6] T. Lakshman, A. Ortega, and A. Reibman, “VBR video: Tradeoffs and
potentials,” Proceedings of the IEEE, vol. 86, no. 5, pp. 952–973, May
1998.
[7] A. R. Reibman and M. T. Sun, Compressed Video over Networks.
New York: Marcel Dekker, 2000.
[8] D. Wu, Y. Hou, W. Zhu, Y.-Q. Zhang, and J. Peha, “Streaming video
over the internet: Approaches and directions,” IEEE Trans. Circuits and
Systems for Video Technology, vol. 11, no. 3, pp. 282–300, Mar. 2001.
[9] G. Van der Auwera, P. T. David, and M. Reisslein, “Traffic and
quality characterization of single-layer video streams encoded with
the H.264/MPEG–4 Advanced Video Coding standard and Scalable
Video Coding extension,” IEEE Trans. Broadcasting, vol. 54, no. 3,
pp. 698–718, Sep. 2008.
[10] J. Salehi, Z.-L. Zhang, J. Kurose, and D. Towsley, “Supporting stored
video: Reducing rate variability and end–to–end resource requirements
through optimal smoothing,” IEEE/ACM Trans. Networking, vol. 6, no.
4, pp. 397–410, Aug. 1998.
[11] A. R. Reibman and A. W. Berger, “Traffic descriptors for VBR video
teleconferencing over ATM networks,” IEEE/ACM Trans. Networking,
vol. 3, no. 3, pp. 329–339, Jun. 1995.
[12] G. Van der Auwera, M. Reisslein, and L. J. Karam, “Video texture
and motion based modeling of rate variability-distortion (VD) curves,”
IEEE Trans. Broadcasting, vol. 53, no. 3, pp. 637–648, Sep. 2007.
[13] A. Ortega and K. Ramachandran, “Rate-distortion methods for image
and video compression,” IEEE Signal Processing Magazine, vol. 15,
no. 6, pp. 23–50, Nov. 1998.
[14] P. Seeling and M. Reisslein, “The rate variability-distortion (VD)
curve of encoded video and its impact on statistical multiplexing,”
IEEE Trans. Broadcasting, vol. 51, no. 4, pp. 473–492, Dec. 2005.
[15] A. Alheraish, S. Alshebeili, and T. Alamri, “A GACS modeling ap-
proach for MPEG broadcast video,” IEEE Trans. Broadcasting, vol.
50, no. 2, pp. 132–141, Jun. 2004.
[16] N. Ansari, H. Liu, Y. Q. Shi, and H. Zhao, “On modeling MPEG video
traffics,” IEEE Trans. Broadcasting, vol. 48, no. 4, pp. 337–347, Dec.
2002.
[17] D. P. Heyman and T. V. Lakshman, “Source models for VBR broadcast
video traffic,” IEEE/ACM Trans. Networking, vol. 4, no. 1, pp. 40–48,
Jan. 1996.
[18] X.-D. Huang, Y.-H. Zhou, and R.-F. Zhang, “A multiscale model for
MPEG-4 varied bit rate video traffic,” IEEE Trans. Broadcasting, vol.
50, no. 3, pp. 323–334, Sep. 2004.
[19] M. M. Krunz and A. M. Makowski, “Modeling video traffic using M/G/∞ input processes: A compromise between Markovian and
LRD models,” IEEE Journal on Selected Areas in Communications,
vol. 16, pp. 733–748, Jun. 1998.
[20] D. Marpe, T. Wiegand, and S. Gordon, “H.264/MPEG-4 AVC Fidelity
Range Extensions: Tools, profiles, performance, and application
areas,” in Proc. IEEE Int. Conf. on Image Proc. (ICIP), Sep. 2005, pp.
593–596.
[21] A. Undheim, Y. Lin, and P. Emstad, “Characterization of slice-based
H.264/AVC encoded video traffic,” in Proceedings of Fourth European
Conference on Universal Multiservice Networks (ECUMN), Feb. 2007,
pp. 263–272.
[22] H.-H. Juan, H.-C. Huang, C. Huang, and T. Chiang, “Scalable video
streaming over mobile WiMAX,” in Proceedings of IEEE Int. Sympo-
sium on Circuits and Systems (ISCAS), May 2007, pp. 3463–3466.
[23] P. Li, W. Lin, S. Rahardja, X. Lin, X. Yang, and Z. Li, “Geometrically
determining the leaky bucket parameters for video streaming over con-
stant bit-rate channels,” Signal Processing: Image Communication, vol.
20, no. 2, pp. 193–204, Feb. 2005.
[24] D. T. Nguyen and J. Ostermann, “Congestion control for scalable
video streaming using the scalability extension of H.264/AVC,” IEEE
Journal of Selected Topics in Signal Processing, vol. 1, no. 2, pp.
246–253, Aug. 2007.
[25] T. Ozcelebi, A. Tekalp, and M. Civanlar, “Delay-distortion optimiza-
tion for content-adaptive video streaming,” IEEE Trans. Multimedia,
vol. 9, no. 4, pp. 826–836, Jun. 2007.
[26] M. van der Schaar, Y. Andreopoulos, and Z. Hu, “Optimized scalable
video streaming over IEEE 802.11a/e HCCA wireless networks under
delay constraints,” IEEE Trans. Mobile Computing, vol. 5, no. 6, pp.
755–768, Jun. 2006.
[27] T. Schierl, K. Ganger, C. Hellge, T. Wiegand, and T. Stockhammer,
“SVC-based multisource streaming for robust video transmission in
mobile ad hoc networks,” IEEE Wireless Communications, vol. 13, no.
5, pp. 96–103, Oct. 2006.
[28] A. Puri, X. Chen, and A. Luthra, “Video coding using the
H.264/MPEG-4 AVC compression standard,” Signal Processing: Image Communication, vol. 19, no. 9, pp.
793–849, Oct. 2004.
[29] J. Ostermann, J. Bormans, P. List, D. Marpe, M. Narroschke, F. Pereira,
T. Stockhammer, and T. Wedi, “Video coding with H.264/AVC: Tools,
performance and complexity,” IEEE Circuits and Systems Magazine,
vol. 4, no. 1, pp. 7–28, First Quarter, 2004.
[30] G. Sullivan, P. Topiwala, and A. Luthra, “The H.264/AVC advanced
video coding standard: Overview and introduction to the fidelity range
extensions,” in Proc. of SPIE 5558, Conference on Applications of
Digital Image Processing XXVII, Special Session on Advances in
New Emerging Standard: H.264/AVC I, Denver, CO, Aug. 2004, pp.
454–474.
[31] Information Technology–Generic Coding of Audio-Visual Ob-
jects–Part 2: Visual, Final Proposed Draft Amendment 1, ISO/IEC
JTC 1/SC 29/WG 11 N2802, Geneva, Jul. 1999.
[32] P. Seeling, M. Reisslein, and B. Kulapala, “Network performance eval-
uation with frame size and quality traces of single-layer and two-layer
video: A tutorial,” IEEE Communications Surveys and Tutorials, vol. 6, no. 3, pp. 58–78, Third Quarter 2004. Video traces available online at http://trace.eas.asu.edu.
[33] S. Bakiras and V. O. K. Li, “Maximizing the number of users in an
interactive video-on-demand system,” IEEE Trans. Broadcasting, vol.
48, no. 4, pp. 281–292, Dec. 2002.
[34] P. Koutsakis and M. Paterakis, “Policing mechanisms for the transmis-
sion of videoconference traffic from MPEG-4 and H.263 video coders
in wireless ATM networks,” IEEE Trans. Vehicular Technology, vol.
53, no. 5, pp. 1525–1530, 2004.
[35] B. Nikolaus, J. Ott, C. Borrmann, and U. Borrmann, “Generalized
greedy broadcasting for efficient media-on-demand transmissions,”
IEEE Trans. Broadcasting, vol. 51, no. 3, pp. 354–359, 2005.
[36] J. Roberts, “Internet traffic, QoS, and pricing,” Proceedings of the
IEEE, vol. 92, no. 9, pp. 1389–1399, 2004.
[37] Y. Xu and R. Guerin, “Individual QoS versus aggregate QoS: A loss
performance study,IEEE/ACM Trans. Networking, vol. 13, no. 2, pp.
370–383, 2005.
[38] X.-D. Huang, Y.-H. Zhou, and R.-F. Zhang, “A multiscale model for
MPEG-4 varied bit rate video traffic,” IEEE Trans. Broadcasting, vol.
50, no. 3, pp. 323–334, Sep. 2004.
[39] C. H. Liew, C. K. Kodikara, and A. M. Kondoz, “MPEG-encoded vari-
able bit-rate video traffic modelling,” IEE Proceedings Communica-
tions, vol. 152, no. 5, pp. 749–756, Oct. 2005.
[40] U. K. Sarkar, S. Ramakrishnan, and D. Sarkar, “Modeling full-length
video using Markov-modulated gamma-based framework,” IEEE/ACM
Trans. Networking, vol. 11, no. 4, pp. 638–649, Aug. 2003.
[41] U. K. Sarkar, S. Ramakrishnan, and D. Sarkar, “Study of long duration
MPEG-trace segmentation methods for developing frame size based
traffic models,” Computer Networks, vol. 44, no. 2, pp. 177–188, 2004.
[42] M. Dai and D. Loguinov, “Analysis and modeling of MPEG-4 and
H.264 multi-layer video traffic,” in Proc. of IEEE INFOCOM, Miami,
FL, Mar. 2005, pp. 2257–2267.
[43] D. Fiems, V. Inghelbrecht, B. Steyaert, and H. Bruneel, “Markovian
characterization of H.264/SVC scalable video,” in Proceedings of 15th
Int. Conference on Analytical and Stochastic Modeling Techniques and
Applications (ASMTA), Jun. 2008, Lecture Notes in Computer Science
5055, pp. 1–15.
[44] S. Kempken and W. Luther, “Modeling of H.264 high definition video
traffic using discrete-time semi-Markov processes,” in Proceedings of
20th Int. Teletraffic Congress (ITC), Jun. 2007, Lecture Notes in Com-
puter Science 4516, pp. 42–53.
[45] C. Bewick, R. Pereira, and M. Merabti, “Network constrained
smoothing: Enhanced multiplexing of MPEG-4 video,” in Pro-
ceedings of IEEE International Symposium on Computers and
Communications, Taormina, Italy, Jul. 2002, pp. 114–119.
[46] H.-C. Chao, C. L. Hung, and T. G. Tsuei, “ECVBA traffic-smoothing
scheme for VBR media streams,” International Journal of Network
Management, vol. 12, pp. 179–185, 2002.
[47] W.-C. Feng and J. Rexford, “Performance evaluation of smoothing al-
gorithms for transmitting prerecorded variable-bit-rate video,” IEEE
Trans. Multimedia, vol. 1, no. 3, pp. 302–312, Sep. 1999.
[48] T. Gan, K.-K. Ma, and L. Zhang, “Dual-plan bandwidth smoothing
for layer-encoded video,” IEEE Trans. Multimedia, vol. 7, no. 2, pp.
379–392, Apr. 2005.
[49] C.-D. Iskander and R. T. Mathiopoulos, “Online smoothing of VBR
H.263 video for the CDMA2000 and IS-95B uplinks,” IEEE Trans.
Multimedia, vol. 6, no. 4, pp. 647–658, Aug. 2004.
[50] M. Krunz, W. Zhao, and I. Matta, “Scheduling and bandwidth allo-
cation for distribution of archived video in VoD systems,” Journal of
Telecommunication Systems, Special Issue on Multimedia, vol. 9, no.
3/4, pp. 335–355, Sep. 1998.
[51] M. Krunz, “Bandwidth allocation strategies for transporting vari-
able–bit–rate video traffic,” IEEE Communications Magazine, vol. 37,
no. 1, pp. 40–46, Jan. 1999.
[52] H. Lai, J. Y. Lee, and L.-K. Chen, “A monotonic-decreasing rate sched-
uler for variable-bit-rate video streaming,” IEEE Trans. Circuits and
Systems for Video Technology, vol. 15, no. 2, pp. 221–231, Feb. 2005.
[53] A. Solleti and K. J. Christensen, “Efficient transmission of stored
video for improved management of network bandwidth,” International
Journal of Network Management, vol. 10, pp. 277–288, 2000.
[54] B. Vandalore, W.-C. Feng, R. Jain, and S. Fahmy, “A survey of applica-
tion layer techniques for adaptive streaming of multimedia,” Real-Time
Imaging Journal, vol. 7, no. 3, pp. 221–235, 2001.
[55] D. Ye, J. Barker, Z. Xiong, and W. Zhu, “Wavelet-based VBR video
traffic smoothing,” IEEE Trans. Multimedia, vol. 6, no. 4, pp. 611–623,
Aug. 2004.
[56] Z. Zhang, J. Kurose, J. Salehi, and D. Towsley, “Smoothing, statistical
multiplexing and call admission control for stored video,” IEEE
Journal on Selected Areas in Communications, vol. 13, no. 6, pp.
1148–1166, Aug. 1997.
[57] Z. Antoniou and I. Stavrakakis, “An efficient deadline-credit-based
transport scheme for prerecorded semisoft continuous media applica-
tions,” IEEE/ACM Trans. Networking, vol. 10, no. 5, pp. 630–643, Oct.
2002.
[58] J. C. H. Yuen, E. Chan, and K.-Y. Lam, “Real time video frames allo-
cation in mobile networks using cooperative pre-fetching,” Multimedia
Tools and Applications, vol. 32, no. 3, pp. 329–352, Mar. 2007.
[59] Y.-W. Leung and T. K. C. Chan, “Design of an interactive video-on-
demand system,” IEEE Trans. Multimedia, vol. 5, no. 1, pp. 130–140,
Mar. 2003.
[60] F. Li and I. Nikolaidis, “Trace-adaptive fragmentation for periodic
broadcast of VBR video,” in Proceedings of 9th International Work-
shop on Network and Operating Systems Support for Digital Audio
and Video (NOSSDAV), Basking Ridge, NJ, Jun. 1999, pp. 253–264.
[61] C.-S. Lin, M.-Y. Wu, and W. Shu, “Transmitting variable-bit-rate
videos on clustered VOD systems,” in Proceedings of IEEE Interna-
tional Conference on Multimedia and Expo (ICME), New York, Jul.
2000.
[62] S. Oh, Y. Huh, B. Kulapala, G. Konjevod, A. W. Richa, and M.
Reisslein, “A modular algorithm-theoretic framework for the fair and
efficient collaborative prefetching of continuous media,” IEEE Trans.
Broadcasting, vol. 51, no. 2, pp. 200–215, Jun. 2005.
[63] S. Oh, B. Kulapala, A. W. Richa, and M. Reisslein, “Continuous-time
collaborative prefetching of continuous media,” IEEE Trans. Broad-
casting, vol. 54, no. 1, pp. 36–52, Mar. 2008.
[64] M. Reisslein and K. W. Ross, “High–performance prefetching proto-
cols for VBR prerecorded video,” IEEE Network, vol. 12, no. 6, pp.
46–55, Nov./Dec. 1998.
[65] S. Racz, T. Jakabfy, J. Farkas, and C. Antal, “Connection admission
control for flow level QoS in bufferless models,” in Proc. IEEE IN-
FOCOM, 2005, pp. 1273–1282.
[66] M. Reisslein and K. W. Ross, “Call admission for prerecorded sources
with packet loss,” IEEE Journal on Selected Areas in Communications,
vol. 15, no. 6, pp. 1167–1180, Aug. 1997.
[67] J. Roberts, U. Mocci, and J. Virtamo, Broadband Network Traffic: Per-
formance Evaluation and Design of Broadband Multiservice Networks,
Final Report of Action COST 242. New York: Springer Verlag, 1996,
vol. 1155, Lecture Notes in Computer Science.
[68] A. Elwalid, D. Heyman, T. Lakshman, D. Mitra, and A. Weiss, “Fun-
damental bounds and approximations for ATM multiplexers with ap-
plications to video teleconferencing,” IEEE Journal on Selected Areas
in Communications, vol. 13, no. 6, pp. 1004–1016, Aug. 1995.
[69] F. Y.-S. Lin, “Optimal real-time admission control algorithms for the
video-on-demand (VOD) service,” IEEE Trans. Broadcasting, vol. 44,
no. 4, pp. 402–408, Dec. 1998.
[70] N. Shroff and M. Schwartz, “Improved loss calculations at an ATM
multiplexer,” IEEE/ACM Trans. Networking, vol. 6, no. 4, pp. 411–421,
Aug. 1998.
[71] J. McManus and K. Ross, “Video-on-demand over ATM: Constant-
rate transmission and transport,” IEEE Journal on Selected Areas in
Communications, vol. 14, no. 6, pp. 1087–1098, Aug. 1996.
[72] S. Sen, J. L. Rexford, J. K. Dey, J. F. Kurose, and D. F. Towsley, “On-
line smoothing of variable-bit-rate streaming video,” IEEE Trans. Mul-
timedia, vol. 2, no. 1, pp. 37–48, Mar. 2000.
[73] Y. Bai and M. Ito, “Application-aware buffer management: New met-
rics and techniques,” IEEE Trans. Broadcasting, vol. 51, no. 1, pp.
114–121, Mar. 2005.
[74] Y. Huang, R. Guerin, and P. Gupta, “Supporting excess real-time traffic
with active drop queue,” IEEE/ACM Trans. Networking, vol. 14, no. 5, pp.
965–977, Oct. 2006.
[75] G.-M. Muntean, P. Perry, and L. Murphy, “A new adaptive multi-
media streaming system for all-IP multi-service networks,” IEEE
Trans. Broadcasting, vol. 50, no. 1, pp. 1–10, Mar. 2004.
[76] S. Ryu, C. Rump, and C. Qiao, “Advances in internet congestion con-
trol,” IEEE Communications Surveys and Tutorials, vol. 5, no. 1, pp.
28–39, 2003.
[77] H. Schwarz, D. Marpe, and T. Wiegand, “Analysis of hierarchical B
pictures and MCTF,” in IEEE Int. Conf. Multimedia and Expo (ICME),
Toronto, Canada, Jul. 2006, pp. 1929–1932.
Geert Van der Auwera received the Ph.D. degree
in Electrical Engineering from Arizona State Univer-
sity, Tempe, USA, in 2007, and the Belgian MSEE
degree from Vrije Universiteit Brussel (VUB), Brus-
sels, Belgium, in 1997. Presently, he is a Staff Research
Engineer with Samsung Electronics in Irvine, CA.
His research interests are video coding, video traffic
and quality characterization, video streaming mech-
anisms and protocols. Until the end of 2004, he was
Scientific Advisor with IWT-Flanders, the Institute
for the Promotion of Innovation by Science and Tech-
nology in Flanders, Belgium. In 2000, he joined IWT-Flanders after researching
wavelet video coding at IMEC’s Electronics and Information Processing De-
partment (VUB-ETRO) in Brussels, Belgium. In 1998, his thesis on motion es-
timation in the wavelet domain received the Barco and IBM prizes from the Fund
for Scientific Research of Flanders, Belgium.
Martin Reisslein is an Associate Professor in the De-
partment of Electrical Engineering at Arizona State
University (ASU), Tempe. He received the Dipl.-Ing.
(FH) degree from the Fachhochschule Dieburg, Ger-
many, in 1994, and the M.S.E. degree from the Uni-
versity of Pennsylvania, Philadelphia, in 1996, both in electrical engineering. He received his Ph.D. in
systems engineering from the University of Pennsyl-
vania in 1998. During the academic year 1994-1995
he visited the University of Pennsylvania as a Ful-
bright scholar. From July 1998 through October 2000
he was a scientist with the German National Research Center for Information
Technology (GMD FOKUS), Berlin and lecturer at the Technical University
Berlin. From October 2000 through August 2005 he was an Assistant Professor
at ASU.
He served as editor-in-chief of the IEEE Communications Surveys and Tuto-
rials from January 2003 through February 2007 and has served on the Technical
Program Committees of IEEE Infocom and numerous other networking con-
ferences. He maintains an extensive library of video traces for network perfor-
mance evaluation, including frame size traces of MPEG-4 and H.264 encoded
video, at http://trace.eas.asu.edu. His research interests are in the areas of In-
ternet Quality of Service, video traffic characterization, wireless networking,
optical networking, and engineering education.