GRAPH-BASED TRANSFORMS FOR INTER PREDICTED VIDEO CODING
Hilmi E. Egilmez†, Amir Said∗, Yung-Hsuan Chao† and Antonio Ortega†
†Signal & Image Processing Institute, University of Southern California, Los Angeles, CA, USA
∗Qualcomm Technologies, San Diego, CA, USA
hegilmez@usc.edu, asaid@qti.qualcomm.com, yunghsuc@usc.edu, ortega@sipi.usc.edu
ABSTRACT
In video coding, motion compensation is an essential tool to ob-
tain residual block signals whose transform coefficients are en-
coded. This paper proposes novel graph-based transforms (GBTs)
for coding inter-predicted residual block signals. Our contribu-
tion is twofold: (i) We develop edge adaptive GBTs (EA-GBTs)
derived from graphs estimated from residual blocks, and (ii) we design template adaptive GBTs (TA-GBTs) by introducing simplified graph templates that generate different sets of GBTs with low transform signaling overhead. Our experimental results show that the proposed methods significantly outperform the traditional DCT and KLT in terms of rate-distortion performance.
Index Terms— Transform, signal processing on graphs, graph-
based transforms, video coding, video compression.
1. INTRODUCTION
In video coding standards including HEVC [1], inter-prediction is
a very important building block that significantly improves coding
efficiency by exploiting high temporal redundancy between video
blocks. In general, samples of residual blocks obtained from inter-
prediction have low energy, so their transform coefficients can be
efficiently encoded. However, some residual blocks may have high
energy due to high motion activity and occlusions, so that better en-
ergy compacting transforms are needed to improve coding gains.
Typically, in conventional video coding architectures as shown in
Fig. 1, a fixed transform such as discrete cosine transform (DCT) is
employed to accommodate complexity constraints of encoding. The
main problem of using a fixed block linear transform is the implicit
assumption that all residual blocks have the same isotropic statisti-
cal properties. Yet in practice, residual blocks can have very dif-
ferent statistical characteristics depending on video content. Better
compression can be achieved by using different transforms that can
adapt to the statistical properties of residual blocks. However, such adaptation requires encoding additional side information, called transform signaling overhead, that the decoder uses to identify the transforms chosen at the encoder. Therefore, it is important to design transforms that adapt to common residual block characteristics with low signaling overhead.
This paper presents two different types of transforms exploiting the statistical characteristics of inter-predicted residual blocks. The proposed transforms fall into the category of graph-based transforms (GBTs): we first design a graph capturing signal characteristics observed in inter-predicted residual blocks, and the associated orthogonal transforms are then derived from the designed graph. In our first design, the edge adaptive GBT (EA-GBT),
This work has been supported in part by LG Electronics.
Fig. 1: An overall block diagram for hybrid video coding, using a
combination of predictive and transform coding.
we allow flexible adaptation for each residual block. Firstly, edge de-
tection is performed for each residual block, and based on detected
edges we construct a weighted graph which captures signal variation
characteristics in the block. Then, an EA-GBT is generated using
the weighted graph. Note that, this method can create a large sig-
naling overhead depending on the graph information has to be sent
to the decoder. Our second design proposes template adaptive GBTs
(TA-GBTs) which are derived based on a set of simplified graph
templates capturing basic statistical characteristics of inter-predicted
residual blocks. Thus, graph information can be efficiently sent to
the decoder via signaling indexes of corresponding graph templates.
By selecting different subsets of graph templates, the signaling over-
head can be significantly reduced without losing adaptivity, espe-
cially when a few graph templates are sufficient to capture block
signal characteristics.
In the literature, several adaptive transform approaches have been proposed. Most similar to our work, Shen et al. [2] propose edge adaptive transforms (EATs) specifically for depth map compression. Although our paper adopts some basic concepts originally introduced in [2] for designing EA-GBTs, our graph construction method is different. Hu et al. [3] extend EATs by optimizing weak-link weights for piecewise smooth image compression. In both [2] and [3], the authors propose methods specific to depth map compression, whereas our work focuses on encoding inter-predicted residual blocks. Related to inter-predicted coding, Liu and Flierl [4] propose motion adaptive transforms based on vertex-weighted graphs for coding motion-connected pixels. Their approach is not block based, and in their graph construction, unlike in our work, vertex weights are adjusted using a measure called the motion scale factor. Most of the related recent works are on intra-predicted adaptive transforms. In [5], Takamura and Shimizu develop intra-mode dependent KLTs, and Han et al. [6] introduce a hybrid DCT/ADST transform for intra-
Fig. 2: Graphs (a) connecting each pixel with its four nearest neighboring pixels (4-connected) and (b) connecting each pixel with pixels that are 1-hop away (8-connected).
predicted transform coding. To the best of our knowledge, ours is the first work that proposes GBTs for encoding inter-predicted residual blocks by exploiting their statistical characteristics.
The rest of the paper is organized as follows. In Section 2 we
introduce GBTs. Section 3 discusses inter-predicted residual signal
characteristics used in designing proposed GBTs. In Section 4, the
proposed EA-GBTs and TA-GBTs are described. The experimental
results are presented in Section 5, and Section 6 draws some conclu-
sions based on experimental results.
2. PRELIMINARIES
In graph signal processing [7, 8], signals are supported on an undirected, weighted and connected graph, G(N, E, A), where signal values are attached to the nodes (N) of the graph and its links (E) capture relations among the signal's samples. The adjacency matrix, A, represents the graph's link weights. For a given graph,
G(N , E, A), we define graph-based transforms (GBTs) using its
combinatorial Laplacian,
L = D − A (1)
where D is the diagonal degree matrix. In order to find the GBT as-
sociated with graph G, we perform eigen-decomposition of the graph
Laplacian, that is
L = TΛTᵀ, (2)
where the columns of T are the basis vectors of the corresponding
GBT. Since L is a real symmetric matrix, it has a complete set of
orthonormal eigenvectors.
A graph is completely defined by an adjacency matrix, so we can
create different transforms by designing graph-link weights (i.e., A).
For example as shown in Fig. 2, an image block can be represented
as a graph so that different connectivity patterns lead to different
interpretations in graph transform domain.
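The construction in Eqs. (1)-(2) can be sketched in a few lines of NumPy. This is an illustrative implementation (the function and variable names are our own, not the authors' code):

```python
import numpy as np

def grid_adjacency(n, weight=1.0):
    """Adjacency matrix of a 4-connected n x n pixel grid with uniform link weights."""
    N = n * n
    A = np.zeros((N, N))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            if c + 1 < n:                      # link to right neighbor
                A[i, i + 1] = A[i + 1, i] = weight
            if r + 1 < n:                      # link to bottom neighbor
                A[i, i + n] = A[i + n, i] = weight
    return A

def gbt(A):
    """GBT of a graph: eigendecomposition of L = D - A (Eqs. 1-2)."""
    L = np.diag(A.sum(axis=1)) - A             # combinatorial Laplacian
    eigvals, T = np.linalg.eigh(L)             # eigh: ascending eigenvalues, orthonormal T
    return eigvals, T                          # columns of T are the GBT basis vectors

eigvals, T = gbt(grid_adjacency(8))
coeffs = T.T @ np.ones(64)                     # forward transform of a flat 8 x 8 block
```

Since the Laplacian is real and symmetric, `eigh` returns a complete orthonormal basis with eigenvalues in ascending order; a flat block compacts entirely into the first (DC) coefficient, and for this uniform 4-connected grid the resulting GBT coincides with the traditional 2-D DCT [9].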
3. INTER-PREDICTED RESIDUAL BLOCK SIGNAL
CHARACTERISTICS
In this section, we investigate some statistical properties of inter-
predicted residual blocks that we consider in our transform designs.
In general, inter-predicted residual block signals have small valued
(low energy) samples because of high temporal redundancy among
video blocks. This is very important for effective compression, since
it leads to sparse quantized coefficients which can be encoded effi-
ciently. However, large prediction errors are possible in case of high
motion activity and occlusions, which lead to large transform coefficients requiring more bits for encoding. Based on our observations of residual block signals obtained using the HEVC encoder (HM-14), residual signal samples that are close to the boundaries of the blocks
Fig. 3: Sample variance values calculated over 8 × 8 residual blocks for (a) Harbour and (b) Soccer.
Fig. 4: Similarity graphs for 8 × 8 residual blocks of (a) Harbour and (b) Soccer, where partial correlation values between nearest neighboring pixels are shown.
have larger values mainly because of occlusions leading to partial
mismatches between reference and predicted blocks. Fig. 3 illus-
trates sample variance values calculated over 8 × 8 residual blocks
of Harbour and Soccer sequences.¹ Note that for both sequences,
sample variance (i.e., energy) is larger around the boundaries and
corners of the residual blocks.
Moreover, Fig. 4(a) and (b) show similarity graphs trained for
8 × 8 inter-predicted residual blocks over Harbour and Soccer video
sequences, respectively. As a measure of inter-pixel similarity, par-
tial correlation values are calculated based on the precision matrix,
J, where J is defined as the inverse of the covariance matrix [9],
calculated for each video sequence. The weighted graphs demon-
strate that the similarity between the pixels near boundaries of a
residual block is smaller compared to the pixels around the center
of the block.
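The partial-correlation measure underlying Fig. 4 can be reproduced on toy data. This sketch uses an AR(1) covariance as a stand-in for statistics estimated from actual residual blocks, and the helper name is ours:

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlations rho_ij = -J_ij / sqrt(J_ii * J_jj), where J = cov^{-1}
    is the precision matrix; these serve as link weights of a similarity graph."""
    J = np.linalg.inv(cov)
    d = np.sqrt(np.diag(J))
    rho = -J / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho

# Toy AR(1) covariance (correlation 0.9**|i-j|) standing in for statistics
# estimated over residual blocks of a video sequence.
cov = np.array([[0.9 ** abs(i - j) for j in range(4)] for i in range(4)])
rho = partial_correlations(cov)
# For an AR(1) model only adjacent samples have nonzero partial correlation,
# so the similarity graph reduces to a weighted path.
```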
It is important to note that the statistical characteristics of inter-predicted residuals discussed in this section are not specific to the Harbour and Soccer video sequences. According to our experiments, these characteristics are fairly general and apply to different sequences and residual block sizes. They are exploited in our GBT designs, discussed in the next section.
4. PROPOSED GRAPH-BASED TRANSFORMS
4.1. Edge Adaptive GBT (EA-GBT)
In designing edge adaptive graph-based transforms (EA-GBTs), we first (i) generate a uniformly weighted graph; then (ii) based on differences between pixels (i.e., edges), graph links are pruned or their weights are adjusted (weakened). The transforms associated with the designed graphs can thus exploit different block signal characteristics, so GBTs provide a better representation of residual signals. In particular, we propose the following steps to construct EA-GBTs:
¹We show statistical properties of the Harbour and Soccer sequences, since both have high motion activity.
1. Based on the size of the residual block of interest, we create
a nearest neighbor (4-connected) graph with link weights all
equal to 1 as shown in Fig. 2(a) for 8 × 8 blocks.
2. Given a residual block, we apply the Prewitt operator to calculate gradients in the vertical and horizontal directions.
3. We detect edges by thresholding the gradient values.
4. Depending on the angle (directionality) of an edge, the weights of some graph links are reduced.
5. Weak graph link weights can be chosen in the range [0, 1). Based on our experiments, small weights provide better compression than assigning zero weights (which may lead to disconnected components). To reduce signaling overhead, we experimentally select a single weak link weight, set to 0.001.
6. After designing a graph, the associated GBT is constructed as
discussed in Section 2.
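The steps above can be sketched as follows, under the simplifying assumption that weakening any link whose endpoint pixels differ by more than a threshold approximates the Prewitt-based edge detection of steps 2-4; the threshold value and helper names are illustrative, while the weak weight 0.001 follows step 5:

```python
import numpy as np

WEAK = 0.001  # single weak-link weight (step 5)

def ea_graph(block, thresh):
    """Simplified EA-GBT graph: uniformly weighted 4-connected grid (step 1) whose
    links are weakened when their endpoint pixels differ by more than `thresh`
    (a stand-in for the thresholded Prewitt edge detection of steps 2-4)."""
    n = block.shape[0]
    A = np.zeros((n * n, n * n))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for r2, c2 in ((r, c + 1), (r + 1, c)):   # right and bottom neighbors
                if r2 < n and c2 < n:
                    j = r2 * n + c2
                    diff = abs(float(block[r, c]) - float(block[r2, c2]))
                    w = WEAK if diff > thresh else 1.0
                    A[i, j] = A[j, i] = w
    return A

def ea_gbt(block, thresh=30.0):
    """EA-GBT basis from the edge-adapted graph (step 6)."""
    A = ea_graph(block, thresh)
    L = np.diag(A.sum(axis=1)) - A
    _, T = np.linalg.eigh(L)
    return T

# A block with a sharp vertical edge compacts into very few EA-GBT coefficients.
block = np.zeros((8, 8))
block[:, 4:] = 100.0
T = ea_gbt(block)
c = T.T @ block.flatten()
top2 = np.sort(c ** 2)[::-1][:2].sum() / (c ** 2).sum()
```

Because the weak links keep the graph connected while nearly decoupling the two sides of the edge, almost all of the signal energy falls into the first two basis vectors.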
Fig. 5 illustrates a sample graph design obtained by the procedure above, where link weights of the original 4-connected graph are weakened based on the edges observed in a given residual block. Thus, the transforms associated with the constructed graphs can adapt to different residual block signals. Although the resulting transforms can provide efficient coding of transform coefficients, the overall coding performance may not be sufficient due to the signaling overhead of the graph information, especially if multiple weak link weights are used. To address this problem, we propose to use a single weak link weight so that an edge-map codec such as the arithmetic edge encoder (AEC) [10] can be employed to efficiently send the graph information.
In addition, based on our experiments, signaling graph information for small blocks (e.g., 4 × 4) may result in excessive bit overhead. In order to efficiently encode graph information for such blocks, we propose to combine the graphs obtained from neighboring blocks; the combined graph is then encoded using AEC.
4.2. Template Adaptive GBT (TA-GBT)
In this section, we propose a fixed set of GBTs derived from a set of graph templates, based on the inter-predicted residual signal characteristics discussed in Section 3. The main observation we exploit in our design is that sharp transitions (i.e., most of the energy) appear around the corners of inter-predicted residual blocks. This is mainly due to mismatched regions (i.e., occlusions) in inter-prediction. The basic building blocks of the proposed graph template construction are as follows:
1. We choose a base graph that is a uniformly weighted graph, G_uni; two examples are shown in Fig. 2. In this work, we employ a nearest-neighbor image model, so the 4-connected grid graph is used (see Fig. 2(a)).
2. By adjusting the weights of a subset of links in G_uni, K different graphs are constructed. These graphs are called graph templates {G_k}_{k=1}^K, and they define the GBTs.
3. The statistical properties of inter-predicted residual blocks can be captured by reducing the weights of links in G_uni connecting pixels around the corners of a transform block. Particularly, in this work K = 16 templates are generated by repeating different combinations of a rectangular pattern to denote weak links around the corners of the graph G_uni.
4. For a selected set of graph templates, the associated GBTs are constructed as discussed in Section 2.
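One plausible reading of step 3 is that each of the K = 16 templates corresponds to a subset of the four block corners whose links are weakened (2^4 = 16 combinations). The sketch below follows that assumption; the patch size and the weak weight 0.1 are our own illustrative choices, not values from the paper:

```python
import numpy as np
from itertools import product

def template_graph(n, corners, patch=2, weak=0.1):
    """4-connected grid whose links are weakened inside `patch` x `patch` regions
    at the selected corners; `corners` = (top-left, top-right, bottom-left,
    bottom-right) flags. Patch size and weak weight are illustrative choices."""
    mask = np.zeros((n, n), dtype=bool)
    tl, tr, bl, br = corners
    if tl: mask[:patch, :patch] = True
    if tr: mask[:patch, n - patch:] = True
    if bl: mask[n - patch:, :patch] = True
    if br: mask[n - patch:, n - patch:] = True
    A = np.zeros((n * n, n * n))
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for r2, c2 in ((r, c + 1), (r + 1, c)):   # right and bottom neighbors
                if r2 < n and c2 < n:
                    j = r2 * n + c2
                    w = weak if (mask[r, c] and mask[r2, c2]) else 1.0
                    A[i, j] = A[j, i] = w
    return A

# K = 16 templates, one per subset of the four corners; the template with no
# weakened corner is the uniform grid, whose GBT is the 2-D DCT (template 1).
templates = [template_graph(8, bits) for bits in product((0, 1), repeat=4)]
```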
Fig. 5: An example of edge adaptive graph construction based on a residual block signal: (a) inter-predicted residual block, (b) associated graph. The graph's weak links correspond to sharp transitions (i.e., edges) in the residual block.
Fig. 6: Graph templates for 8×8 blocks with index {1,2,3,...,15,16}.
Fig. 6 shows five of the sixteen graph templates designed for 8 × 8
transform blocks which lead to 16 different GBTs. Similarly, we
also generate 16 transforms for 4 × 4 residual blocks. Note that the
first template corresponds to traditional 2-D DCT [9].
In order to adaptively select the best transform, we introduce a
graph Laplacian based quadratic cost which measures residual signal
variation on a given graph. Formally, for a given residual block signal d, we select the transform whose associated graph representation (G_k) solves the following optimization problem:

minimize_k  dᵀL(G_k)d = dᵀTΛTᵀd = aᵀΛa = ∑_{i=1}^{N} λ_i a_i²   (3)
where L(G_k) is the combinatorial graph Laplacian of graph G_k, a is the vector of transform coefficients, N is the number of samples in the residual block, λ_i denotes the i-th eigenvalue of the graph Laplacian in increasing order (i.e., λ_i ≤ λ_{i+1} for i ∈ {1, ..., N − 1}), and a_i is the transform coefficient associated with λ_i. This criterion is a way of measuring energy compaction: the larger λ_i is, the larger the penalty on its transform coefficient. Since the first eigenvalue, λ_1, is zero [7],

∑_{i=1}^{N} λ_i a_i² = ∑_{i=2}^{N} λ_i a_i²,   (4)

which induces no penalty for the transform coefficient a_1 (i.e., the DC component).
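Template selection by the quadratic cost of Eq. (3) reduces to evaluating dᵀL(G_k)d for each candidate Laplacian and taking the minimum. A minimal sketch on a toy path graph (function names are ours):

```python
import numpy as np

def select_template(d, laplacians):
    """Pick the template minimizing the quadratic cost of Eq. (3):
    lower graph variation means better energy compaction for this block."""
    costs = [float(d @ L @ d) for L in laplacians]
    return int(np.argmin(costs)), costs

def path_laplacian(weights):
    """Combinatorial Laplacian of a path graph with the given link weights."""
    n = len(weights) + 1
    A = np.zeros((n, n))
    for i, w in enumerate(weights):
        A[i, i + 1] = A[i + 1, i] = w
    return np.diag(A.sum(axis=1)) - A

d = np.array([0.0, 0.0, 1.0, 1.0])             # step-shaped residual signal
L_uniform = path_laplacian([1.0, 1.0, 1.0])    # uniform path graph
L_weak = path_laplacian([1.0, 0.1, 1.0])       # weak middle link
idx, costs = select_template(d, [L_uniform, L_weak])
# idx == 1: the template whose weak link matches the signal's discontinuity wins
```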
5. RESULTS
In this section, we compare the rate-distortion (RD) performance of the proposed transforms against the DCT and KLT. In our simulations, we generate residual block signals for five test sequences (Foreman, Mobile, City, Harbour and Soccer) using the HEVC (HM-14) encoder, with transform units fixed to either 4 × 4 or 8 × 8. We test the performance of the different transforms on inter-predicted residual blocks only. After transforming the residual blocks, the transform coefficients are uniformly quantized and then encoded using a symbol-grouping-based arithmetic entropy encoder called AGP, which uses an amplitude group partitioning technique to efficiently encode image transform coefficients [11]. The AGP encoder allows us to fairly compare the rate-distortion performance of different transforms, since AGP can flexibly learn and exploit the amplitude
Table 1: Percentage reduction in bitrate (bits/pixel) with respect to the average bitrate obtained using DCT. Foreman, Mobile and City are 352×288; Harbour and Soccer are 704×576. EA-GBT(RO) is applied to 4 × 4 blocks only.

4 × 4 block transform:
PSNR (dB)  Transform     Foreman   Mobile     City  Harbour   Soccer  Average
32         EA-GBT(RO)     -90.44    18.40     3.98    -0.30     7.28     7.63
32         EA-GBT        -423.02   -21.46   -99.45   -97.66   -66.38   -67.26
32         TA-GBT           0.77    11.76     6.38     4.46     7.67     7.68
32         KLT              2.75     3.32     7.40     2.35     5.42     4.19
34         EA-GBT(RO)       7.19    15.13    18.43    12.57    17.61    15.17
34         EA-GBT         -62.04   -14.37   -36.62   -46.78   -25.45   -30.25
34         TA-GBT           7.61     8.88     5.83     6.02     7.09     6.23
34         KLT             -0.20     1.49     5.38     5.74     4.13     3.40
36         EA-GBT(RO)      20.44    10.76    13.79    10.95    11.93    12.61
36         EA-GBT         -15.31   -12.76   -23.49   -30.20   -18.79   -19.58
36         TA-GBT           7.06     6.76     4.58     0.17     4.97     4.77
36         KLT              2.01     0.58     3.85     4.63     3.28     2.66

8 × 8 block transform:
PSNR (dB)  Transform     Foreman   Mobile     City  Harbour   Soccer  Average
32         EA-GBT        -159.07    13.54    -5.71   -12.64     9.18     0.93
32         TA-GBT           0.41     7.11     7.69     3.98     8.55     6.77
32         KLT              7.24     2.29     7.86     1.77     7.13     4.02
34         EA-GBT           6.99     8.63    12.84     5.75    15.01     9.97
34         TA-GBT           8.30     4.65     3.80     4.35     4.18     4.20
34         KLT              1.34     1.21     4.31     3.73     3.90     2.65
36         EA-GBT          21.26     4.89     7.03     4.88     6.63     7.42
36         TA-GBT           4.19     3.03     1.11     2.17     1.53     1.67
36         KLT             -0.53     0.60     3.30     4.68     3.77     2.24
Fig. 7: Average PSNR vs. BPP results (left) for 4×4 blocks and (right) for 8×8 blocks. EA-GBT(RO) corresponds to the method (only
applied to 4 × 4 blocks) that reduces signaling overhead of EA-GBT by combining graph information at neighboring blocks.
distribution of transform coefficients. For ordering of the quantized coefficients, we employ zig-zag scanning for DCT coefficients, while KLT and GBT coefficients are ordered by descending and ascending eigenvalue, respectively. To signal the transform for EA-GBT, we use the arithmetic edge codec (AEC) [10] to efficiently code the graph information. To further reduce the overhead of graph coding for 4 × 4 blocks (EA-GBT(RO)), we combine the graphs obtained from neighboring blocks, and the resulting larger graph is encoded using AEC. For TA-GBT, the transform indexes
are signaled as the side information. After decoding the quantized
transform coefficients using AGP decoder, we reconstruct the video
blocks and measure PSNR with respect to the original video blocks.
The average RD performances of different transforms are pre-
sented in Fig. 7 in terms of PSNR and total bits spent per-pixel
(BPP) for encoding quantized transform coefficients, motion vectors
and transform signaling overheads. More comprehensive results are
available in Table 1, where we show the percentage bit reduction for each video sequence gained by using GBTs and the KLT at different PSNR values (i.e., 32, 34 and 36 dB) with respect to using the DCT. Average percentage reductions (corresponding to Fig. 7) are also given in Table 1. Note that positive values in the table mean that better RD performance is achieved compared to the DCT. According to these results:
• For 4 × 4 blocks, RD performance of EA-GBT is the worst
among all transforms due to the excessive graph signaling over-
head. However, the signaling overhead of EA-GBT is signif-
icantly reduced by combining the graph information of neigh-
boring blocks (see EA-GBT(RO) in Table 1 and in Fig. 7).
• EA-GBT(RO) and EA-GBT outperform all other transforms at
high-rate coding of 4 × 4 and 8 × 8 blocks, respectively. On the
other hand, TA-GBT provides a reasonable coding gain for both
low-rate and high-rate coding with respect to DCT.
6. CONCLUSIONS
In this paper, we have proposed two novel transforms, EA-GBT and TA-GBT, for inter-predicted residual block signals, and compared their rate-distortion (RD) performance against the traditional DCT and KLT. Inspection of the experimental results leads us to the following conclusions:
• The proposed EA-GBT provides a 9.9% coding gain at 34 dB PSNR with respect to the DCT for 8 × 8 residual blocks. For 4 × 4 blocks, a 15.2% gain can be achieved using EA-GBT(RO). However, at low bitrates corresponding to 30-32 dB PSNR, the graph signaling overhead exceeds the bit reduction gained using EA-GBT. Therefore, we propose to use TA-GBT for coding at low bitrates.
• The proposed TA-GBT nicely captures the characteristics of 4 × 4 residual blocks with low transform signaling overhead. At 34 dB PSNR, it provides a 6.2% bitrate reduction on average with respect to the DCT. For 8 × 8 blocks the reduction is smaller, at 4.2%. Using more graph templates could improve the coding gain, since more distinct signal characteristics can be captured in 8 × 8 or larger blocks.
• For 4 × 4 blocks, it is inefficient to directly send graphs as side information. By exploiting the graph information from neighboring blocks, we show that the signaling overhead can be significantly reduced.
7. REFERENCES
[1] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand,
“Overview of the High Efficiency Video Coding (HEVC) Stan-
dard,” IEEE Trans. Circuits Syst. Video Technol., vol. 22,
no. 12, pp. 1649–1668, Dec. 2012.
[2] G. Shen, W.-S. Kim, S. Narang, A. Ortega, J. Lee, and H. Wey,
“Edge-adaptive transforms for efficient depth map coding,” in
Picture Coding Symposium (PCS), 2010, Dec 2010, pp. 566–
569.
[3] W. Hu, G. Cheung, A. Ortega, and O. Au, “Multi-resolution graph Fourier transform for compression of piecewise smooth images,” IEEE Trans. Image Process., vol. PP, no. 99, pp. 1–1, 2014.
[4] D. Liu and M. Flierl, “Motion-adaptive transforms based on the Laplacian of vertex-weighted graphs,” in Data Compression Conference (DCC), Mar. 2014, pp. 53–62.
[5] S. Takamura and A. Shimizu, “On intra coding using mode
dependent 2D-KLT,” in Proc. 30th Picture Coding Symp., San
Jose, CA, Dec. 2013, pp. 137–140.
[6] J. Han, A. Saxena, V. Melkote, and K. Rose, “Jointly opti-
mized spatial prediction and block transform for video and im-
age coding,” IEEE Trans. Image Process., vol. 21, no. 4, pp.
1874–1884, Apr. 2012.
[7] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and
P. Vandergheynst, “The emerging field of signal processing on
graphs,” IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98,
May 2013.
[8] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing
on graphs,” IEEE Trans. Signal Process., vol. 61, no. 7, pp.
1644–1656, Apr. 2013.
[9] C. Zhang and D. Florencio, “Analyzing the optimality of pre-
dictive transform coding using graph-based models,” IEEE
Signal Process. Lett., vol. 20, no. 1, pp. 106–109, 2013.
[10] I. Daribo, D. Florencio, and G. Cheung, “Arbitrarily shaped motion prediction for depth video compression using arithmetic edge coding,” IEEE Trans. Image Process., vol. 23, no. 11, pp. 4696–4708, Nov. 2014.
[11] A. Said and W. A. Pearlman, “Low-complexity waveform cod-
ing via alphabet and sample-set partitioning,” in SPIE Visual
Communications and Image Processing, 1997, pp. 25–37.