Algorithm performance at different thresholds.

Source publication
Article
The newest video coding standard, the versatile video coding standard (VVC/H.266), came into effect in November 2020. Different from the previous generation standard—high-efficiency video coding (HEVC/H.265)—VVC adopts a more flexible block division structure, the quad-tree with nested multi-type tree (QTMT) structure, which improves its coding per...

Contexts in source publication

Context 1
... order to select the most suitable threshold, we ran the overall algorithm with different threshold formulas on the VVC standard test sequence ParkRunning3 at QP = 22, 27, 32, and 37. The algorithm's performance under the different thresholds is listed in Table 2. Performance was measured using ∆T and the Bjøntegaard Delta Bit Rate (BDBR). ...
Context 2
... T_base(QP_i) and T_prop(QP_i) denote the encoding time spent by the original algorithm and by the algorithm proposed in this paper at QP_i = 22, 27, 32, and 37, respectively. According to the results shown in Table 2, we finally set (a, b) = (0.8, 0.1), with which the proposed algorithm achieves the best performance. The final threshold formula is: ...
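The time-saving metric ∆T referenced above is conventionally computed as the average relative reduction in encoding time over the four tested QP values. A minimal sketch, assuming that standard definition (the per-QP timings below are made up for illustration, not taken from the paper):

```python
# Average encoding-time saving (Delta-T) over QP = 22, 27, 32, 37.
# T_base and T_prop hold hypothetical per-QP timings in seconds;
# the averaging formula is the conventional definition of Delta-T.

def delta_t(t_base, t_prop):
    """Mean percentage time saving across matched QP runs."""
    assert len(t_base) == len(t_prop)
    savings = [(b - p) / b for b, p in zip(t_base, t_prop)]
    return 100.0 * sum(savings) / len(savings)

t_base = {22: 410.2, 27: 362.8, 32: 318.5, 37: 280.1}  # original encoder
t_prop = {22: 231.9, 27: 205.4, 32: 182.6, 37: 160.3}  # proposed algorithm

qps = sorted(t_base)
print(f"Delta-T = {delta_t([t_base[q] for q in qps], [t_prop[q] for q in qps]):.2f}%")
```

A positive ∆T means the proposed algorithm is faster than the anchor; BDBR is then reported alongside it to show the rate-distortion cost of that speed-up.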

Similar publications

Conference Paper
The paper presents the results of subjective tests evaluating the overall quality of video signals encoded with H.264/AVC (Advanced Video Coding), H.265/HEVC (High-Efficiency Video Coding), and Google's VP09. The assessment was made at home, that is, under the conditions in which young people typically watch movies on their laptops. The...

Citations

... In the literature [33], a state-of-the-art CNN model is used to implement CU partitioning by double-checking the correlated texture and limiting the depth; this algorithm achieves very high performance in video coding, with an average coding-time reduction of 42.34% and a BDBR increase of only 0.71%. In the literature [34], although the time saving (TS) is 47.82%, our model is lighter than theirs and its prediction size is larger. More importantly, our BDBR loss is significantly lower than theirs. ...
Article
Compared with the previous generation of High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC) introduces a quadtree with nested multi-type tree (QTMT) partition structure so that the coding unit (CU) partition can better match the video texture features. This partition structure significantly improves the compression efficiency of VVC, but the computational complexity also increases significantly, resulting in longer encoding time. Therefore, we propose a fast CU partition decision algorithm based on a DenseNet network and a decision tree (DT) classifier to reduce the coding complexity of VVC and save more coding time. We extract spatial feature vectors based on the DenseNet network model. Spatial feature vectors are constructed by predicting the boundary probabilities of 4 × 4 blocks in 64 × 64 coding units. Then, using the spatial features as the input of the DT classifier, the top N division modes with the highest prediction probabilities are selected through the classification function of the DT classifier model, and the other division modes are skipped to reduce the computational complexity. Finally, the optimal partition mode is selected by comparing the RD costs. Our proposed algorithm achieves 47.6% encoding time savings on VTM10.0, while BDBR only increases by 0.91%.
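The top-N pruning this abstract describes can be illustrated with a small sketch: given per-mode probabilities from a classifier, keep only the N most likely split modes for RD-cost checking and skip the rest. The six mode labels follow VVC's QTMT options; the probabilities here are made-up classifier outputs, not values from the paper.

```python
# Keep the top-N split modes by predicted probability; the encoder would
# then evaluate RD cost only for these candidates and skip all others.

def top_n_modes(probs, n):
    """Return the n split modes with the highest predicted probability."""
    return [m for m, _ in sorted(probs.items(), key=lambda kv: -kv[1])[:n]]

probs = {
    "NO_SPLIT": 0.42, "QT": 0.25, "BT_H": 0.15,
    "BT_V": 0.10, "TT_H": 0.05, "TT_V": 0.03,
}
candidates = top_n_modes(probs, 3)
print(candidates)  # ['NO_SPLIT', 'QT', 'BT_H']
```

The trade-off is the usual one: a smaller N skips more RD checks (more time saved) but risks pruning the truly optimal mode (higher BDBR).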
... Yoon et al. [15] proposed an activity-based fast block partitioning decision method using the information of the current CU, minimizing the dependence on the QP and utilizing the gradient calculation used in the adaptive loop filter (ALF). Wang et al. [16] designed a multistage early termination CNN (MET-CNN) model to predict the partition information of 32 × 32-sized CUs. In addition, they proposed the concept of stage grid maps by dividing the entire partition into four stages to represent the structured output and consequently predict all partition information of the 32 × 32-sized CUs and their sub-CUs as the model outputs. ...
Article
Versatile Video Coding (VVC), the state-of-the-art video coding standard, was developed by the Joint Video Experts Team (JVET) of ISO/IEC Moving Picture Experts Group (MPEG) and ITU-T Video Coding Experts Group (VCEG) in 2020. Although VVC can provide powerful coding performance, it requires tremendous computational complexity to determine the optimal mode decision during the encoding process. In particular, VVC adopted the bi-prediction with CU-level weight (BCW) as one of the new tools, which enhanced the coding efficiency of conventional bi-prediction by assigning different weights to the two prediction blocks in the process of inter prediction. In this study, we investigate the statistical characteristics of input features that exhibit a correlation with the BCW and define four useful types of categories to facilitate the inter prediction of VVC. With the investigated input features, a lightweight neural network with multilayer perceptron (MLP) architecture is designed to provide high accuracy and low complexity. We propose a fast BCW mode decision method with a lightweight MLP to reduce the computational complexity of the weighted multiple bi-prediction in the VVC encoder. The experimental results show that the proposed method significantly reduced the BCW encoding complexity by up to 33% with unnoticeable coding loss, compared to the VVC test model (VTM) under the random-access (RA) configuration.
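A lightweight MLP of the kind this abstract describes reduces to a short forward pass: a few input features correlated with BCW, one hidden layer, and a softmax over candidate BCW weights. The layer sizes, random weights, and feature values below are illustrative only; the paper's actual architecture and features are not reproduced here.

```python
import numpy as np

# Toy forward pass of a small MLP classifying among hypothetical BCW
# weight candidates. Weights are random; in practice they would be trained.
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)   # 4 input features -> 8 hidden
W2, b2 = rng.standard_normal((8, 5)), np.zeros(5)   # 8 hidden -> 5 BCW candidates

def predict_bcw(features):
    """Return softmax probabilities over the 5 BCW weight candidates."""
    h = np.maximum(features @ W1 + b1, 0.0)          # ReLU hidden layer
    z = h @ W2 + b2
    e = np.exp(z - z.max())                          # numerically stable softmax
    return e / e.sum()

probs = predict_bcw(np.array([0.3, 1.2, -0.5, 0.8]))  # made-up feature vector
print(probs.argmax(), probs.sum())
```

In a fast mode decision, the encoder would evaluate only the highest-probability BCW candidates (or terminate early when one probability dominates), which is where the reported complexity reduction comes from.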
... The fast algorithms of H.266/VVC intra coding have been explored to solve the problem of the high computational requirement of H.266/VVC. The methods can be roughly categorized into probability-based [10][11][12], learning-based [13][14][15][16][17][18][19][20], probability- and learning-based [21,22], texture-based [23][24][25][26], gradient-based [27,28], and texture- and gradient-based [29][30][31] techniques. The related work on the fast intra coding of H.266/VVC is discussed below. ...
... Wu et al. [16] trained two support vector machine classifiers to predict the split or non-split, and horizontal or vertical split for CUs of different sizes. Wang et al. [17] devised a multi-stage early termination convolutional neural network model that can predict all the partition information of a 32 × 32 CU and its sub-CUs. Zouidi et al. [18] proposed an intra mode decision to skip unlikely intra prediction modes by using a multi-task learning convolutional neural network. ...
Article
The latest international video coding standard, H.266/Versatile Video Coding (VVC), supports high-definition videos, with resolutions from 4K to 8K or even larger. It offers a higher compression ratio than its predecessor, H.265/High Efficiency Video Coding (HEVC). In addition to the quadtree partition structure of H.265/HEVC, the nested multi-type tree (MTT) structure of H.266/VVC provides more diverse splits through binary and ternary trees. It also includes many new coding tools, which tremendously increases the encoding complexity. This paper proposes a fast intra coding algorithm for H.266/VVC based on visual perception analysis. The algorithm applies the factor of average background luminance for just-noticeable-distortion to identify the visually distinguishable (VD) pixels within a coding unit (CU). We propose calculating the variances of the numbers of VD pixels in various MTT splits of a CU. Intra sub-partitions and matrix weighted intra prediction are turned off conditionally based on the variance of the four variances for MTT splits and a thresholding criterion. The fast horizontal/vertical splitting decisions for binary and ternary trees are proposed by utilizing random forest classifiers of machine learning techniques, which use the information of VD pixels and the quantization parameter. Experimental results show that the proposed algorithm achieves around 47.26% encoding time reduction with a Bjøntegaard Delta Bitrate (BDBR) of 1.535% on average under the All Intra configuration. Overall, this algorithm can significantly speed up H.266/VVC intra coding and outperform previous studies.
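The variance-of-variances idea above can be sketched as follows: count VD pixels in the sub-blocks of each MTT split of a CU, take the variance of those counts per split, then the variance of the four per-split variances. The sketch below uses VD-pixel density (count normalized by sub-block area) so that the unequal ternary sub-blocks are comparable; that normalization, the random VD map, and the exact split set are assumptions, not the paper's precise criterion.

```python
import numpy as np

# For each MTT split (BT-H, BT-V, TT-H, TT-V) of a CU, compute the
# variance of the VD-pixel density over its sub-blocks, then take the
# variance of those four variances as the decision statistic.

def split_variances(vd_map):
    """Per-split variance of VD-pixel density for the four MTT splits."""
    h, w = vd_map.shape
    splits = {
        "BT_H": [vd_map[:h//2], vd_map[h//2:]],
        "BT_V": [vd_map[:, :w//2], vd_map[:, w//2:]],
        "TT_H": [vd_map[:h//4], vd_map[h//4:3*h//4], vd_map[3*h//4:]],
        "TT_V": [vd_map[:, :w//4], vd_map[:, w//4:3*w//4], vd_map[:, 3*w//4:]],
    }
    return {name: np.var([b.mean() for b in blocks])
            for name, blocks in splits.items()}

rng = np.random.default_rng(0)
vd = (rng.random((32, 32)) > 0.7).astype(float)   # toy binary VD map
per_split = split_variances(vd)
vov = np.var(list(per_split.values()))            # variance of the four variances
print(per_split, vov)
```

A near-zero variance of variances indicates the VD pixels are spread evenly regardless of split direction, which is the situation where the paper conditionally disables the extra intra tools.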
... Experiments showed that this method is more effective than deep learning. Wang et al. [18] proposed a CNN model with multi-stage early termination that can predict all partition patterns of 32 × 32 CUs. Ni et al. [19] devised a partition strategy for binary and ternary trees by calculating the gradient and applying regression analysis. ...
Article
The Joint Video Exploration Team (JVET) has created the Versatile Video Coding Standard (VVC/H.266), the most up-to-date video coding standard, offering a broad selection of coding tools. The maturity of commercial VVC codecs can significantly reduce costs and improve coding efficiency. However, the latest video coding standard introduces binary and ternary tree partitioning methods, which give coding units (CUs) various shapes and increase the complexity of coding. This article proposes a technique to simplify VVC intra prediction through gradient analysis and a multi-feature fusion CNN. The gradient of a CU is computed with the Sobel operator, and the result is used for pre-decision making. For coding units whose split decision cannot be settled by the gradient alone, the CNN makes a further decision. We calculate the standard deviation (SD) and the initial depth as the input features of the CNN. To implement this method, the initial depth is determined by constructing a segmented depth prediction dictionary; the initial segmentation depth of a coding unit, regardless of its shape, can be determined by consulting this dictionary. The algorithm can decide whether to split CUs of varying sizes, decreasing the complexity of the CU division process and making VVC more practical. Experimental results demonstrate that the proposed algorithm reduces encoding time by 36.56% with a minimal 1.06% increase in Bjøntegaard delta bit rate (BD-BR) compared to the original algorithm.
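The Sobel pre-decision step described above can be sketched as computing the mean gradient magnitude of a CU's luma block: a low value suggests a smooth block whose split decision can be made immediately, while a high or ambiguous value would be deferred (to the CNN, in the paper). The threshold logic and block values here are illustrative, not the paper's.

```python
import numpy as np

# Mean Sobel gradient magnitude of a CU's luma block, as a pre-decision
# feature. A naive sliding-window convolution keeps the sketch dependency-free.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = SOBEL_X.T

def mean_gradient(block):
    """Mean Sobel gradient magnitude over the block interior."""
    h, w = block.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            win = block[i:i+3, j:j+3]
            gx[i, j] = (win * SOBEL_X).sum()
            gy[i, j] = (win * SOBEL_Y).sum()
    return np.hypot(gx, gy).mean()

flat = np.full((16, 16), 128.0)   # uniform luma block
print(mean_gradient(flat))        # 0.0 -> smooth, safe to skip splitting
```

A production encoder would compute this with vectorized filtering rather than explicit loops; the statistic is the same either way.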
Article
This paper presents a genetic approach to optimizing intra coding in H.266/VVC. The proposed algorithm efficiently selects coding tools and Multi-Type Tree (MTT) partitions to balance encoding time and video quality. A fitness evaluation function combining perceptual metrics and coding efficiency metrics is used to assess the quality of each candidate solution, and the results demonstrate a significant reduction in encoding time without compromising video quality. The algorithm selects coding tools from the set available in H.266/VVC, including intra prediction modes, transform units, quantization parameters, and entropy coding modes. The MTT partitioning scheme includes four partition types: quadtree, binary tree, ternary tree, and quad-binary tree. Perceptual metrics evaluate the visual quality of the encoded video, while coding efficiency metrics evaluate its coding efficiency; the fitness function combines both to score each candidate solution.
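A fitness function of the kind this abstract describes might take the following shape: reward a perceptual quality score while penalizing bitrate and encoding time. The metric names, weights, and the linear combination are assumptions for illustration; the paper does not give its formula here.

```python
# Hypothetical GA fitness for a candidate tool/partition configuration:
# higher is better. The weights trade quality against rate and time and
# would be tuned in practice; these values are illustrative.

def fitness(perceptual_score, bitrate_kbps, time_s,
            w_quality=1.0, w_rate=0.01, w_time=0.05):
    """Combine a perceptual metric with coding-efficiency penalties."""
    return w_quality * perceptual_score - w_rate * bitrate_kbps - w_time * time_s

# Compare two made-up candidate configurations from a GA population.
cand_a = fitness(perceptual_score=92.0, bitrate_kbps=1500.0, time_s=120.0)
cand_b = fitness(perceptual_score=90.5, bitrate_kbps=1400.0, time_s=80.0)
print(cand_a, cand_b, "A" if cand_a > cand_b else "B")
```

In a genetic algorithm, this score drives selection: candidates with higher fitness are more likely to survive and recombine, steering the search toward configurations that cut encoding time without visible quality loss.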