Figure 4 - uploaded by Rainer Gemulla
Memory representation of Elias gamma and k-gamma encoding for four example codewords. The codewords consist of the effective bits of only the non-zero values. If all values are 0, only the separator bit has to be stored and the zero-bit sequence of the shared prefix has length zero. Example codewords for k-gamma as well as k-gamma0 encoding can be found in the appendix. Figure 4 illustrates the memory layout of an example with 4 codewords for Elias gamma, 1-gamma, 2-gamma, and 4-gamma encoding. Elias gamma encoding, shown at the top of Figure 4, stores the prefix and value of each codeword together. k-gamma encoding stores the shared prefix and the values separately. While each value has its own prefix in 1-gamma encoding, two or four values share a prefix when 2-gamma or 4-gamma encoding, respectively, is used. Furthermore, each of the k values of a block starts at the same relative bit address in its memory word (e.g., v3 and v4 in words 0x00 and 0x04). In our actual implementation of k-gamma encoding, we split the memory into smaller chunks: the codewords grow forward from the start of each chunk, and the shared prefixes grow backward from the end of the chunk.
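The grouping described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name `k_gamma_encode`, the explicit separator bit, and the bit-string output format are our own simplifications for clarity.

```python
def k_gamma_encode(values, k):
    """Sketch of k-gamma encoding: each group of k values shares one
    gamma-style prefix sized for the widest value in the group, and
    every value in the group is stored with that shared bit width."""
    assert len(values) % k == 0 and all(v >= 1 for v in values)
    groups = []  # one (shared_prefix, [value_bits, ...]) tuple per group
    for i in range(0, len(values), k):
        group = values[i:i + k]
        w = max(v.bit_length() for v in group)   # shared effective width
        prefix = "0" * (w - 1) + "1"             # unary width + separator bit
        vals = [format(v, "0{}b".format(w)) for v in group]
        groups.append((prefix, vals))
    return groups
```

For example, `k_gamma_encode([5, 2, 9, 1], 2)` groups 5 and 2 under a shared width of 3 bits and 9 and 1 under a shared width of 4 bits; because all k values of a group have the same width, each starts at a fixed relative bit address, which is what makes the layout SIMD-friendly.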

Source publication
Conference Paper
Full-text available
We study algorithms for efficient compression and decompression of a sequence of integers on modern hardware. Our focus is on universal codes in which the codeword length is a monotonically non-decreasing function of the uncompressed integer value; such codes are widely used for compressing "small integers". In contrast to traditional integer compr...

Contexts in source publication

Context 1
... Figure 4 illustrates the memory layout of an example with 4 codewords for Elias gamma, 1-gamma, 2-gamma, and 4-gamma encoding. Elias gamma encoding, shown at the top of Figure 4, stores the prefix and value of each codeword together. ...
Context 2
... Figure 4 illustrates the memory layout of an example with 4 codewords for Elias gamma, 1-gamma, 2-gamma, and 4-gamma encoding. Elias gamma encoding, shown at the top of Figure 4, stores the prefix and value of each codeword together. k-gamma encoding stores the shared prefix and the values separately. ...

Similar publications

Conference Paper
Full-text available
Memory is one of the most significant detrimental factors in increasing the cost and area of embedded systems, especially as semiconductor technology scales down. Code compression techniques have been employed to reduce the memory requirement of the system without sacrificing its functionality. Bitmask-based code compression has been demonstrat...
Article
Full-text available
In this paper, we present a novel method for fast lossy or lossless compression and decompression of regular height fields. The method is suitable for SIMD parallel implementation and thus inherently suitable for modern GPU architectures. Lossy compression is achieved by approximating the height field with a set of quadratic Bezier surfaces. In add...

Citations

... Many prior compression algorithms leverage repetitions in a data sequence. Null suppression [29] omits the leading zeros in the bit representation of an integer and records the byte length of each value, such as in 4-Wise NS [95], Masked-VByte [91], Google varint [1] and varint-G8IU [98]. Dictionary [32,35,37,82,84,93,111] and entropy-based compression algorithms [63,104] build a bijective map between the original values and the code words. ...
Preprint
Lightweight data compression is a key technique that allows column stores to exhibit superior performance for analytical queries. Despite a comprehensive study on dictionary-based encodings to approach Shannon's entropy, few prior works have systematically exploited the serial correlation in a column for compression. In this paper, we propose LeCo (i.e., Learned Compression), a framework that uses machine learning to remove the serial redundancy in a value sequence automatically to achieve an outstanding compression ratio and decompression performance simultaneously. LeCo presents a general approach to this end, making existing (ad-hoc) algorithms such as Frame-of-Reference (FOR), Delta Encoding, and Run-Length Encoding (RLE) special cases under our framework. Our microbenchmark with three synthetic and six real-world data sets shows that a prototype of LeCo achieves a Pareto improvement on both compression ratio and random access speed over the existing solutions. When integrating LeCo into widely-used applications, we observe up to 3.9x speedup in filter-scanning a Parquet file and a 16% increase in RocksDB's throughput.
... From a lightweight integer compression perspective, the state-of-the-art utilization of these SIMD extensions works as follows [8]: While a scalar compression algorithm compresses a block of N consecutive integers, the state-of-the-art SIMD approach scales this block size to k · N with k as the number of integers that can be simultaneously processed with an SIMD register. As shown in various papers [2,6,7,8,10], this scaling approach increases the performance of compression as well as decompression routines. However, this scaling approach can lead to a degradation of the compression ratio compared to the scalar variant. ...
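The compression-ratio degradation mentioned in this excerpt can be illustrated with a small back-of-the-envelope sketch. The helper `packed_bits` and the toy data are ours, not from the cited papers; it models the common case where each block is stored with the bit width of its largest member:

```python
def packed_bits(values, block_size):
    """Total bits needed when each block of `block_size` integers is
    stored with the bit width of its largest member (headers ignored
    to keep the sketch simple)."""
    total = 0
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        w = max(v.bit_length() for v in block)  # block-wide bit width
        total += w * len(block)
    return total

data = [3, 2, 1, 2, 500, 2, 1, 3]   # one outlier in the second half
scalar = packed_bits(data, 4)       # N = 4: 2*4 + 9*4 = 44 bits
simd = packed_bits(data, 8)         # k*N = 8 (e.g., k = 2): 9*8 = 72 bits
```

Scaling the block from N = 4 to k · N = 8 lets the single 9-bit outlier widen all eight values instead of only four, which is exactly the degradation the scalar-granularity approaches avoid.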
... We call this number the bit width of a value. Over the past decades, a large corpus of different algorithms has evolved [2,6,7,8,10]. Generally, lightweight integer compression algorithms employ a subset of the following five fundamental techniques: frame-of-reference (FOR) [13,14], delta coding (DELTA) [7,15], dictionary compression (DICT) [2,14], run-length encoding (RLE) [2,15,16], and null suppression (NS) [2,7,15]. ...
... register [8]. On the one hand, this scaling SIMD approach increases the performance of the compression routines, mainly due to the fact that only a contiguous, also called linear, data access pattern is required for the implementation [2,6,7,8,10]. On the other hand, the scaling SIMD approach also affects the compression result, especially the size. ...
Article
Full-text available
Integer compression plays an important role in columnar database systems to reduce the main memory footprint as well as to speed up query processing. To keep the additional computational effort of (de)compression as low as possible, the powerful Single Instruction Multiple Data (SIMD) extensions of modern CPUs are heavily applied. While a scalar compression algorithm usually compresses a block of N consecutive integers, the state-of-the-art SIMDified implementation scales the block size to k · N with k as the number of elements which could be simultaneously processed in an SIMD register. On the one hand, this scaling SIMD approach improves the performance of (de)compression. But on the other hand, it can lead to a degradation of the memory footprint of the compressed data. Within this article, we analyze this degradation effect for various integer compression algorithms and present a novel SIMD concept to overcome that effect. The core idea of our novel SIMD concept, called BOUNCE, is to concurrently compress k different blocks of size N within SIMD registers, guaranteeing the same compression ratio as the scalar variant. As we are going to show, our proposed SIMD idea works well on various Intel CPUs and may offer a new generalized SIMD concept to optimize further algorithms.
... These small integers can then be efficiently compressed with any of a range of integer compression techniques (Fig 2), a subject that has been heavily developed for Information Retrieval. We compare a number of such methods, including the Simple-8b algorithm from [23] (which we implement and use in our package) in S1 Appendix. ...
Article
Full-text available
Large-scale genotype-phenotype screens provide a wealth of data for identifying molecular alterations associated with a phenotype. Epistatic effects play an important role in such association studies. For example, siRNA perturbation screens can be used to identify combinatorial gene-silencing effects. In bacteria, epistasis has practical consequences in determining antimicrobial resistance as the genetic background of a strain plays an important role in determining resistance. Recently developed tools scale to human exome-wide screens for pairwise interactions, but none to date have included the possibility of three-way interactions. Expanding upon recent state-of-the-art methods, we make a number of improvements to the performance on large-scale data, making consideration of three-way interactions possible. We demonstrate our proposed method, Pint, on both simulated and real data sets, including antibiotic resistance testing and siRNA perturbation screens. Pint outperforms known methods in simulated data, and identifies a number of biologically plausible gene effects in both the antibiotic and siRNA models. For example, we have identified a combination of known tumour suppressor genes that is predicted (using Pint) to cause a significant increase in cell proliferation.
... A second obvious use case in the context of column-stores is integer compression [1,2,10]. To keep the additional computational effort of (de)compression as low as possible, most of the integer (de)compression algorithms are explicitly SIMDified with a linear access pattern [9,19,33,37]. To ensure a linear access pattern, the state-of-the-art SIMD approach can be characterized by [37]: While a scalar compression algorithm would compress a block of N consecutive integers, the state-of-the-art SIMD approach scales this block to k · N consecutive integers with k as the number of integers that can be simultaneously processed with an SIMD register. ...
Article
Full-text available
The Single Instruction Multiple Data (SIMD) paradigm became a core principle for optimizing query processing in columnar database systems. Until now, only the instructions are considered to be efficient enough to achieve the expected speedups, while avoiding is considered almost imperative. However, the instruction offers a very flexible way to populate SIMD registers with data elements coming from non-consecutive memory locations. As we will discuss within this article, the instruction can achieve the same performance as the instruction, if applied properly. To enable the proper usage, we outline a novel access pattern allowing fine-grained, partition-based SIMD implementations. Then, we apply this partition-based SIMD processing to two representative examples from columnar database systems to experimentally demonstrate the applicability and efficiency of our new access pattern.
... From a lightweight integer compression perspective, the state-of-the-art utilization of these SIMD extensions works as follows [8]: While a scalar compression algorithm compresses a block of N consecutive integers, the state-of-the-art SIMD approach scales this block size to k · N with k as the number of integers that can be simultaneously processed with an SIMD register. As shown in various papers [2,6,7,8,10], this scaling approach increases the performance of compression as well as decompression routines. However, this scaling approach can lead to a degradation of the compression ratio compared to the scalar variant. ...
... We call this number the bit width of a value. Over the past decades, a large corpus of different algorithms has evolved [2,6,7,8,10]. Generally, lightweight integer compression algorithms employ a subset of the following five fundamental techniques: frame-of-reference (FOR) [12,13], delta coding (DELTA) [7,14], dictionary compression (DICT) [2,13], run-length encoding (RLE) [2,14,15], and null suppression (NS) [2,7,14]. ...
... Based on that scalar processing foundation, the state-of-the-art SIMD approach scales this block size to k · N with k as the number of integers that can be simultaneously processed with an SIMD register [8]. On the one hand, this scaling SIMD approach increases the performance of the compression routines, mainly due to the fact that only a contiguous, also called linear, data access pattern is required for the implementation [2,6,7,8,10]. On the other hand, the scaling SIMD approach also affects the compression result, especially the size. (Fig. 9: Illustration of the partition-based SIMD processing concept using an SIMD register size of k = 4.) ...
Preprint
Full-text available
Integer compression plays an important role in columnar database systems to reduce the main memory footprint as well as to speed up query processing. To keep the additional computational effort of (de)compression as low as possible, the powerful Single Instruction Multiple Data (SIMD) extensions of modern CPUs are heavily applied. While a scalar compression algorithm usually compresses a block of N consecutive integers, the state-of-the-art SIMDified implementation scales the block size to k · N with k as the number of elements which could be simultaneously processed in an SIMD register. On the one hand, this scaling SIMD approach improves the performance of (de)compression but, on the other hand, can lead to a degradation of the compression ratio compared to the scalar variant. Within this article, we analyze this degradation effect for various integer compression algorithms and present a novel SIMD concept to overcome that effect. The core idea of our novel SIMD concept, called BOUNCE, is to concurrently compress k different blocks of size N within SIMD registers, guaranteeing the same compression ratio as the scalar variant. As we are going to show, our proposed SIMD idea works well on various Intel CPUs and may offer a new generalized SIMD concept to optimize further algorithms.
... These small integers can then be efficiently compressed with any of a range of integer compression techniques (Figure 1), a subject that has been heavily developed for Information Retrieval. We compare a number of such methods, including the Simple-8b algorithm from [51] (which we implement and use in our package) in Appendix A. ...
... Simple-8b. Simple-8b is a non-SIMD compression scheme, with performance comparable to other state-of-the-art methods [51,37,56]. While SIMD-based compression schemes can often offer significantly improved compression and decompression speed [33,51], their implementation is architecture dependent. ...
Preprint
Full-text available
Large-scale genotype-phenotype screens provide a wealth of data for identifying molecular alterations associated with a phenotype. Epistatic effects play an important role in such association studies. For example, siRNA perturbation screens can be used to identify combinatorial gene-silencing effects. In bacteria, epistasis has practical consequences in determining antimicrobial resistance as the genetic background of a strain plays an important role in determining resistance. Recently developed tools scale to human exome-wide screens for pairwise interactions, but none to date have included the possibility of three-way interactions. Expanding upon recent state-of-the-art methods, we make a number of improvements to the performance on large-scale data, making consideration of three-way interactions possible. We demonstrate our proposed method, Pint, on both simulated and real data sets, including antibiotic resistance testing and siRNA perturbation screens. Pint outperforms known methods in simulated data, and identifies a number of biologically plausible gene effects in both the antibiotic and siRNA models. For example, we have identified a combination of known tumor suppressor genes that is predicted (using Pint) to cause a significant increase in cell proliferation.
... For the hardware-conscious implementation, we have to distinguish three different main groups of primitives across all TVL classes [12]: (i) load/store primitives, (ii) elementwise primitives, and (iii) horizontal primitives. The hardwarespecific implementation for them on the VE can be done by using a limited number of intrinsics supporting only 32-bit and 64-bit data widths. ...
... When implementing vectorized column-store operators using primitives or intrinsics, bitmasks (or masks in short) often have to be used and combined, e.g., by shifting them or using boolean logic [3]- [5]. A simple example is a variant of the vectorized intersection of two index lists [12]: ...
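The excerpt above is cut off before the example itself. A scalar sketch of the kind of mask combination it refers to (our own hypothetical function, not the code from [12]) might look like the following: each rotation of one register's contents against the other yields a per-lane comparison mask, and the per-round masks are combined with boolean logic.

```python
def intersect_masks(a_block, b_block):
    """Sketch of bitmask combination in a vectorized list intersection:
    compare a_block against every rotation of b_block, producing one
    bitmask per round, and OR the round masks together, as a SIMD
    kernel would combine register comparison masks."""
    k = len(b_block)
    mask = 0
    for r in range(k):
        rotated = b_block[r:] + b_block[:r]   # rotate b by r lanes
        round_mask = 0
        for lane, (x, y) in enumerate(zip(a_block, rotated)):
            if x == y:                        # element-wise compare
                round_mask |= 1 << lane       # one bit per lane
        mask |= round_mask                    # boolean-logic combination
    return mask  # bit i set -> a_block[i] occurs somewhere in b_block
```

For instance, intersecting the lanes [1, 3, 5, 7] against [2, 3, 7, 9] sets the mask bits for lanes 1 and 3, since 3 and 7 occur in both blocks.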
... These small integers can then be efficiently compressed with any of a range of integer compression techniques (Fig. 3), a subject that has been heavily developed for Information Retrieval. We compare a number of such methods, including the Simple-8b algorithm from [35] (which we implement and use in our package). ...
... Simple-8b. Simple-8b is a non-SIMD compression scheme, with performance comparable to other state-of-the-art methods [35,25,38]. While SIMD-based compression schemes can often offer significantly improved compression and decompression speed [22,35], their implementation is architecture dependent. ...
Preprint
Full-text available
Large-scale genotype-phenotype screens provide a wealth of data for identifying molecular alterations associated with a phenotype. Epistatic effects play an important role in such association studies. For example, siRNA perturbation screens can be used to identify pairwise gene-silencing effects. In bacteria, epistasis has practical consequences in determining antimicrobial resistance as the genetic background of a strain plays an important role in determining resistance. Existing computational tools which account for epistasis do not scale to human exome-wide screens and struggle with genetically diverse bacterial species such as Pseudomonas aeruginosa. Combining earlier work in interaction detection with recent advances in integer compression, we present a method for epistatic interaction detection on sparse (human) exome-scale data, and an R implementation in the package Pint. Our method takes advantage of sparsity in the input data and recent progress in integer compression to perform lasso-penalised linear regression on all pairwise combinations of the input, estimating up to 200 million potential effects, including epistatic interactions. Hence the human exome is within the reach of our method, assuming one parameter per gene and one parameter per epistatic effect for every pair of genes. We demonstrate Pint on both simulated and real data sets, including antibiotic resistance testing and siRNA perturbation screens.
... In order to decode gamma codes faster on modern processors, a simple variant of gamma is proposed by Schlegel et al. [91] and called k-gamma. Groups of k integers are encoded together, with k = 2, 3 or 4, using the same number of bits. ...
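For reference, plain Elias gamma determines the codeword of each value individually, which is what the k-gamma variant relaxes. A minimal sketch following the standard definition (the helper name is ours):

```python
def elias_gamma(v):
    """Elias gamma codeword of v >= 1: (w - 1) zero bits followed by
    the w-bit binary representation of v, where w is the bit width
    of v. The codeword length is therefore 2*w - 1 bits."""
    assert v >= 1
    w = v.bit_length()
    return "0" * (w - 1) + format(v, "b")
```

So `elias_gamma(1)` is "1" and `elias_gamma(9)` is "0001001". In k-gamma, a group of k values instead shares a single prefix sized for the widest member, so all k values are stored with the same number of bits and can be unpacked together.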
Article
Full-text available
The data structure at the core of large-scale search engines is the inverted index, which is essentially a collection of sorted integer sequences called inverted lists. Because of the many documents indexed by such engines and stringent performance requirements imposed by the heavy load of queries, the inverted index stores billions of integers that must be searched efficiently. In this scenario, index compression is essential because it leads to a better exploitation of the computer memory hierarchy for faster query processing and, at the same time, allows reducing the number of storage machines. The aim of this article is twofold: first, surveying the encoding algorithms suitable for inverted index compression and, second, characterizing the performance of the inverted index through experimentation.
... Parallel implementation. The encoders in this study are parallelizable [28,38]. Given n workers, we can partition the gradient into n subvectors for RLH and allow each worker to find the frequencies of its corresponding subvector. ...
Conference Paper
Full-text available
Distributed stochastic algorithms, equipped with gradient compression techniques, such as codebook quantization, are becoming increasingly popular and considered state-of-the-art in training large deep neural network (DNN) models. However, communicating the quantized gradients in a network requires efficient encoding techniques. For this, practitioners generally use Elias encoding-based techniques without considering their computational overhead or data-volume. In this paper, based on Huffman coding, we propose several lossless encoding techniques that exploit different characteristics of the quantized gradients during distributed DNN training. Then, we show their effectiveness on 5 different DNN models across three different data-sets, and compare them with classic state-of-the-art Elias-based encoding techniques. Our results show that the proposed Huffman-based encoders (i.e., RLH, SH, and SHS) can reduce the encoded data-volume by up to 5.1×, 4.32×, and 3.8×, respectively, compared to the Elias-based encoders.