Fig. 6. Hierarchical structure in a 16-bit word of MPCBF [50]. Initially, the number of hash functions is k = 3, and level 1 has 8 bits, all initialized to 0.

Source publication
Article
Full-text available
Bloom filter (BF) has been widely used to support membership queries, i.e., to judge whether a given element x is a member of a given set S. Recent years have seen an explosion of BF designs owing to its space efficiency and constant-time membership queries. The existing reviews or surveys mainly focus...

Contexts in source publication

Context 1
... depicted in Fig. 6, MPCBF allocates the bits in each word across multiple levels. The basic principle for constructing the hierarchical structure is as follows. Whenever an element is inserted into the word, k bits must be set from 0 to 1, and whenever a bit is set from 0 to 1, an empty bit is added to the next level and initialized to 0. This is ...
Context 2
... an element is inserted into the word, k bits must be set from 0 to 1, and whenever a bit is set from 0 to 1, an empty bit is added to the next level and initialized to 0. This is realized with a function popcount(i), which computes the number of ones before position i at the hierarchy level to which bit i belongs. For example, in Fig. 6, when element x is inserted into the word, three bits in level 1 are set from 0 to 1. Following the construction principle, three empty bits, i.e., bits 8, 9, and 10, are added to level 2. Thereafter, when another element y is inserted into the word, it is hashed to bits 7, 4, and 2 in level 1. Bit 7 is set from 0 ...
Context 3
... depicted in Fig. 6, MPCBF allocates the bits in each word across multiple levels. The basic principle for constructing the hierarchical structure is as follows. Whenever an element is inserted into the word, k bits must be set from 0 to 1, and whenever a bit is set from 0 to 1, an empty bit is added to the next level and initialized to 0. This is realized with a function popcount(i), which computes the number of ones before position i at the hierarchy level to which bit i belongs. For example, in Fig. 6, when element x is inserted into the word, three bits in level 1 are set from 0 to 1. Following the construction principle, three empty bits, i.e., bits 8, 9, and 10, are added to level 2. Thereafter, when another element y is inserted into the word, it is hashed to bits 7, 4, and 2 in level 1. Bit 7 is set from 0 to 1, so a new bit (bit 11) is added to level 2. For bit 2, popcount(2) = 1, so the bit at position 8 + 1 = 9 is checked and set from 0 to 1, and bit 12 is added to level 3. Doing the same for bit 4 sets bit 10 to 1 and adds bit 13 to level 3. To delete an element, the inverse operations are ...
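To make this walk-through concrete, the following Python sketch models a single MPCBF word as a list of levels. It is our own illustrative reconstruction of the mechanism described above, not the authors' implementation: a real MPCBF packs all levels into one machine word and shifts bits to make room, whereas this sketch uses growable lists, and the hash positions in the demo are assumptions chosen to mirror Fig. 6.

```python
class MPCBFWord:
    """Illustrative model of one hierarchical MPCBF word (cf. Fig. 6).

    levels[0] is level 1 with a fixed number of bits; every 0->1
    transition at level L inserts a fresh 0 bit into level L+1, so the
    structure effectively counts how often each level-1 bit was hit.
    """

    def __init__(self, level1_bits=8):
        self.levels = [[0] * level1_bits]

    def _ones_before(self, level, pos):
        # popcount(i): number of ones before position i within its level
        return sum(self.levels[level][:pos])

    def _set(self, level, pos):
        # Walk down through already-set bits: the child of a set bit i
        # sits at offset popcount(i) in the next level.
        while self.levels[level][pos] == 1:
            level, pos = level + 1, self._ones_before(level, pos)
        self.levels[level][pos] = 1
        if level + 1 == len(self.levels):
            self.levels.append([])            # open a new level on demand
        # every 0->1 flip adds an empty child bit to the next level
        self.levels[level + 1].insert(self._ones_before(level, pos), 0)

    def insert(self, level1_positions):
        # the k positions are the element's hash values in level 1
        for pos in level1_positions:
            self._set(0, pos)


w = MPCBFWord()
w.insert([1, 2, 4])   # element x (hash positions assumed for illustration)
w.insert([7, 4, 2])   # element y, as in the Fig. 6 walk-through
print(w.levels)       # [[0, 1, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0], [0, 0]]
```

Running the demo reproduces the state described above: level 2 ends up as [0, 1, 1, 0] (bits 8 to 11) and level 3 as [0, 0] (bits 12 and 13).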

Similar publications

Article
Full-text available
[Spa] In Third Sector socio-educational associations, logics, policies, and practices have gradually been established to accredit the quality of services. These organizational orientations are similar to procedures carried out in recent years in other educational contexts, which have considerably diminished...
Article
Full-text available
Objectives: Development of optimal recipes of concrete mixtures using local natural raw materials in the form of gravel-sand mixtures from deposits of the Chechen Republic. Method: The research methods adopted in the work are based on the theoretical principles and laws of designing and optimizing polydisperse multicomponent systems, the phase and st...
Chapter
Full-text available
Integrity constraints (ICs) are meant for many data management tasks. However, some types of ICs can express semantic rules that other ICs cannot, or vice versa. Denial constraints (DCs) are known to be a response to this expressiveness issue because they generalize important types of ICs, such as functional dependencies (FDs), conditional FDs, an...
Article
Full-text available
Video materials belong to the most powerful tools in the educational process because they provide learners with auditory and visual information simultaneously. Thus, according to Edgar Dale's cone of learning, they are more effective than classical classroom lectures, reading textbooks, or listening to podcasts. Moreover, video as a part of a learnin...
Conference Paper
Full-text available
The customer-provider collaboration that was diminished during the industrial revolution is being revived to achieve higher customer satisfaction and a competitive edge. Manufacturers are now interested in co-creating value with their customers to design a customized and sustainable solution. Value co-creation is being implemented by various busine...

Citations

... Due to space, energy, and bandwidth constraints, systems often compromise some accuracy for efficiency by periodically advertising approximate indicators. Indicators are data structures that trade accuracy for space efficiency (e.g., Bloom filters [9], [12]-[14]). The compromised accuracy introduces a risk of false indications, which result in unnecessary misses. ...
... Another approach is to accurately advertise important information while allowing less critical data to be stale, or less accurate [19], [20]. The work [14] surveys many optimizations to indicators, such as the support for removals and dynamic scaling. ...
... The positive exclusion probability captures the probability that the requested item is not in the cache, given a positive indication. The main reason for such false-positive indications is the inherent inaccuracy of the indicator, which sacrifices some accuracy for space efficiency [14]. An additional cause of false-positive indications is staleness of the indicator. ...
Preprint
Full-text available
Caching is extensively used in various networking environments to optimize performance by reducing latency, bandwidth, and energy consumption. To optimize performance, caches often advertise their content using indicators, which are data structures that trade accuracy for space efficiency. However, this tradeoff introduces the risk of false indications. Existing solutions for cache content advertisement and cache selection often lead to inefficiencies, failing to adapt to dynamic network conditions. This paper introduces SALSA2, a Scalable Adaptive and Learning-based Selection and Advertisement Algorithm, which addresses these limitations through a dynamic and adaptive approach. SALSA2 accurately estimates mis-indication probabilities by considering inter-cache dependencies and dynamically adjusts the size and frequency of indicator advertisements to minimize transmission overhead while maintaining high accuracy. Our extensive simulation study, conducted using a variety of real-world cache traces, demonstrates that SALSA2 achieves up to 84% bandwidth savings compared to the state-of-the-art solution and close-to-optimal service cost in most scenarios. These results highlight SALSA2's effectiveness in enhancing cache management, making it a robust and versatile solution for modern networking challenges.
... A space-efficient probabilistic data structure that supports membership queries is called a BF [142]. A BF is utilized to detect whether a specific element belongs to a set [143]. Inherent benefits of BF include controlled false positives, constant-time queries, space efficiency, and others. ...
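For readers new to BFs, here is a minimal textbook-style sketch of the structure these citations refer to. The SHA-256-based position derivation and the sizes are our illustrative assumptions, not taken from the cited works.

```python
import hashlib


class BloomFilter:
    """Textbook Bloom filter: k bit positions per element.

    A negative answer is always correct; a positive answer may be a
    false positive with a probability controlled by m and k.
    """

    def __init__(self, m, k):
        assert k <= 8, "the toy hashing below yields at most 8 positions"
        self.m, self.k = m, k
        self.bits = bytearray(m)   # one byte per bit, for clarity

    def _positions(self, item):
        # carve k 4-byte chunks out of one SHA-256 digest (illustrative)
        digest = hashlib.sha256(item.encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=3)
bf.add("alice")
print("alice" in bf)   # True (no false negatives)
print("bob" in bf)     # False with high probability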
Article
Full-text available
Authentication systems are pivotal in fortifying security measures against unauthorized access. Yet, they often fall short of effectively combating impersonation attacks, leaving systems susceptible to exploitation. Continuous Authentication Systems (CAS) have emerged as a promising solution, offering dynamic adaptability to evolving threats. However, the existing literature lacks a thorough critical evaluation of CAS progress, hindering practical advancements in the field. This comprehensive review addresses this gap by analyzing recent advancements, emerging trends, and critical challenges in CAS design and implementation. The review reveals that while supervised learning methods, particularly score-level fusion, dominate CAS classification techniques, there remains a dearth of comparative analysis regarding the efficacy of different biometric pairings (e.g., physiological, behavioral, or multimodal). While studies predominantly assess CAS accuracy using metrics like False Rejection Rate (FRR), False Acceptance Rate (FAR), and Equal Error Rate (EER), aspects crucial to practical success, such as usability, security, and scalability, often receive inadequate attention. Moreover, the practical viability of CAS demands comprehensive implementation and evaluation using real-world data. This survey paper explores various facets of CAS, including physiological and behavioral biometrics, multimodal biometrics, context-aware techniques, and other emerging methodologies. Additionally, open issues, challenges, and proposed future directions aim to inspire further research and development in secure biometric-based continuous authentication and user profiling.
... we encode the mined features as membership relations using Bloom filters (described in Section F and in detail below), hereafter simply called filters. We use Bloom filters for their superior performance for rule validation and the security and privacy properties they provide [23], [24]; additionally, we use them to develop a privacy-preserving encoding and membership evaluation protocol in the federated setting (presented shortly). ...
Article
Full-text available
Privacy Enhancing Technologies (PETs) have the potential to enable collaborative analytics without compromising privacy. This is extremely important because collaborative analytics can allow us to extract real value from the large amounts of data that are collected in domains such as healthcare, finance, and national security, among others. In order to foster innovation and move PETs from the research labs to actual deployment, the U.S. and U.K. governments partnered together in 2021 to propose the PETs prize challenge asking for privacy-enhancing solutions for two of the biggest problems facing us today: financial crime prevention and pandemic response. This article presents the Rutgers ScarletPets privacy-preserving federated learning approach to identify anomalous financial transactions in a payment network system (PNS). This approach utilizes a two-step anomaly detection methodology to solve the problem. In the first step, features are mined based on account-level data and labels, and then a privacy-preserving encoding scheme is used to augment these features to the data held by the PNS. In the second step, the PNS learns a highly accurate classifier from the augmented data. Our proposed approach has two major advantages: 1) there is no noteworthy drop in accuracy between the federated and the centralized setting, and 2) our approach is extremely flexible since the PNS can keep improving its model and features to build a better classifier without imposing any additional computational or privacy burden on the banks. Notably, our solution won the first prize in the US for its privacy, utility, efficiency, and flexibility.
... Approximate membership checking filters are widely used in different applications to speed up membership testing [1]. Typically, these filters, such as Bloom filters, do not suffer from false-negatives, but false-positives can occur with low probability [2]. ...
... Let us consider U to be the universe of IPv4 addresses, formed by 2^32 elements, and an ACF with d = 4, c = 1, s = 2, f_b = 8, b = 2^16. Then, for element x to be a persistent false positive, there must be another element y in one of the four buckets to which it maps such that fp_1 ...
Article
Full-text available
As probabilistic data structures are widely adopted in computing systems, their privacy is a major issue. Recent works have shown that even though the values stored in these structures look random, information can be extracted from them in some settings. In this paper, we consider the privacy of adaptive cuckoo filters, a probabilistic data structure that implements approximate membership checking. The main novelty and benefit of these filters are that they can adapt to removing false-positives. Unfortunately, our analysis shows that adaptation can dramatically reduce the privacy of the filters, allowing an attacker to extract the set of elements stored in the filter. Indeed, in some settings, the attacker can identify 100% of the elements stored in the filter. This means that the protection of the privacy of adaptive cuckoo filters should be considered. To that end, we propose preprocessing reduction (PR), a scheme that prevents an attacker from extracting the set of elements stored in the filter at the cost of increasing the false-positive probability of the filter. In many settings, the impact on false-positives will be negligible. For example, in a case study with 32-bit universes, the increase in the false-positive probability was smaller than 8% in all the configurations tested. Interestingly, PR is applicable not only to adaptive filters but also to approximate membership check filters in general and thus can be used to protect, for example, Bloom filters.
... Another function commonly implemented with probabilistic data structures is checking if an element belongs to a set. In this case, the structures are commonly referred to as filters and return an approximate answer, in the sense that false positives occur with a given probability [5]. The Bloom filter [6] is the most widely known filter, but many other approximate membership check filters have been proposed over the years to improve performance and cost, for example by reducing the number of memory accesses and the memory needed to achieve a given false positive probability [7]. ...
Article
Full-text available
The security of probabilistic data structures is increasingly important due to their wide adoption in many computing systems and applications. In particular, the security of approximate membership check filters such as Bloom or cuckoo filters has been recently studied showing how an attacker can degrade the filter performance in some settings. In this paper, we consider for the first time the security of another popular approximate membership check filter, the Quotient Filter (QF). Our analysis and simulations show that quotient filters are vulnerable to both white and black box attackers that can cause insertion failures and degrade the filter performance very significantly. An interesting finding is that quotient filters are vulnerable to a new type of attack, not applicable to Bloom or cuckoo filters, that can degrade the speed of queries dramatically. The paper also briefly discusses and evaluates potential countermeasures to detect and protect against those attacks.
... They have been investigated to support element deletion, capacity resizing, and reverse decoding. More Bloom filter variants have been proposed to improve the Bloom filter from a performance or generalization perspective in diverse circumstances [2], [3], [4], [10]. ...
Article
Full-text available
Sketches are widely deployed to represent network flows to support complex flow analysis. Typical sketches usually employ hash functions to map elements into a hash table or bit array. Such sketches still suffer from potential weaknesses upon throughput, flexibility, and functionality. To this end, we propose Ark filter, a novel sketch that stores the element information with either of two candidate buckets indexed by the quotient or remainder between the fingerprint and filter length. In this way, no further hash calculations are required for future queries or reallocations. We further extend the Ark filter to enable capacity elasticity and more functionalities (such as frequency estimation and top-k query). Comprehensive experiments demonstrate that, compared with the Cuckoo filter, the Ark filter has 2.08×, 1.34×, and 1.68× the throughput of deletion, insertion, and hybrid query, respectively; compared with the Quotient filter, it has 4.55×, 1.74×, and 22.12× the throughput of deletion, insertion, and hybrid query, respectively; compared with the Bloom filter, it has 2.55× and 2.11× the throughput of insertion and hybrid query, respectively.
... Instead, our third approach is to use a Bloom filter on Set_HA and consider multiple independent uniform hash functions that map from Set_HA to {0, ..., n − 1}. For a given false positive rate ε and a given number of sequences, we choose the optimal number of hash functions such that the size n of the Bloom filter is minimized [34]. We call this method BF_HA. ...
... Moreover, using a Bloom filter instead of Set_HA results in a higher rejection rate. For both Bloom filters, BF_HA and BF_HA(1), we use the common settings for the number of bits under the assumption that ε, the probability of false positives, is given [34]. However, the Bloom filter BF_HA(1), which makes use of one hash function only, shows a substantially higher rejection rate than BF_HA, the Bloom filter with the optimal number of hash functions. ...
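In the standard analysis, the "common settings" referred to here are the closed-form optima: a filter of about -(#elements) · ln ε / (ln 2)² bits with (size / #elements) · ln 2 hash functions. A small sketch of these textbook formulas (the function name and example numbers are our own):

```python
import math


def bloom_settings(num_elements, eps):
    """Minimal bit count and optimal hash count for a target
    false-positive rate eps (standard textbook approximation)."""
    n_bits = math.ceil(-num_elements * math.log(eps) / (math.log(2) ** 2))
    k = max(1, round((n_bits / num_elements) * math.log(2)))
    return n_bits, k


# e.g., one million sequences at eps = 1%: about 9.6 Mbit and k = 7
print(bloom_settings(10**6, 0.01))
```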
Article
Full-text available
Over the past decade, DNA has emerged as a new storage medium with intriguing data volume and durability capabilities. Despite its advantages, DNA storage also has crucial limitations, such as intricate data access interfaces and restricted random accessibility. To overcome these limitations, DNAContainer has been introduced with a novel storage interface for DNA that spans a very large virtual address space on objects and allows random access to DNA at scale. In this paper, we substantially improve the first version of DNAContainer, focusing on the update capabilities of its data structures and optimizing its memory footprint. In addition, we extend the previous set of experiments on DNAContainer with new ones whose results reveal the impact of essential parameters on the performance and memory footprint.
... For example, for two clusters ϕ_1 and ϕ_2 with |ϕ_1| and |ϕ_2| chunks, it takes O(|ϕ_1| × |ϕ_2|) time to determine the number of shared chunks. To further decrease the computation complexity, we adopt the Bloom Filter (BF) [45], [46] to sketch the fingerprints of chunks in each cluster. The basic BF is a hash-based mapping method that has been widely utilized in various networking and distributed systems. ...
... The penalty of such an approach is the false positive, i.e., for any chunk c ∉ C, all of its k_BF hash positions in the bit vector may be set to 1 when representing other chunks in set C. This is caused by unavoidable hash conflicts, as illustrated by the 8th bit in Fig. 4. The false-positive probability, denoted as p, can be derived as p = (1 − (1 − 1/l_BF)^(n·k_BF))^(k_BF) [45], where n represents the number of represented chunks in set C. ...
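For a quick sanity check of this expression, a one-line evaluation (the example parameters are our assumptions, not values from the cited paper):

```python
def false_positive_rate(l_bf, n, k_bf):
    """p = (1 - (1 - 1/l_BF)^(n * k_BF))^k_BF, as given above."""
    return (1.0 - (1.0 - 1.0 / l_bf) ** (n * k_bf)) ** k_bf


# e.g., a 2^20-bit vector holding n = 100,000 chunks with k_BF = 7
print(false_positive_rate(2**20, 100_000, 7))   # ~0.0065
```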
Article
Full-text available
Placing popular data at the network edge helps reduce the retrieval latency, but it also brings challenges to the limited edge storage space. Currently, using available yet not necessarily reliable edge resources is common sense for edge space expansion, while deploying deduplication storage strategies is a general method for better space utilization. However, a contradiction arises when jointly implementing data deduplication with unreliable edge resources. On the one hand, the deduplication policy stipulates that any data chunk can be stored exactly once; on the other hand, the use of unreliable resources imposes that data should be backed up for the sake of file availability. To resolve such contradiction, we propose MEAN, a deduplication-enabled storage system using unreliable resources at the network edge. The core idea of MEAN is to place similar files together for better deduplication and maintain replicas of popular files for higher reliability. We first formulate this problem and prove its NP-hardness, then provide efficient heuristics based on similarity-aware hierarchical clustering. Three different reliability scenarios are comprehensively considered to develop our algorithms. We also implement a prototype system and evaluate the performance of MEAN with a real-world dataset. The results show that MEAN can fortify the file hit ratio under unreliable environments by 77% while reducing the file retrieval delay up to 71%, compared with the state-of-the-art approach.
... For example, for a file with n blocks, it takes O(n × |B_t|) time to determine whether the server contains such blocks or not, where |B_t| is the total number of blocks in a candidate server. In order to decrease the computation complexity, we adopt the Bloom Filter (BF) [33], [34], a hash-based mapping method that has been widely utilized in networking and distributed systems, to represent the blocks on each candidate server. This captures the data characteristics and shifts the similarity detection from pair-wise fingerprint checking to membership queries on the data sketches. ...
... In this way, the deletion of an element will not affect the existence of other elements. It has been proved that 4 bits per counter are enough to achieve a negligible overflow probability [33]. CBF also supports constant-time membership queries. ...
... Then, for block b, all of its k_BF hash positions are projected by other data with a probability p = [1 − (1 − 1/d)^(ε·k_BF)]^(k_BF). This probability is also called the false-positive rate [33]. ...
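A counting Bloom filter (CBF) along the lines of these excerpts can be sketched as follows. The 4-bit saturating counters reflect the analysis cited above, while the SHA-256-based hashing and the class interface are our illustrative assumptions:

```python
import hashlib


class CountingBloomFilter:
    """CBF sketch: 4-bit counters (saturating at 15) replace single bits,
    so elements can be deleted without erasing others."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, item):
        # derive k positions from one SHA-256 digest (illustrative)
        digest = hashlib.sha256(item.encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] = min(self.counters[pos] + 1, 15)  # 4-bit cap

    def remove(self, item):
        # safe only for items that were actually added (and whose
        # counters never saturated)
        for pos in self._positions(item):
            self.counters[pos] = max(self.counters[pos] - 1, 0)

    def __contains__(self, item):
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(m=1024, k=4)
cbf.add("chunk-42")
cbf.remove("chunk-42")
print("chunk-42" in cbf)   # False: deletion did not disturb other entries
```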
Article
Full-text available
The traditional migration methods are confronted with formidable challenges when data deduplication technologies are incorporated. Firstly, the deduplication creates data-sharing dependencies in the stored files; breaking such dependencies in migration may attach extra space overhead. Secondly, the redundancy elimination makes the storage system reserves only one copy for each storage file, and heightens the risk of data unavailability. The existing methods fail to tackle them in one shot. To this end, we propose Jingwei, an efficient and adaptive data migration strategy for deduplicated storage systems. To be specific, Jingwei tries to minimize the extra space cost in migration for space efficiency. Meanwhile, Jingwei realizes the service adaptability by encouraging replicas of hot files to spread out their data access requirements. We first model such a problem as an integer linear programming (ILP) and solve it with a commercial solver when only one empty migration target server is allowed. We then extend this problem to a scenario wherein multiple non-empty target servers are available for migration. We solve it by effective heuristic algorithms based on the Bloom Filter-based data sketches. The Jingwei strategy can suffer from performance degradation when the heat degree varies significantly. Therefore, we further present incremental adjustment strategies for the two scenarios, which adjust the number of block replicas and their locations in an incremental manner. The mathematical analyses and trace-driven experiments show the effectiveness of our Jingwei strategy. To be specific, Jingwei fortifies the file replicas by 25% with only 5.7% of the extra storage space, compared with the latest “Goseed” method. With the small extra space cost, the file retrieval throughput of Jingwei can reach up to 333.5Mbps, which is 12.3% higher than that of the Random method.
... As aforementioned, set queries address two types of information for each element e: 1) existence (membership) information, i.e., whether e is in a set; 2) auxiliary information, i.e., some additional information such as the frequency of e (multiplicity information) or which set e is in (association information). Set queries rely on sketches to record and track the data information above and have been widely used in computer networks, such as in packet routing and forwarding, web caching, network monitoring, security enhancement, content delivery, etc. [25]. Therefore, it is of great significance to improve set query performance with an elegant probabilistic data structure. ...
Article
Full-text available
Set query is a fundamental problem in computer systems. Plenty of applications rely on the query results of membership, association, and multiplicity. A traditional method that addresses such a fundamental problem is derived from the Bloom filter. However, such methods may fail to support element deletion or may require additional filters or a priori knowledge, making them unamenable to a high-performance implementation for dynamic set representation and query. In this paper, we envision a novel sketch framework that is multi-functional, non-parametric, space-efficient, and deletable. As far as we know, none of the existing designs can guarantee such features simultaneously. To this end, we present a general shifting framework to represent auxiliary information (such as multiplicity and association) with the offset. Thereafter, we specify such design philosophy for a hash table horizontally at the slot level, as well as vertically at the bucket level. Theoretical and experimental results jointly demonstrate that our design works exceptionally well with three types of set queries under small memory.