Fig 4 - uploaded by Fan Zhang
Distributing tasks to accelerators.

Source publication
Article
In this article we describe a Field Programmable Gate Array (FPGA)-based coprocessor architecture for Frequent Itemset Mining (FIM). FIM is a common data mining task used to find frequently occurring subsets amongst a database of sets. FIM is a nonnumerical, data intensive computation and is used in machine learning and computational biology. FIM i...

Context in source publication

Context 1
... instance two 256-bit accelerators for Bank A and B and a 128-bit accelerator for Bank C (an inherent limitation of the platform) on each of the four FPGAs. As shown in Figure 4, in order to distribute the workload to the pool of PEs, we dynamically assign each of the topmost branches of the depth-first search to each processing element. In other words, each PE is assigned a branch with a different starting item. ...
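The dynamic assignment described in this excerpt can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the authors' hardware scheduler: the function name `distribute_branches`, the per-branch cost values, and the "earliest idle PE takes the next branch" policy are all hypothetical choices made for illustration.

```python
import heapq

def distribute_branches(branch_costs, num_pes):
    """Simulate dynamic assignment of top-level DFS branches to PEs.

    branch_costs: dict mapping a starting item to a hypothetical amount
                  of work for the branch rooted at that item.
    Returns (assignment, makespan), where assignment maps each PE index
    to the list of starting items it processed.
    """
    # Min-heap of (time_when_free, pe_index); every PE is idle at time 0.
    free_at = [(0, pe) for pe in range(num_pes)]
    heapq.heapify(free_at)
    assignment = {pe: [] for pe in range(num_pes)}

    for item, cost in branch_costs.items():
        # The PE that becomes idle first takes the next pending branch.
        t, pe = heapq.heappop(free_at)
        assignment[pe].append(item)
        heapq.heappush(free_at, (t + cost, pe))

    makespan = max(t for t, _ in free_at)
    return assignment, makespan


if __name__ == "__main__":
    # Assumed, made-up costs: the branch starting at "a" is the largest.
    costs = {"a": 5, "b": 1, "c": 1, "d": 1}
    print(distribute_branches(costs, num_pes=2))
```

With uneven branch sizes, handing out branches on demand lets one PE absorb the large branch while another works through the small ones, which is the motivation for a dynamic rather than a static split.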

Similar publications

Conference Paper
For computer-aided hardware design, models are usually used to evaluate the designed systems. But there is still a gap between models and their efficient implementations on a real architecture, like FPGAs. For example, some model characteristics may lead to a waste of resources, which can even make a design infeasible. In this paper, we focus on ho...

Citations

... Eclat-based algorithms [59,74,75,102,131,132] use the vertical dataset representation to reduce the memory and processing runtime overhead. This representation is adequate to be used in hardware, because it uses the intersection of itemsets to compute their support, and it is more efficient than hash-trees used in other methods. ...
... In Apriori and Eclat, using the vertical representation accelerates the support counting by saving the occurrence information for the counted candidates; however, they introduce a memory overhead. For the Eclat-based approaches, FPGAs [59,74,75,102,131,132] were preferred over GPUs [131], and this is useless in data stream mining due to the Infinity restriction (see Section 2). The reviewed literature shows that Eclat-based approaches have the same computational complexity as the original Eclat, which demonstrates the general trend in the literature of proposing hardware-friendly versions of Eclat instead of proposing new FIM algorithms. ...
... The itemsets intersection was implemented through a hardware-based comparison matrix, which accelerates and parallelizes this operation. In 2013 Zhang et al. proposed a new architecture based on Eclat[132]. Unlike the previous efforts, the architecture proposed by Zhang et al. is not restricted by the logical resources available on the selected FPGA. ...
Article
In data mining, Frequent Itemsets Mining is a technique used in several domains with notable results. However, the large volume of data in modern datasets increases the processing time of Frequent Itemset Mining algorithms, making them unsuitable for many real-world applications. Accordingly, proposing new methods for Frequent Itemset Mining to obtain frequent itemsets in a realistic amount of time is still an open problem. A successful alternative is to employ hardware acceleration using Graphics Processing Units (GPU) and Field Programmable Gates Arrays (FPGA). In this article, a comprehensive review of the state of the art of Frequent Itemsets Mining hardware acceleration is presented. Several approaches (FPGA and GPU based) were contrasted to show their weaknesses and strengths. This survey gathers the most relevant and the latest research efforts for improving the performance of Frequent Itemsets Mining regarding algorithms advances and modern development platforms. Furthermore, this survey organizes the current research on Frequent Itemsets Mining from the hardware perspective considering the source of the data, the development platform, and the baseline algorithm.
... To assess the performance of FIN, they compared it with the PrePost and FP-growth algorithms, both best-in-class algorithms, on an assortment of realistic and artificial nodesets/datasets. Zhang et al. (2013) proposed a Field Programmable Gate Array (FPGA)-based coprocessor architecture for Frequent Itemset Mining (FIM). FIM is a typical data mining process that helps to find frequently occurring subsets in a large number of datasets. ...
Article
Frequent pattern mining deals with finding patterns, subsequences and substructures that occur frequently in a large database. Likewise, frequent pattern mining can be applied to MANET nodes in order to identify the paths that participate in frequent data transactions among the various mobile ad hoc network nodes. The network data stream is a long and continuous sequence of data sets transmitted over the network. The OCA (Online Combinatorial Approximation) algorithm is used for mining online data in the data stream. The processing time of OCA was much lower, and the accuracy of its approximate result was quite high, like other traditional mining methods. The Data Path Combinatorial Approximation (DPCA) algorithm deals with frequent pathset mining over the MANET data flow. A pathset is generated from the set of paths at any node that provide routes to the other nodes participating in the data transmission. The mining algorithm is based on the Approximate Inclusion–Exclusion technique. Approximate counts are calculated for the pathsets without continual path scanning. The skip-and-complete technique and the group count technique were combined and integrated into the DPCA algorithm to improve MANET performance in terms of identifying misbehaving nodes.
... While such approaches exist in the literature, they are designed solely for frequent itemset mining, and cannot easily be adapted to other variants of the problem such as pattern sampling. We thus started from an existing adaptation of the ECLAT algorithm on FPGA [1] and revisited the algorithm and the FPGA architecture of this approach, in order to have a solid algorithm basis on which to build our pattern sampling contributions. Our improvement helps in making the traversal of the enumeration tree of ECLAT more predictable, i.e. we provide a simple function that allows to determine exactly, from any candidate itemset, the next itemset to visit. ...
... This distinction is important for this document, as we provide a new output space sampling implementation and its acceleration on FPGA platforms. 1 Finding outliers in a database, rare patterns with a significantly different distribution. 2 Finding shifts of distribution of time series. ...
... As the ECLAT algorithm is rather standard in the itemset mining community, there exist already several FPGA implementations of it. Our implementation is heavily based on [1], providing some improvements. We then describe how to add the output space sampling feature to the accelerator, and how well its implementation on FPGA compare against the current state of the art. ...
Thesis
The field of frequent pattern mining aims to discover recurring patterns from a given database. Many pattern mining approaches are available in the scientific literature, yet most of them suffer from the same drawback: there can be many output results, which contain highly redundant information. This makes such results hard to analyze. A technique called output space sampling has recently been used alongside frequent pattern mining for this very reason. Output space sampling consists in returning a bounded sample of the results, with statistical guarantees that ensure it is representative of the complete output. In a field where fast adaptation to trends is prevalent, an imperfect real-time analysis can be preferable to an exhaustive offline analysis. To this aim, the thesis focuses its work on dedicated hardware architectures, more energy and time efficient than commonly used servers. The first contribution of the thesis is a frequent pattern mining accelerator for FPGA architectures. The proposed solution allows for greater architectural flexibility, while reducing the cost of on-chip memory, a scarce resource for the architecture. This first contribution proposes algorithmic improvements that regularise the explored search space, making it suited to efficient computing on FPGA. Furthermore, we propose an FPGA accelerator able to manage the heavy load of communication with its external memory. The second contribution extends the first one, restricted to static databases, to streaming databases. This requires reconsidering the theoretical basis of the sampling technique, as the sample must be representative of the most recent snapshot of the stream, but also of the important trends in the recent past of the stream.
... Table 1 shows the speedup achieved by GPU implementations that goes from 4x to 30x compared to optimized software implementations of Apriori and Eclat (Borgelt, 2003). In contrast, the best FPGA hardware architectures reported a speedup of up to 68x (Zhang et al., 2013b) compared to the same software implementations. ...
Article
Algorithms for Frequent Itemsets Mining have proved their effectiveness for extracting frequent sets of patterns in datasets. However, in some specific cases, they do not obtain the expected results in an acceptable time. For this reason, Field Programmable Gates Array-based architectures for Frequent Itemsets Mining have been proposed to accelerate this task. The current paper proposes a search strategy for Frequent Itemsets Mining based on equivalence classes partitioning. The partitioning on equivalence classes allows dividing the search space into disjoint sets that can be processed in parallel. Consequently, this paper presents the design and implementation of two hardware architectures that exploit the nested parallelism in the proposed search strategy. These hardware architectures are capable of obtaining frequent itemsets regardless of the number of distinct items and the number of transactions in the dataset, which are the main issues reported in the reviewed literature. Furthermore, the proposed architectures explore the trade-off between acceleration and hardware resource utilization. The experimental results obtained demonstrate that the proposed search strategy can be scaled to achieve a speedup in the processing time of 40 times faster than software-based implementations.
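The idea of dividing the search space into disjoint equivalence classes can be sketched in software as follows. This is a minimal illustration of the classic prefix-based partition used by Eclat-style miners, not the hardware architectures or the exact search strategy of the paper; the function names `mine_by_classes` and `eclat_class` and the sample transactions are assumptions made for this example.

```python
def eclat_class(prefix, members, min_support, out):
    """Mine one equivalence class depth-first.

    members: list of (item, tidset) pairs; each item extends `prefix`,
             and its tidset is already restricted to the prefix.
    """
    for i, (item, tids) in enumerate(members):
        itemset = prefix + (item,)
        out[itemset] = len(tids)
        # Build the child class: later items that stay frequent
        # when combined with the current itemset.
        children = []
        for other, other_tids in members[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                children.append((other, common))
        if children:
            eclat_class(itemset, children, min_support, out)


def mine_by_classes(transactions, min_support):
    """Return {itemset tuple: support} for all frequent itemsets."""
    # Vertical layout: item -> set of transaction ids containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    frequent = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )

    results = {}
    for k, (item, tids) in enumerate(frequent):
        results[(item,)] = len(tids)
        # Class of `item`: itemsets starting with it. Classes share no
        # itemsets, so each one could be handed to a separate worker.
        members = [(other, tids & other_tids)
                   for other, other_tids in frequent[k + 1:]
                   if len(tids & other_tids) >= min_support]
        eclat_class((item,), members, min_support, results)
    return results


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(mine_by_classes(data, 2).items()):
        print(itemset, count)
```

The loop over top-level classes runs sequentially here for simplicity; because each class only reads its own tidsets, the iterations are independent, which is what makes this partition attractive for parallel processing.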
... Also, the Apriori algorithm meets none of the three constraints for data streams mining algorithms. Eclat-based architectures [28,41] use the vertical dataset representation to save memory and processing time. In [41], authors use the intersection of items to compute the support, showing that this is more efficient than using hash-trees. Previous approaches were focused on discovering frequent itemsets in datasets. The following approaches are oriented to obtaining frequent itemsets in data streams. ...
Article
Frequent Itemsets Mining is a Data Mining technique that has been employed to extract useful knowledge from datasets and, more recently, also from data streams. Data streams are unbounded and infinite flows of data arriving at high rates which cannot be stored for off-line processing; therefore, proposed algorithms for Frequent Itemsets Mining approaches from datasets cannot be used straightforwardly for Frequent Itemsets Mining from data streams. Frequent Itemsets Mining is a compute intensive task, hence developing custom hardware-based architectures to speed up this process is an active research topic. This paper introduces an algorithm for a hardware-based Frequent Itemsets Mining on data streams that uses the top-k frequent 1-itemsets detection as preprocessing. The received transactions are handled using hash functions, and the lexicographic order of items is used for obtaining frequent itemsets. The proposed algorithm is focused on discovering frequent itemsets in data streams composed of short transactions in large alphabets. Experimental results demonstrate that the proposed algorithm outperforms the processing time of the state-of-the-art algorithms used as the baseline.
... Using the vertical layout, the frequency of an itemset is obtained by intersecting the vectors that compose the itemset. In consequence, hardware-based implementations of Eclat (Shaobo et al. 2013;Zhang et al. 2013) also use the vertical dataset representation to save memory space and processing time. In the vertical dataset representation, the items intersection can be implemented in hardware by using logical AND operations. ...
... In the vertical dataset representation, the items intersection can be implemented in hardware by using logical AND operations. In Shaobo et al. (2013) and Zhang et al. (2013), authors propose a software-hardware architecture where the most time and memory consuming functions were downloaded to hardware while software controls the execution flow and data structures. Although the vertical dataset representation allows saving memory and processing time, it is not compatible with the Expiration constraint because all data involved in the vectors intersection must be known before processing. ...
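The support computation by logical AND mentioned in these excerpts can be mimicked in software with integers used as bit vectors. This is a minimal sketch, not the cited architectures: the function names `build_vertical` and `support`, the bit layout (bit i stands for transaction i), and the sample data are assumptions for illustration.

```python
def build_vertical(transactions):
    """Map each item to an int whose bit i is set if transaction i contains it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item] = vertical.get(item, 0) | (1 << tid)
    return vertical


def support(itemset, vertical):
    """Support of a non-empty itemset: popcount of the AND of its bit vectors."""
    items = list(itemset)
    bits = vertical.get(items[0], 0)
    for item in items[1:]:
        bits &= vertical.get(item, 0)  # intersection via logical AND
    return bin(bits).count("1")


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    v = build_vertical(data)
    print(support(["a"], v), support(["a", "b"], v), support(["a", "b", "c"], v))
```

The sketch also shows why the excerpt says all data must be known before processing: every bit vector spans the whole set of transactions, so a vector cannot be intersected until the transactions it covers have been seen.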
... Literature review: Frequent Itemsets Mining is a widely used Data Mining technique (Aggarwal and Han 2014). Recent research efforts have been focused on accelerating the discovering of frequent itemsets over transactional datasets using hardware devices (such as FPGAs) (Baker and Prasanna 2005, 2006; Bustio et al. 2015; Mesa et al. 2010; Shaobo et al. 2013; Zambreno 2008, 2011; Thöni and Strey 2009; Wen et al. 2008; Zhang ...
Article
Frequent Itemsets Mining has been applied in many data processing applications with remarkable results. Recently, data stream processing has been gaining a lot of attention due to its practical applications. Data in data streams are transmitted at high rates and cannot be stored for offline processing, making it impractical to use traditional data mining approaches (such as Frequent Itemsets Mining) straightforwardly on data streams. In this paper, two single-pass parallel algorithms based on a tree data structure for Frequent Itemsets Mining on data streams are proposed. The presented algorithms employ Landmark and Sliding Window Models for window handling. In the presented paper, as in other reviewed papers, if the number of frequent items in data streams is low, then the proposed algorithms perform an exact mining process. Conversely, if the number of frequent patterns is large, the mining process is approximate with no false positives produced. The experiments conducted demonstrate that the presented algorithms outperform the processing time of the hardware architectures reported in the state of the art.
... Recently, a number of hardware-accelerated solutions for FIM algorithms have been developed [5], [6]. In [5], a state-of-the-art, to our knowledge, FPGA implementation was proposed to accelerate the Eclat algorithm on a four-FPGA board with a binary representation of itemsets. In [6], a GPU-accelerated algorithm, called Frontier Expansion (FE), was proposed. ...
... We compare the performance of the FPGA implementation over a range of minimum support values against the implementations in [6] on both multi-core CPU and GPU. We also present preliminary results using the Kintex UltraScale XCKU115 FPGA, which has larger resources and higher memory bandwidth, and compare our implementation with the HDL FPGA implementation of the Eclat algorithm in [5]. ...
... A similar acceleration was also performed on FPGA [Zhang et al. 2013b]. The main difference with GPU acceleration stems from the lack of memory on the FPGA. ...
... They are used in [Wang et al. 2015], enabling to compare against an implementation of the Apriori algorithm using Micron Automata Processor. T40I10D03N500K, T40I10D03N1000K, T60I20D05N500K and T90I20D05N500K are used in [Zhang et al. 2013b]. This enables to compare against an FPGA-accelerated implementation of the Eclat algorithm. ...
... Fortunately, these so called long-tailed dataset are very frequent, in particular in Web data and retail data [Anderson 2006]. Table IV shows execution times of our VC709 platform, compared to an FPGA-accelerated implementation of the Eclat algorithm [Zhang et al. 2013b]. The configuration with 9360 units is used for all datasets except T40I10D03N1000K where the 11636-units configuration is used due to the low number of frequent items. ...
Article
Stream processing has become extremely popular for analyzing huge volumes of data for a variety of applications, including IoT, social networks, retail, and software logs analysis. Streams of data are produced continuously and are mined to extract patterns characterizing the data. A class of data mining algorithm, called generate-and-test, produces a set of candidate patterns that are then evaluated over data. The main challenges of these algorithms are to achieve high throughput, low latency, and reduced power consumption. In this article, we present a novel power-efficient, fast, and versatile hardware architecture whose objective is to monitor a set of target patterns to maintain their frequency over a stream of data. This accelerator can be used to accelerate data-mining algorithms, including itemsets and sequences mining. The massive fine-grain reconfiguration capability of field-programmable gate array (FPGA) technologies is ideal to implement the high number of pattern-detection units needed for these intensive data-mining applications. We have thus designed and implemented an IP that features high-density FPGA occupation and high working frequency. We provide detailed description of the IP internal micro-architecture and its actual implementation and optimization for the targeted FPGA resources. We validate our architecture by developing a co-designed implementation of the Apriori Frequent Itemset Mining (FIM) algorithm, and perform numerous experiments against existing hardware and software solutions. We demonstrate that FIM hardware acceleration is particularly efficient for large and low-density datasets (i.e., long-tailed datasets). Our IP reaches a data throughput of 250 million items/s and monitors up to 11.6k patterns simultaneously, on a prototyping board that overall consumes 24W in the worst case. Furthermore, our hardware accelerator remains generic and can be integrated to other generate and test algorithms.
... Eclat-based algorithms [39,28] use the vertical dataset representation in order to save memory and processing time. In [39,28], authors use the intersection of items to compute the support. They show that it is more efficient than hashtrees. ...
Conference Paper
Data streams are unbounded and infinite flows of data arriving at high rates which cannot be stored for offline processing. Because of this, classical approaches for Data Mining cannot be used straightforwardly in data stream scenario. This paper introduces a single-pass hardware-based algorithm for frequent itemsets mining on data streams that uses the top-k frequent 1-itemsets. Experimental results of the hardware implementation of the proposed algorithm are also presented and discussed.
... There are several hardware implementations of Frequent Itemset Mining algorithms based on GPUs and FPGAs, used as coprocessors that take charge of specific tasks, for example support counting, as well as more complicated ones that perform a full implementation of Frequent Itemset Mining algorithms [3,4,5]. Most of the work reported in the literature has been designed for a fixed problem size; in other words, it has a limit on the number of frequent itemsets that can be processed by the architecture, imposed by the resources of the employed device. ...
Poster
Frequent Itemset Mining algorithms have proved their effectiveness in extracting all the frequent itemsets in datasets; however, in some cases, they do not produce the expected results in an acceptable time according to the application requirements. For this reason, FPGA-based hardware architectures for Frequent Itemset Mining have been proposed in the literature to accelerate this task. Most of them are limited by the number of distinct items that can be processed and the available resources in the employed FPGA device. This study proposes a compact hardware architecture for itemset mining capable of mining all the frequent itemsets regardless of the number of distinct items and the dataset. The design implements a partition strategy based on equivalence classes. The partition on equivalence classes divides the search space into disjoint sets that can be processed in parallel. Accordingly, a parallel architecture is proposed to exploit the benefits of the proposed search strategy.