Table 1 - uploaded by Polepalli Krishna Reddy
Transaction dataset.

Source publication
Conference Paper
In this paper we have proposed an improved approach to extract rare association rules. The association rules which involve rare items are called rare association rules. Mining rare association rules is difficult with single minimum support (minsup) based approaches like Apriori and FP-growth as they suffer from "rare item problem" dilemma. At high...

Contexts in source publication

Context 1
... the dataset shown in Table 1, the extraction of frequent patterns using CFP-growth algorithm is illustrated using Example 1. For ease of explaining this example we refer the support and MIS values of the items in terms of support counts and MIS counts. ...
Context 2
... 1: For the transaction dataset shown in Table 1, the itemset I = {bread, ball, jam, bat, pillow, bed, pencil, pen}. Let the MIS values (in count) for bread, ball, jam, bat, pillow, bed, pencil and pen be 4, 4, 3, 3, 2, 2, 2, 2. Now, using the MIS values for the items, the CFP-growth approach sorts the items in descending order of their MIS values and assigns the frequency value of zero to every item. ...
Context 3
... L 1 contain {{bread:0}, {ball:0}, {jam:0}, {bat:0}, {pillow:0}, {bed:0}, {pencil:0}, {pen:0}}. In the first scan of the dataset shown in Table 1, the first transaction "1: bread, jam" containing two items is scanned in L 1 order i.e., {bread, jam} and the frequencies of items "bread" and "jam" are updated by 1 in L 1 . Next, a first branch of tree is constructed with two nodes, bread: 1 and jam: 1, where "bread" is linked as a child of the root and "jam" is linked as a child of "bread". ...
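The steps described in Contexts 2 and 3 can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the names `Node`, `make_item_list` and `insert_transaction` are hypothetical, and only the first transaction ("1: bread, jam") is inserted, because the remaining rows of Table 1 are not reproduced on this page.

```python
# MIS counts exactly as listed in the example text.
MIS = {"bread": 4, "ball": 4, "jam": 3, "bat": 3,
       "pillow": 2, "bed": 2, "pencil": 2, "pen": 2}


class Node:
    """One node of a prefix tree (hypothetical helper class)."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def make_item_list(mis):
    """Build L1: items in descending MIS order, each with frequency 0."""
    # sorted() is stable, so items with equal MIS keep their given order.
    order = sorted(mis, key=lambda item: -mis[item])
    return {item: 0 for item in order}


def insert_transaction(root, item_list, transaction):
    """Scan one transaction in L1 order, update frequencies, extend the tree."""
    rank = {item: pos for pos, item in enumerate(item_list)}
    node = root
    for item in sorted(transaction, key=rank.__getitem__):
        item_list[item] += 1
        child = node.children.setdefault(item, Node(item))
        child.count += 1
        node = child


item_list = make_item_list(MIS)
root = Node()
insert_transaction(root, item_list, ["jam", "bread"])  # transaction 1

bread = root.children["bread"]
print(bread.item, bread.count)                                    # bread 1
print(bread.children["jam"].item, bread.children["jam"].count)    # jam 1
```

After this single insertion the tree has one branch, root -> bread:1 -> jam:1, and the frequencies of "bread" and "jam" in the item list are both 1, matching the description in Context 3.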

Similar publications

Conference Paper
Almost all the approaches in association rule mining suggested the use of a single minimum support, technique that either rules out all infrequent itemsets or suffers from the bottleneck of generating and examining too many candidate large itemsets. In this paper we consider the combination of two well-known algorithms, namely algorithm DIC and MSA...
Article
Mining association rules is one of the most important and popular tasks in data mining. Current research focuses on discovering frequent itemsets, which is an important step in it. Many algorithms for discovering frequent itemsets have been proposed. However, for a large database, an efficient mining algorithm must be a better balance in I/O cost and...

Citations

... CFP-Growth [16] is a shared-memory parallel algorithm for computing frequent patterns, based on FP-Growth. It is founded on FP-tree partitioning techniques and effective load balancing. ...
Conference Paper
Nowadays, the explosive growth in data collection in business and scientific areas has required the need to analyze and mine useful knowledge residing in these data. The recourse to data mining techniques seems to be inescapable in order to extract useful and novel patterns/models from large datasets. In this context, frequent itemsets (patterns) play an essential role in many data mining tasks that try to find interesting patterns from datasets. However, conventional approaches for mining frequent itemsets in Big Data era encounter significant challenges when computing power and memory space are limited. This paper proposes an efficient distributed frequent itemset mining algorithm, called ParallelCharMax, that is based on a powerful sequential algorithm, called Charm, and computes the maximal frequent itemsets that are considered perfect summaries of the frequent ones. The proposed algorithm has been implemented using MapReduce framework. The experimental component of the study shows the efficiency and the performance of the proposed algorithm compared with well known algorithms such as MineWithRounds and HMBA.
... Therefore, we have designed a kind of multilevel dynamic index structure to store the above information [12], extracted semantic strings after frequent patterns discovery and evaluation. Frequent pattern discovery is the improvement of pattern-growth algorithm [13], mainly are the following steps: ...
Article
This paper studies Uyghur single-text summarization and proposes several new or improved approaches to keyword extraction and evaluation, sentence selection and redundancy removal, and readability improvement. It proposes an improved frequent pattern-growth approach to extract semantic strings that are complete in both semantics and structure, evaluates these strings with a multi-feature fusion approach, and selects the most important ones as keywords to describe the text theme effectively. For sentence similarity and redundancy removal, it proposes the idea of "theme including degree", which effectively removes redundant sentences and significantly improves summary quality. It also introduces sentence alignment between the stemmed text and the original text, to counter the decline in summary naturalness, coherence and comprehensibility caused by stemming.
... Those algorithms tried to find out all rare itemsets, but they spent most of the time in searching for non-rare itemsets which tends to provide uninteresting association rules. To emphasize the "rare item problem", efforts have been made in the literature to discover frequent patterns using "multiple minimum support framework" [4,10,[16][17][18][19][20]. As per various user and application requirement, different models have been suggested in this framework. ...
... As per various user and application requirement, different models have been suggested in this framework. They are: (i) minimum constraint model [8,[16][17][18][19] (ii) maximum constraint model [20]. ...
... As per sorted closure property, "all non-empty subsets of a frequent pattern need not be frequent, only the subsets consisting of the item having lowest MIS value within it should be frequent". Hence, based on this model Apriori-like [8,16] or FP-growth-like [17][18][19] approaches consider frequent and infrequent patterns. The sorted closure property was briefly explored in [8]. ...
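A minimal sketch of the frequency test used in the minimum constraint model, where a pattern is frequent if its support is at least the lowest MIS value among its items. The function name `is_frequent`, the two MIS counts and all support counts below are hypothetical values chosen for illustration; they are not taken from any of the cited papers.

```python
# Hypothetical MIS counts for two items.
MIS = {"bread": 4, "pen": 2}


def is_frequent(pattern, support_count, mis):
    """Minimum constraint model: compare support with the lowest MIS in the pattern."""
    return support_count >= min(mis[item] for item in pattern)


# Hypothetical support counts for a pattern and its non-empty subsets.
supports = {
    frozenset({"bread", "pen"}): 2,
    frozenset({"bread"}): 3,
    frozenset({"pen"}): 2,
}

results = {pattern: is_frequent(pattern, count, MIS)
           for pattern, count in supports.items()}

for pattern, frequent in results.items():
    print(sorted(pattern), frequent)
```

With these numbers, {bread, pen} is frequent (2 >= 2) while its subset {bread} is not (3 < 4), so downward closure fails. The subset {pen}, which contains the item with the lowest MIS value, is frequent, as the sorted closure property requires.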
Article
Pattern mining methods extract valuable items from the large volumes of records stored in corporate datasets and repositories. The mining literature has focused almost exclusively on frequent itemsets, but in many applications rare ones are of higher interest. Consider, for example, a medical dataset, where a rare combination of prodromes plays a vital role for physicians. As rare items contain worthwhile information, researchers are making efforts to find effective methodologies to extract them. In this paper, an effort is made to analyze the complete set of rare items in order to find almost all possible rare association rules in the dataset. The proposed approach uses the maximum constraint model to extract the rare items and efficiently mines rare association rules, defined as rules containing rare items. Based on a study of the relevant data structures of the mining space, the approach uses a tree structure to ascertain the rare items. Finally, it is demonstrated that this new approach is more effective and robust than the existing algorithms.
... Those algorithms try to find out all rare itemsets but they spent most of the time in searching for non-rare itemsets which tends to provide uninteresting association rules. To address the "rare item problem", "multiple minsup framework" [6,[14][15][16][17][18][19][20] was used to discover rare association rules. Different models were proposed in this framework. ...
... Different models were proposed in this framework. They are: (i) minimum constraint model [6,15,17,18] (ii) maximum constraint model [14]. ...
... As per sorted closure property, "all non-empty subsets of a frequent pattern need not be frequent, only the subsets consisting of the item having lowest MIS value within it should be frequent". Hence, based on this model Apriori-like [6,17] or FP-growth-like [15,18,19] approaches consider frequent and infrequent patterns. The sorted closure property was briefly explored in [6]. ...
Conference Paper
Rare association rule mining provides useful information from large databases. Traditional association mining techniques generate frequent rules based on frequent itemsets with reference to a user-defined minimum support threshold and minimum confidence threshold. This is known as the support-confidence framework. As many of the generated rules are of no use, further analysis is essential to find interesting rules. A rare association rule contains rare items. Rare association rules represent unpredictable or unknown associations, which makes them more interesting than frequent association rules. The main goal of rare association rule mining is to discover relationships among sets of items that occur uncommonly in a database. We have proposed a Maximum Constraint based method for generating rare association rules with a tree structure. Tentative results show that MCRP-Tree takes less time for rule generation than the existing algorithm and finds more interesting rare items.
... Those algorithms try to find all rare itemsets but they spend most of time for searching non-rare itemsets which tends to give us uninteresting association rules. To address the "rare item problem", "multiple minsup framework" [4,13,22,[27][28][29][30][31] is used to determine rare rules. Different models are proposed in this framework. ...
... Different models are proposed in this framework. They are: (i) minimum constraint model [4,13,22,29] (ii) maximum constraint model [27] and (iii) other models [28,31]. ...
... As per sorted closure property, "all non-empty subsets of a frequent pattern need not be frequent, only the subsets consisting of the item having lowest minimum item support value within it should be frequent". Hence, based on this model Apriori-like [4,22] or FP-growth-like [13,29,30] approaches consider frequent and infrequent patterns. The sorted closure property was briefly explored in [4]. ...
Article
Association rules are mined to obtain useful information from large datasets. Traditional association mining methods generate frequent rules based on frequent itemsets with reference to minimum support and minimum confidence thresholds specified by the user. This is called the support-confidence framework. As many of the generated rules are of no use, further analysis is essential to find interesting rules. A rule that contains rare items can be considered a rare association rule. Rare association rules represent unpredictable or unknown associations, so they are more interesting than frequent association rules. Rare association rule mining provides relationships between items that occur uncommonly. This paper presents a brief survey in the area of rare association rule mining. Keywords: pattern, support, confidence, rare items
... Rather than setting a single minsup value for the whole transaction set, multiple minsup values are specified, one for each item in the transaction set [4]. ...
... In 2009, Kiran et al. [4] proposed a preliminary algorithm to improve the performance of CFP-growth by suggesting two pruning techniques for reducing the size of the constructed tree structure. The CFPGrowth++ algorithm works with multiple support values to find rare and frequent patterns. ...
... In addition, the I/O cost can be greatly reduced since the size of the trimmed dataset is much smaller than the original one. More specifically, the data trimming technique works under a framework that consists of three modules: the trimming module (Local Trimming Strategy), the pruning module (Global Pruning Strategy) and the patch-up module (Single-pass Patch Up Strategy) [4]. ...
Article
Association rule mining plays a major role in decision making in the production and sales business area. It uses minimum support (minsup) and support confidence (supconf) as a base to generate the frequent patterns and strong association rules. Setting a single value of minsup for a transaction set doesn't seem feasible for some real life applications. Similarly the probabilistic value of items in the transaction set may be acceptable. So generating the frequent pattern from the uncertain dataset becomes a concern factor. This research work details the aforesaid problem and proposes a solution for the same.
... An effort has been made to extend the notion of multiple constraints to extract periodic-frequent patterns [12]. In [7], we have proposed a preliminary algorithm to improve the performance of CFP-growth by suggesting two pruning techniques for reducing the size of constructed tree structure. It is to be noted that the algorithm discussed in [7] performs exhaustive search, like CFP-growth, to discover complete set of frequent patterns as the frequent patterns mined with "multiple minsups framework" do not satisfy downward closure property. ...
... In [7], we have proposed a preliminary algorithm to improve the performance of CFP-growth by suggesting two pruning techniques for reducing the size of constructed tree structure. It is to be noted that the algorithm discussed in [7] performs exhaustive search, like CFP-growth, to discover complete set of frequent patterns as the frequent patterns mined with "multiple minsups framework" do not satisfy downward closure property. In this paper, we investigated approaches to reduce the search space while extracting frequent patterns and proposed two additional pruning techniques which significantly reduce the search space by avoiding exhaustive search while extracting frequent patterns from a tree structure. ...
Conference Paper
Frequent patterns are an important class of regularities that exist in a transaction database. Certain frequent patterns with low minimum support (minsup) value can provide useful information in many real-world applications. However, extraction of these frequent patterns with single minsup-based frequent pattern mining algorithms such as Apriori and FP-growth leads to "rare item problem." That is, at high minsup value, the frequent patterns with low minsup are missed, and at low minsup value, the number of frequent patterns explodes. In the literature, "multiple minsups framework" was proposed to discover frequent patterns. Furthermore, frequent pattern mining techniques such as Multiple Support Apriori and Conditional Frequent Pattern-growth (CFP-growth) algorithms have been proposed. As the frequent patterns mined with this framework do not satisfy downward closure property, the algorithms follow different types of pruning techniques to reduce the search space. In this paper, we propose an efficient CFP-growth algorithm by proposing new pruning techniques. Experimental results show that the proposed pruning techniques are effective.
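The "rare item problem" described in this abstract can be illustrated with a small brute-force sketch. The six transactions below are hypothetical (they are not the rows of Table 1), and `frequent_patterns` is an illustrative helper that enumerates every itemset under a single minsup count; it is not Apriori, FP-growth or CFP-growth.

```python
from itertools import combinations

# Hypothetical transactions: bread and jam are common, bed and pillow are rare.
TRANSACTIONS = [
    {"bread", "jam"},
    {"bread", "jam"},
    {"bread", "jam", "ball"},
    {"bread", "ball"},
    {"bed", "pillow"},
    {"bed", "pillow", "bread"},
]


def frequent_patterns(transactions, minsup):
    """Return {itemset: support count} for all itemsets with count >= minsup."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support count: number of transactions containing the itemset.
            count = sum(1 for t in transactions if itemset <= t)
            if count >= minsup:
                result[itemset] = count
    return result


high = frequent_patterns(TRANSACTIONS, minsup=3)
low = frequent_patterns(TRANSACTIONS, minsup=2)

rare_pair = frozenset({"bed", "pillow"})
print(len(high), rare_pair in high)  # 3 False
print(len(low), rare_pair in low)    # 8 True
```

At minsup = 3 the pattern {bed, pillow} is missed, and at minsup = 2 it is found but the number of frequent patterns grows from 3 to 8. On realistic datasets the growth at low minsup is far steeper, which is the dilemma that the multiple minsups framework aims to address.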
... ii. An efficient pattern-growth approach is proposed based on various heuristics to minimize the search space for finding the complete set of frequent patterns [9]. ...
... The first two steps i.e., construction of MIS-tree and compact MIS-tree are similar to those in [7,9]. ...
... However, mining frequent patterns from the compact MIS-tree is different from [7,9]. ...
Thesis
Currently, extracting knowledge pertaining to rare cases that are hidden in large datasets has become an important research problem. Frequent patterns are an important class of regularities that exist in a transactional database. Frequent patterns containing rare items can provide useful knowledge. It is difficult to mine frequent patterns containing both frequent and relatively infrequent (or rare) items, because single minimum support (minsup) based frequent pattern mining approaches such as Apriori and FP-growth suffer from the rare item problem. That is, at high minsup, we miss the frequent patterns containing rare items, and at low minsup, combinatorial explosion can occur, producing too many frequent patterns. To address the rare item problem, efforts have been made in the literature to find frequent patterns by using the "multiple minsups framework." Even though this framework addresses the rare item problem, it still suffers from performance problems. In this thesis, we have made an effort to propose efficient approaches to extract frequent patterns containing both frequent and rare items. The contributions of the thesis are as follows: (i) We have proposed the notion of "support difference" and an efficient methodology to extract frequent patterns containing both frequent and rare items. (ii) An efficient multiple minsups-based FP-growth-like algorithm has been proposed by introducing several heuristics to minimize the search space. (iii) An improved "multiple minsups framework" has been proposed by introducing the notion called "item-to-pattern difference." (iv) We have also proposed an improved periodic-frequent pattern mining algorithm by extending the notion of "multiple constraints."
... To find frequent patterns consisting of rare items, let us specify low minsup, say minsup = 2. The frequent patterns generated at minsup = 2 are shown in Figure 1(b To improve the performance of mining frequent patterns consisting of both frequent and rare items, efforts are being made to discover frequent patterns using "multiple minsup framework" [7,9,11,12]. Independent of the detailed implementation technique, the model used in these approaches is as follows. ...
... where, λ represents the parameter like mean, median of the item supports and β ∈ [0, 1]. In [12], an effort has been made to improve the performance of [9] by efficiently identifying only those items which can generate frequent patterns. ...
... The approaches discussed in [7,9,11,12] are based on minimum constraint model. Therefore, these approaches can efficiently prune uninteresting patterns which have low support and contain only frequent items. ...
Conference Paper
Rare association rule is an association rule consisting of rare items. It is difficult to mine rare association rules with a single minimum support (minsup) constraint because low minsup can result in generating too many rules in which some of them can be uninteresting. In the literature, minimum constraint model using “multiple minsup framework” was proposed to efficiently discover rare association rules. However, that model still extracts uninteresting rules if the items’ frequencies in a dataset vary widely. In this paper, we exploit the notion of “item-to-pattern difference” and propose multiple minsup based FP-growth-like approach to efficiently discover rare association rules. Experimental results show that the proposed approach is efficient.