Conference Paper

A Survey of Association Rule Hiding Algorithms


Abstract

Significant developments in data collection and data storage technologies have allowed transactional data to accumulate in the data warehouses of companies and public-sector organizations. As this data grows day by day, there must be mechanisms that can analyze such large volumes of data. Data mining is a way of extracting hidden predictive information from those data warehouses without revealing their sensitive information. Privacy preserving data mining (PPDM) is a recent research area that deals with the problem of hiding sensitive information while analyzing data. Association rule hiding is a PPDM technique for hiding the association rules generated by association rule generation algorithms. In this paper we provide a comparative theoretical analysis of the algorithms that have been developed for association rule hiding.
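To ground the terminology: association rule generation rests on two standard measures, support and confidence, and hiding algorithms work by pushing one of them below a miner's threshold. A minimal sketch of how they are computed (the toy transaction database and item names below are hypothetical):

```python
# Toy transaction database; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(antecedent, consequent, db):
    """Estimated P(consequent | antecedent): supp(A U B) / supp(A)."""
    return support(set(antecedent) | set(consequent), db) / support(antecedent, db)

print(support({"bread", "milk"}, transactions))       # 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # 2/3
```

A rule such as {bread} -> {milk} is reported by a generator when both measures exceed user-given thresholds; it is "hidden" once either measure is driven below its threshold.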
... Previously, several approaches were proposed to apply ARM in a privacy-preserving manner [7,9]. In distributed systems, the state-of-the-art privacy-preserving association rule mining (PPARM) approaches are cryptography-based and use the Apriori algorithm for mining the rules [3,33]. ...
... In perturbation approaches, the data is anonymized before mining the rules by modifying [7], blocking [9], or sanitizing the sensitive attributes [8]. These anonymization techniques can inherently maintain only partial properties of the complete dataset. ...
Conference Paper
Decentralized online social networks enhance users’ privacy by empowering them to control their data. However, these networks mostly lack practical solutions for building recommender systems in a privacy-preserving manner that would help improve the network’s services. Association rule mining is one of the basic building blocks of many recommender systems. In this paper, we propose an efficient approach enabling rule mining on distributed data. We leverage Metropolis-Hastings random walk sampling and the distributed FP-Growth mining algorithm to maintain the users’ privacy. We evaluate our approach on three real-world datasets. Results reveal that the approach achieves high average precision scores (> 96%) for as low as a 1% sample size in well-connected social networks, with a remarkable reduction in communication and computational costs.
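The Metropolis-Hastings random walk mentioned above corrects the degree bias of a plain random walk so that nodes are sampled approximately uniformly, using only local degree information. A minimal sketch of the idea (the toy graph, names, and seed are assumptions, not the paper's implementation):

```python
import random

# Hypothetical toy social graph as an adjacency dict.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
}

def mh_random_walk(graph, start, steps, seed=42):
    """Metropolis-Hastings random walk: propose a uniform neighbor v
    of the current node u and accept with probability
    min(1, deg(u)/deg(v)), making the stationary distribution
    uniform over nodes instead of degree-proportional."""
    rng = random.Random(seed)
    samples, u = [], start
    for _ in range(steps):
        v = rng.choice(graph[u])
        if rng.random() < len(graph[u]) / len(graph[v]):
            u = v  # accept the move; otherwise stay at u
        samples.append(u)
    return samples

walk = mh_random_walk(graph, "a", 1000)
```

Note that the acceptance test needs only the degrees of u and v, so in a distributed setting each participant can run it with purely local knowledge.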
... To address this privacy concern, several approaches were proposed to build association rules in a privacy-preserving manner. These privacy-preserving association rules mining (PPARM) approaches can be classified into two categories: first, approaches that anonymize the data prior to the mining process [10,13]. Second, approaches that use cryptography to preserve the privacy of the data during the mining process [6,32]. ...
... Rules containing sensitive data about a person should be hidden. To that end, datasets are commonly anonymized before mining by distortion/perturbation [10], blocking [13], or even deleting the sensitive attributes completely [11,26]. Such modifications naturally lead to a loss of information and consequently decrease the quality of the mined rules: applying perturbation [10] to the breast-cancer database [21] led to the creation of 14% false rules and the loss of 28% of the non-sensitive rules [1]. ...
Conference Paper
Recommender systems use association rules mining, a technique that captures relations between user interests and recommends new potential ones accordingly. Applying association rule mining causes privacy concerns as user interests may contain sensitive personal information (e.g., political views). This potentially even inhibits the user from providing information in the first place. Current distributed privacy-preserving association rules mining (PPARM) approaches use cryptographic primitives that come with high computational and communication costs, rendering PPARM unsuitable for large-scale applications such as social networks. We propose improvements in the efficiency and privacy of PPARM approaches by minimizing the required data. We propose and compare sampling strategies to sample the data based on social graphs in a privacy-preserving manner. The results on real-world datasets show that our sampling-based approach can achieve a high average precision score with as low as 50% sampling rate and, therefore, with a 50% reduction of communication cost.
... Previous distortion approaches in PPARM [19][20][21][22][23][24][25] are support reduction techniques. The most extensively used standard methods in data distortion are swapping of items in the transactions [26][27][28] and elimination of items from the database transactions [29]. ...
Article
Organizations generally prefer to share data or knowledge with others to obtain mutual benefits. The major issue in sharing data or knowledge is the data owner's privacy requirements. Privacy preserving association rule mining is an area in which a data owner can protect private association rules (sensitive knowledge) from disclosure while sharing the data. To safeguard sensitive association rules, individual data values of a database must be altered; at the same time, privacy concerns must not compromise data utility. A methodology that optimally selects and alters the transactions of the database is required to balance privacy and utility. Particle swarm optimization is a meta-heuristic technique used for optimization. Hence, an approach based on particle swarm intelligence is developed to select a set of database transactions for alteration, so as to minimize the number of non-sensitive association rules that are lost and to maintain high utility of the sanitized database without compromising privacy. The proposed method for hiding association rules was assessed on several performance parameters, including the utility of the transformed database. Experiments revealed that the proposed method achieves a good balance between privacy and utility by minimizing the difference between the original and transformed databases.
... Garg et al. [8] conducted a survey and comparative analysis of various techniques for hiding association rules (AR). From the survey, Garg et al. [8] grouped association rule hiding methods into sensitive item adding, sensitive item deleting, and combined insertion and deletion. The approaches that can be used for hiding AR include heuristic-based, border-based, exact, reconstruction-based, and cryptographic approaches. ...
Data
Privacy Preservation in Data Mining (PPDM), including Privacy Preserving Association Rule Mining (PPARM), has attracted a lot of attention in recent research and practice. However, current methods still have drawbacks in the sense that there are trade-offs between efficiency and privacy preservation. This paper describes our work towards a new, efficient PPARM protocol. We reviewed the current literature on PPARM and mapped the methods and approaches involved. As previous research showed that Elliptic Curve Cryptography (ECC) performs better than other public-key systems such as RSA and Diffie-Hellman, we utilize ECC to reduce the computational cost of the new PPARM protocol. In choosing good elliptic curves for ECC, we measured the key-generation running time for various groups of recommended elliptic curves, i.e., Brainpool curves (by Brainpool), Prime, C2pnb, and C2tnb curves (by ANSI X9.62), Secp curves (by SECG), and PrimeCurve curves (by the CDC Group). As a result, Secp curves outperformed all of the other curves in the overall average ratio of running time to key size for key generation by 4.4% up to 357.6%.
... Studies [3][4][5] examine the techniques of PPDM and, in particular, PPAM. These studies conclude that further work is needed to develop PPAM algorithms that take into account time efficiency, changes in data size, changes in data format, and truly hiding the rules. ...
... One of the downsides of the data distortion technique is the placement of wrong values in the database. In some databases, such as medical databases, data distortion cannot be used because excluding some elements can be highly dangerous; likewise, placing a number of wrong values can lead to terrible consequences [4]. Data sanitization, however, has its own problems. ...
Article
The increasing rate of data sharing among organizations raises the risk of leaking sensitive knowledge, which in turn increases the importance of privacy preservation within the data sharing process. This study focuses on privacy preservation in classification rule mining, a data mining technique. We propose a blocking algorithm to hide sensitive classification rules. In our solution, rules are hidden by editing the set of transactions that satisfy sensitive classification rules. The proposed approach tries to deceive and block adversaries by inserting dummy transactions. Finally, the solution is evaluated and compared with other available solutions. Results show that limiting the number of attributes in each sensitive rule decreases both the number of lost rules and the production rate of ghost rules.
... Studies [3][4][5] examine the techniques of PPDM and, in particular, PPAM. These studies conclude that further work is needed to develop PPAM algorithms that take into account time efficiency, changes in data size, changes in data format, and truly hiding the rules. ...
Conference Paper
Privacy Preserving Association Rule Mining (PPAM) has become an important issue in recent years, since data mining alone is not enough to share data between companies while preserving privacy. In this paper, a new technique is proposed to maintain the confidentiality of the data by fabricating association rules using a stochastic standard map, without returning to mine the sensitive data again. The system was simulated in Matlab; tests show a clear difference between the original and fabricated data, achieved with high speed and low memory requirements.
Conference Paper
Association rule mining is a powerful data mining model for finding hidden patterns in large databases. One challenge of data mining is securing the confidentiality of sensitive patterns when releasing a database to third parties. In this paper, privacy is preserved by hiding association rules: an association rule hiding algorithm sanitizes the database so that certain sensitive association rules cannot be discovered through association rule mining techniques. Various approaches are described in this paper, but we use the heuristic approach with the data distortion technique. The proposed algorithm extends the MDSRRC algorithm, which hides multiple R.H.S. items, to work on distributed databases. We show experimental results comparing the MDSRRC algorithm on a single database with the MDSRRC algorithm on a distributed database.
Article
In recent years, data mining has become a popular analysis tool for extracting knowledge from large collections of data. One of the great challenges of data mining is finding hidden patterns without revealing sensitive information. Privacy preserving data mining (PPDM) is an answer to this challenge: it is a major research area concerned with protecting sensitive data or knowledge while still allowing data mining techniques to be applied efficiently. Association rule hiding is a PPDM technique for protecting the association rules generated by association rule mining. In this paper, we provide a survey of association rule hiding methods for privacy preservation, summarizing the various algorithms that have been designed in recent years.
Conference Paper
Privacy preserving data mining (PPDM) is a novel research direction for protecting sensitive knowledge from disclosure. Many researchers in this area have recently made efforts to preserve the privacy of sensitive association rules in statistical databases. In this paper, we propose a heuristic algorithm named DSRRC (Decrease Support of R.H.S. item of Rule Clusters), which provides privacy for sensitive rules at a certain level while ensuring data quality. The proposed algorithm clusters the sensitive association rules based on certain criteria and hides as many rules as possible at a time by modifying fewer transactions. Because it makes fewer modifications to the database, it helps maintain data quality.
Conference Paper
The discovery of association rules from large databases has proven beneficial for companies since such rules can be very effective in revealing actionable knowledge that leads to strategic decisions. In tandem with this benefit, association rule mining can also pose a threat to privacy protection. The main problem is that from non-sensitive information or unclassified data, one is able to infer sensitive information, including personal information, facts, or even patterns that are not supposed to be disclosed. This scenario reveals a pressing need for techniques that ensure privacy protection, while facilitating proper information accuracy and mining. In this paper, we introduce new algorithms for balancing privacy and knowledge discovery in association rule mining. We show that our algorithms require only two scans, regardless of the database size and the number of restrictive association rules that must be protected. Our performance study compares the effectiveness and scalability of the proposed algorithms and analyzes the fraction of association rules, which are preserved after sanitizing a database. We also report the main results of our performance evaluation and discuss some open research issues.
Article
The concept of privacy preserving has recently been proposed in response to concerns about protecting personal or sensitive information from data mining algorithms. For example, through data mining, sensitive information such as private information or patterns may be inferred from non-sensitive information or unclassified data. There have been two types of privacy concerning data mining. Output privacy tries to hide the mining results by minimally altering the data. Input privacy tries to manipulate the data so that the mining result is not affected or is minimally affected. For output privacy in hiding association rules, current approaches require hidden rules or patterns to be given in advance [10, 18–21, 24, 27]. This selection of rules requires the data mining process to be executed first; based on the discovered rules and privacy requirements, hidden rules or patterns are then selected manually. However, for some applications, we are interested in hiding certain constrained classes of association rules, such as collaborative recommendation association rules [15, 22]. To hide such rules, the pre-process of finding them can be integrated into the hiding process as long as the recommended items are given. In this work, we propose two algorithms, DCIS (Decrease Confidence by Increase Support) and DCDS (Decrease Confidence by Decrease Support), to automatically hide collaborative recommendation association rules without pre-mining and selection of hidden rules. Examples illustrating the proposed algorithms are given. Numerical simulations are performed to show the various effects of the algorithms. Recommendations for appropriate usage of the proposed algorithms, based on the characteristics of databases, are reported.
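The decrease-confidence-by-increase-support idea can be sketched as follows. This is an illustrative toy version, not the authors' DCIS algorithm: the transaction database, threshold, and modification policy below are assumptions. Since conf(A -> B) = supp(A U B) / supp(A), inserting the antecedent A into transactions that contain neither A nor B raises the denominator and drives the confidence down:

```python
def confidence(ant, cons, db):
    """conf(ant -> cons) = supp(ant U cons) / supp(ant)."""
    both = sum((ant | cons) <= t for t in db)
    ant_count = sum(ant <= t for t in db)
    return both / ant_count if ant_count else 0.0

def hide_by_increasing_support(db, ant, cons, threshold):
    """Insert `ant` into transactions lacking both `ant` and `cons`
    until conf(ant -> cons) falls below `threshold`."""
    db = [set(t) for t in db]
    for t in db:
        if confidence(ant, cons, db) < threshold:
            break  # rule is hidden; stop modifying
        if not (ant <= t) and not (cons <= t):
            t |= ant  # raises supp(ant) but not supp(ant U cons)
    return db

db = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}, {"c"}]
hidden = hide_by_increasing_support(db, {"a"}, {"b"}, threshold=0.6)
# conf(a -> b) drops from 2/3 to 0.5, below the 0.6 threshold
```

The complementary DCDS variant instead removes items so that supp(A U B) (the numerator) shrinks; both drive the same ratio below the miner's confidence threshold.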
Article
Many strategies have been proposed in the literature to hide information containing sensitive items. Some use distributed databases over several sites, some use data perturbation, some use clustering, and some use data distortion. The present paper focuses on the data distortion technique. Algorithms based on this technique either hide a specific rule using data alteration or hide rules depending on the sensitivity of the items to be hidden. The proposed approach is based on data distortion, where the position of a sensitive item is altered but its support is never changed. It uses the idea of representative rules to prune the rules first and then hides the sensitive rules. Experimental results show that the proposed approach hides more rules in fewer database scans than existing algorithms based on the same data distortion approach.
Conference Paper
A fruitful direction for future data mining research will be the development of techniques that incorporate privacy concerns. Specifically, we address the following question. Since the primary task in data mining is the development of models about aggregated data, can we develop accurate models without access to precise information in individual data records? We consider the concrete case of building a decision-tree classifier from training data in which the values of individual records have been perturbed. The resulting data records look very different from the original records and the distribution of data values is also very different from the original distribution. While it is not possible to accurately estimate original values in individual data records, we propose a novel reconstruction procedure to accurately estimate the distribution of original data values. By using these reconstructed distributions, we are able to build classifiers whose accuracy is comparable to the accuracy of classifiers built with the original data.
Conference Paper
With the rapid advance of network and data mining techniques, protecting the confidentiality of sensitive information in a database becomes a critical issue when releasing data to outside parties. Association analysis is a powerful and popular tool for discovering relationships hidden in large data sets; the relationships can be represented as frequent itemsets or association rules. A rule is categorized as sensitive if its disclosure risk is above some given threshold. Privacy-preserving data mining is an important issue that can be applied to various domains, such as Web commerce, crime reconnoitering, health care, and customer consumption analysis. The main approach to hiding sensitive association rules is to reduce the support or the confidence of the rules by modifying transactions or items in the database. However, the modifications generate side effects, i.e., non-sensitive rules falsely hidden (lost rules) and spurious rules falsely generated (new rules). There is a trade-off between hiding sensitive rules and the side effects generated. In this study, we propose an efficient algorithm, FHSAR, for fast hiding of sensitive association rules (SAR). The algorithm can completely hide any given SAR by scanning the database only once, which significantly reduces the execution time. Experimental results show that FHSAR outperforms previous works in terms of execution time and side effects generated in most cases.
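The support-reduction strategy this abstract describes can be sketched in a few lines. This is a generic illustration of the idea, not FHSAR itself; the database, threshold, and the naive victim-item choice are assumptions:

```python
def support_count(itemset, db):
    """Number of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in db)

def hide_by_support_reduction(db, sensitive, min_sup):
    """Delete one item of the sensitive itemset (the 'victim') from
    supporting transactions until the support count drops below
    `min_sup`, so a miner using that threshold no longer finds it."""
    db = [set(t) for t in db]
    victim = min(sensitive)  # naive, deterministic victim choice
    for t in db:
        if support_count(sensitive, db) < min_sup:
            break  # itemset (and any rule built from it) is hidden
        if sensitive <= t:
            t.discard(victim)
    return db

db = [{"x", "y", "z"}, {"x", "y"}, {"x", "y"}, {"z"}]
sanitized = hide_by_support_reduction(db, {"x", "y"}, min_sup=2)
# supp({x, y}) falls from 3 to 1; the unrelated item z is untouched
```

Real algorithms such as FHSAR choose the victim item and the order of transactions so as to minimize lost and ghost rules; the sketch above ignores those side effects entirely.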
Article
Recent development in privacy-preserving data mining has produced many efficient and practical techniques for hiding sensitive patterns or information from being discovered by data mining algorithms. In hiding association rules, current approaches require hidden rules or patterns to be given in advance. In addition, Apriori-based techniques [Verykios, V., Elmagarmid, A., Bertino, E., Saygin, Y., & Dasseni, E. (2004). Association rules hiding. IEEE Transactions on Knowledge and Data Engineering, 16(4), 434–447] require multiple scans of the entire database. For direct sanitization of itemsets from transactions [Oliveira, S., & Zaiane, O. (2003). An efficient on-scan sanitization for improving the balance between privacy and knowledge discovery. Technical report TR 03-15, Department of Computing Science, University of Alberta, Canada], each window in the database is scanned and processed independently; however, the accumulated information among windows is not considered. In this work, we propose an efficient one-scan sanitization algorithm for informative association rules. For a given predicting item, an informative association rule set [Li, Jiuyong, Shen, Hong, & Topor, Rodney. (2001). Mining the smallest association rule set for predictions. In Proceedings of the 2001 IEEE international conference on data mining (pp. 361–368)] is the smallest association rule set that makes the same prediction as the entire association rule set by confidence priority. A new data structure called the pattern-inversion tree is proposed to store the related information so that only one scan of the database is required. The pre-process of finding these informative association rules can be integrated into the sanitization process. Numerical experiments show that the proposed algorithm is more efficient than previous algorithms with similar side effects.
The running time complexity of the algorithm is presented and compared to that of a similar algorithm with better complexity.
Article
Privacy-preserving data mining is a novel research direction in data mining and statistical databases, where data mining algorithms are analyzed for the side effects they incur on data privacy [Verykios, V., Bertino, E., Fovino, I. G., Provenza, L. P., Saygin, Y., & Theodoridis, Y. (2004). State-of-the-art in privacy preserving data mining. SIGMOD Record 33(1), 50–57, March 2004]. For example, through data mining, one is able to infer sensitive information, including personal information or even patterns, from non-sensitive information or unclassified data. There have been two types of privacy concerning data mining. The first type, called output privacy, requires that the data be minimally altered so that the mining result will not disclose certain private information. The second type, called input privacy, requires that the data be manipulated so that the mining result is not affected or is minimally affected.