Transactional dataset for a toy example of a market basket analysis including customers' ID

Source publication

FIGURE 3. Variation of the runtime when the search space increases (the...

FIGURE 4. Variation of the runtime when the number of instances...

FIGURE 5. Variation of the runtime when the search space increases.

FIGURE 6. Variation of the runtime when the number of bags (groups of...

FIGURE 7. Variation of the runtime when the density (average number of...

Extracting User-Centric Knowledge on Two Different Spaces: Concepts and Records

Article

Jul 2020

The growing demand for eliciting useful knowledge from data calls for techniques that can discover insights (in the form of patterns) that users need. Methodologies for describing intrinsic and relevant properties of data through the extraction of useful patterns, however, work on fixed input data, and the data representation, therefore, constrains...

Context 1

... the relative support is denoted as support_r(P) = support(P)/|Ω|. As a matter of clarification, let us consider the toy example for a market basket dataset where the customers' ID and the season in which the purchases were carried out are considered (see Table 4). Considering multiple concepts, it is obtained that the pattern P = {Pampers}(2) is satisfied for two customers (ID #1 and #2) in any of the seasons. ...
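The relative support quoted above can be illustrated with a minimal sketch. Table 4 is not reproduced on this page, so the transactions below are hypothetical, and the sketch assumes that Ω is simply the collection of units over which support is counted (plain transactions here). The helper names `support` and `relative_support` are illustrative and are not taken from the source publication.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(pattern: Transaction, omega: List[Transaction]) -> int:
    """Absolute support: number of units in omega that contain every item of the pattern."""
    return sum(1 for unit in omega if pattern <= unit)


def relative_support(pattern: Transaction, omega: List[Transaction]) -> float:
    """Relative support: support(P) / |omega|."""
    if not omega:
        raise ValueError("omega must contain at least one unit")
    return support(pattern, omega) / len(omega)


# Hypothetical toy data (NOT the paper's Table 4).
omega = [
    frozenset({"Pampers", "Beer"}),
    frozenset({"Pampers", "Milk"}),
    frozenset({"Bread", "Milk"}),
    frozenset({"Beer"}),
]
pattern = frozenset({"Pampers"})

abs_support = support(pattern, omega)            # 2 of the 4 units contain Pampers
rel_support = relative_support(pattern, omega)   # 2 / 4 = 0.5
```

When support is counted over customers rather than individual transactions, as in the context above, the same ratio would apply with Ω taken to be the set of customers.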

Context 2

... concepts used to organize data records highly depends on the users (and their expectations), and the percentage of records satisfied per sub-bag is also a pre-requisite that can be modified by the users. As a matter of example, let us consider now the same toy example organized by customers and seasons (see Table 4). Additionally, let us consider that a bag B_j is satisfied if and only if most of its sub-bags are also satisfied (≥50% of the sub-bags include at least one transaction that satisfies the pattern). ...
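The bag rule described in this context can be sketched as follows. This is only an illustration under stated assumptions: each bag is a customer, each sub-bag is a season holding that customer's transactions, and the data are hypothetical because Table 4 is not reproduced here. The names `sub_bag_satisfied`, `bag_satisfied`, and `min_ratio` are illustrative, not the authors' implementation.

```python
from typing import Dict, FrozenSet, List

Transaction = FrozenSet[str]
Bag = Dict[str, List[Transaction]]  # season -> transactions of one customer


def sub_bag_satisfied(pattern: Transaction, sub_bag: List[Transaction]) -> bool:
    """A sub-bag is satisfied if at least one of its transactions contains the pattern."""
    return any(pattern <= transaction for transaction in sub_bag)


def bag_satisfied(pattern: Transaction, bag: Bag, min_ratio: float = 0.5) -> bool:
    """A bag is satisfied if at least min_ratio of its sub-bags are satisfied."""
    if not bag:
        return False
    hits = sum(sub_bag_satisfied(pattern, sub_bag) for sub_bag in bag.values())
    return hits / len(bag) >= min_ratio


# Hypothetical customers and seasons (NOT the paper's Table 4).
bags: Dict[int, Bag] = {
    1: {"winter": [frozenset({"Pampers", "Beer"})],
        "summer": [frozenset({"Milk"})]},                    # 1 of 2 sub-bags
    2: {"winter": [frozenset({"Bread"})],
        "spring": [frozenset({"Milk"})],
        "summer": [frozenset({"Pampers"})]},                 # 1 of 3 sub-bags
    3: {"autumn": [frozenset({"Pampers"}), frozenset({"Beer"})]},  # 1 of 1 sub-bag
}
pattern = frozenset({"Pampers"})

satisfied_customers = [cid for cid, bag in bags.items() if bag_satisfied(pattern, bag)]
bag_support = len(satisfied_customers)
```

Exposing the threshold as a parameter reflects the remark in the context that the required percentage is a prerequisite the user can modify.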

Fig. 10 Comparison of KITTI, nuScenes, and Waymo Open Dataset. We focus...

Summary of multi-modal 3D detection methods: loc (fusion location),...

Advantages and disadvantages for feature fusion and decision fusion

Multi-Modal 3D Object Detection in Autonomous Driving: a Survey

Preprint

Jun 2021

In the past few years, we have witnessed rapid development of autonomous driving. However, achieving full autonomy remains a daunting task due to the complex and dynamic driving environment. As a result, self-driving cars are equipped with a suite of sensors to conduct robust and accurate environment perception. As the number and type of sensors ke...

Efficient Frequent Chronicle Mining Algorithms: Application to Sleep Disorder

Article

Jan 2024

Sequential pattern mining is a dynamic and thriving research field that aims to extract recurring sequences of events from complex datasets. Traditionally, focusing solely on the order of events often falls short of providing precise insights. Consequently, incorporating the temporal intervals between events has emerged as a vital necessity across various domains, e.g. medicine. Analyzing temporal event sequences within patients’ clinical histories, drug prescriptions, and monitoring alarms exemplifies this critical need. This paper presents innovative and efficient methodologies for mining frequent chronicles from temporal data. The mined graphs offer a significantly more expressive representation than mere event sequences, capturing intricate details of a series of events in a factual manner. The experimental stage includes a series of analyses of diverse databases with distinct characteristics. The proposed approaches were also applied to real-world data comprising information about subjects suffering from sleep disorders. Alluring frequent complete event graphs were obtained on patients who were under the effect of sleep medication.

An Efficient Method for Mining Rare Association Rules: A Case Study on Air Pollution

Article

Jun 2021
INT J ARTIF INTELL T

Most pattern mining techniques focus almost exclusively on identifying frequent patterns, and very little attention has been paid to the generation of rare patterns. However, in several domains, recognizing less frequent but strongly related patterns has a greater advantage over the former. Identification of compelling and meaningful rare associations among such patterns may prove to be significant for air quality management, which has become an indispensable task in today’s world. The rare correlations between air pollutants and other parameters may aid in restricting air pollution to a manageable level. To this end, efficient and competent rare pattern mining techniques are needed that can generate the complete set of rare patterns and further identify significant rare association rules among them. Moreover, a notable issue with databases is their continuous update over time due to the addition of new records. The users' requirements or behavior may change with the incremental update of databases, which makes it difficult to determine a suitable support threshold for the extraction of interesting rare association rules. This paper presents an efficient rare pattern mining technique to capture the complete set of rare patterns from a real environmental dataset. The proposed approach does not restart the entire mining process upon threshold update and generates the complete set of rare association rules in a single database scan. It can effectively perform incremental mining and also provides flexibility to the user to regulate the value of the support threshold for generating the rare patterns. Significant rare association rules representing correlations between air pollutants and other environmental parameters are further extracted from the generated rare patterns to identify the substantial causes of air pollution. Performance analysis shows that the proposed method is more efficient than existing rare pattern mining approaches in providing significant directions to the domain experts for air pollution monitoring.

Identification of Changes in VLE Stakeholders’ Behavior Over Time Using Frequent Patterns Mining

Article

Feb 2021

Many contemporary studies in the Learning Analytics research field provide substantial insights into virtual learning environment stakeholders’ behaviour at the single-course or small-scale level. They used different knowledge discovery techniques, including frequent patterns analysis. However, only a few studies have explored the stakeholders’ behaviour in detail over a more extended period of several academic years. This article contributes to filling this gap and provides a novel approach to using homogeneous groups of frequent patterns for identifying changes in stakeholders’ behaviour from the perspective of time. The novelty of this approach lies in the fact that, even though the time variable is not directly involved, identification of homogeneous groups of frequent itemsets allows analysis and comparison of the stakeholders’ behavioural patterns and their changes over different observed periods. The homogeneous groups of frequent itemsets found, which conform to the minimal thresholds of selected measures, showed that it is possible to uncover the changes in stakeholders’ behaviour throughout the observed longer period. As a result, these homogeneous groups of frequent patterns allow a better understanding of the hidden changes in seasonality or trends in stakeholders’ behaviour over several academic years. This article discusses the possible implications of the results and the proposed approach in the context of virtual learning environment management and educational content improvement.

Exceptional in So Many Ways — Discovering Descriptors that Display Exceptional Behaviour on Contrasting Scenarios

Article

Nov 2020

The current state of the art in supervised descriptive pattern mining is very good at automatically finding subsets of the dataset at hand that are exceptional in some sense. The most common form, subgroup discovery, generally finds subgroups where a single target variable has an unusual distribution. Exceptional model mining (EMM) typically finds subgroups where a pair of target variables display an unusual interaction. What these methods have in common is that one specific exceptionality is enough to flag up a subgroup as exceptional. This, however, naturally leads to the question: can we also find multiple instances of exceptional behavior simultaneously in the same subgroup? This paper provides a first, affirmative answer to that question in the form of the SPEC (Subsets of Pairwise Exceptional Correlations) model class for EMM. Given a set of predefined numeric target variables, SPEC will flag up subgroups as interesting if multiple target pairs display an unusual rank correlation. This is a fundamental extension of the EMM toolbox, which comes with additional algorithmic challenges. To address these challenges, we provide a series of algorithmic solutions whose strengths and flaws are empirically analyzed.

Data Heterogeneity's Impact on the Performance of Frequent Itemset Mining Algorithms

Article

Jun 2024
INFORM SCIENCES

Introduction to Data Mining

Chapter

Oct 2021

Jose M. Luna

This chapter introduces data mining, also known as knowledge discovery from data, as a process of discovering useful, interesting and previously unknown patterns from data. Some techniques and domains related to data mining are described, explaining their similarities and differences. Some data types are then analysed, since data from multiple inputs might be considered due to the natural evolution of information technology. Data processing approaches are also described, stating how to transform raw data into a readable and useful form and presenting different data representations. Finally, general data mining techniques are outlined: mining frequent patterns and associations, predictive analysis, supervised descriptive analysis, cluster analysis, and outlier analysis, to list a few.

Discovering Frequent Patterns in Very Large Transactional Databases

Chapter

Oct 2021

Jose M. Luna

Finding frequent patterns in very large transactional databases is a challenging problem of great concern in many real-world applications. In this chapter, we first introduce the model of frequent patterns. Second, we describe the search space for finding the desired patterns. Third, we present four popular algorithms to find the patterns. Finally, we present the extensions of frequent patterns.
