Figure - available from: Multimedia Tools and Applications
Proposed credit card fraud detection Framework

Source publication
Article
Full-text available
Credit card fraud has adversely impacted market economic order and has eroded the trust and interests of stakeholders, financial entities, and consumers. Card fraud losses are increasing annually, and billions of dollars are being lost. Therefore, this work provides a framework for tackling card fraud detection efficiently. Recently, the imbalanced da...

Similar publications

Article
Full-text available
Purpose: to propose an experimental way to create ML solutions to the problem of detecting credit card fraud. Method or methodology of the work: the article uses machine learning (ML) and data mining methods. Results: the paper showed that machine learning (ML) and data mining techniques are effective in improving fraud detection accuracy. The study...
Article
Full-text available
With the increased use of information technology, many financial services are available to users at their fingertips. However, this has led to many fraudulent transactions. Automatic fraud identification and detection could improve the user experience and security of online transactions. Using machine learning algorithms, it is possible to detect fraud tran...
Article
Full-text available
Using the wrong metrics to gauge classification of highly imbalanced Big Data may hide important information in experimental results. However, we find that analysis of metrics for performance evaluation and what they can hide or reveal is rarely covered in related works. Therefore, we address that gap by analyzing multiple popular performance metri...

Citations

... Comparatively speaking, there are far fewer sample instances in the minority class than in the majority class. Accordingly, severe skews in the distribution of classes and inadequate representation of specific data are persistent challenges in many domains, such as medicine [17], software defect prediction [18], and financial services [19]. Hence, [20] stated that the performance of conventional classifiers may suffer when there is an imbalanced distribution of classes in a dataset. ...
Article
Full-text available
Because of advancements in technology, classification learning has become an essential activity in today's environment. Unfortunately, during the classification process, we noticed that classifiers are unable to deal with imbalanced data, in which one class contains many more instances (majority instances) than another. Identifying an appropriate classifier among the various candidates is a time-consuming and complex effort. Improper selection can hinder the classification model's ability to provide the right outcomes. This selection also requires choosing among a set of alternatives according to a set of criteria. Hence, multi-criteria decision-making (MCDM) is an appropriate methodology to deploy for this problem. Accordingly, we applied MCDM and supported it with neutrosophic theory to handle uncertain circumstances. Single-valued neutrosophic sets (SVNSs), a branch of neutrosophic theory, are applied to evaluate and rank classifiers and to allow experts to select the best classifier. To select the best classifier (alternative), we use the MCDM method called Multi-Attributive Ideal-Real Comparative Analysis (MAIRCA) and the criteria weight calculation method called Stepwise Weight Assessment Ratio Analysis (SWARA); both methods use SVNSs to strengthen these techniques in uncertain scenarios. All these methods are applied after modeling the criteria and their sub-criteria through a novel technique, Tree Soft Sets (TrSS). Ultimately, the findings indicated that the hybrid multi-criteria meta-learner (HML)-based classifier is the best classifier among the compared models.
... In response, machine learning has emerged as a pivotal force, offering the capability to learn from data patterns and adapt to evolving fraudulent strategies. The synergy of rule engines and machine learning presents a comprehensive approach, combining domain expertise with the adaptability of data-driven models [7]. ...
... It is demonstrated in eqn. (7). Fig. 2 depicts the ROC curves for different fraud detection methods, namely LSTM, Decision Tree, CNN, and the proposed CNN-RNN hybrid model. ...
Conference Paper
Financial fraud detection is a critical domain in which the continuous evolution of fraudulent tactics necessitates advanced and adaptive detection mechanisms. This paper addresses the challenges of fraud detection within bank payments by proposing a novel approach that integrates Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). Leveraging the synthetically generated BankSim dataset from Kaggle, the study employs intelligent systems such as rule engines and machine learning to combat fraudulent activities. While rule engines execute predefined business rules, machine learning, implemented using Python software, offers a precise approach by learning evolving fraud patterns. The proposed model's methodology involves preprocessing the dataset using Min-Max Normalization to scale numerical features, ensuring equitable contributions during model training. Feature extraction employs CNNs to capture hierarchical patterns in unstructured textual data, followed by sequential modelling with RNNs to understand temporal dependencies in transaction logs. The proposed CNN-RNN hybrid model achieves a reported accuracy of 99.2%, surpassing existing methods by 4%. The integration of spatial and temporal modelling techniques enhances the model's precision and reliability in detecting fraudulent transactions. The ROC analysis further substantiates the proposed CNN-RNN hybrid model's discriminative power, with an ROC value of 0.9. Keywords: Financial Fraud Detection, Convolutional Neural Networks, Recurrent Neural Networks, Machine Learning; Data Banking
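The Min-Max Normalization step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard formula, (x - min) / (max - min), and not the paper's own preprocessing code; the function name `min_max_scale` and the sample amounts are hypothetical.

```python
def min_max_scale(values, lo=0.0, hi=1.0):
    """Rescale a list of numbers linearly into the range [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant feature has no spread; map every value to lo.
        return [lo for _ in values]
    span = v_max - v_min
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


# Hypothetical transaction amounts, used only for illustration.
amounts = [5.0, 20.0, 50.0, 125.0]
scaled = min_max_scale(amounts)
print(scaled)  # [0.0, 0.125, 0.375, 1.0]
```

After scaling, each numerical feature spans the same range, so a feature measured in large units (such as an amount) does not dominate one measured in small units during training.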
... Abd El-Naby et al. [17] offered a framework that combined sampling and oversampling to solve the problem of unbalanced data in fraud detection. This framework greatly improved the model's capacity to detect fraudulent transactions. ...
... Due to the capacity of machine learning algorithms to evaluate high dimensional data and recognize patterns and anomalies, the study [22] utilizes machine learning algorithms that comprise K-Nearest Neighbor (KNN), Logistics ...
Preprint
Full-text available
New bank account fraud is a significant problem causing financial losses in banking and finance. Existing statistical and machine-learning methods have been used to detect fraud, thereby preventing financial losses. However, most studies do not consider the dynamic behavior of fraudsters and often produce a high False Positive Rate (FPR). This study proposes the detection of new bank account fraud in the context of simultaneous game theory (SGT) with Neural Networks. The SGT involves two players, a fraudster and bank officials, attacking each other through Bayesian probability in a zero-sum game. The influence of outliers within the SGT was tackled by adding a context feature for effective simulation of the dynamic behavior of fraudsters. The Neural Networks layer uses the simulated features for fraud context learning. The study is validated using the Bank Account Fraud (BAF) Dataset on different machine-learning models. The Radial Basis Function Networks achieved FPR of 0.0% and 8.3% for fraud and non-fraud classes, respectively, while achieving True Positive Rate (TPR) of 91.7% and 100.0% for fraud and non-fraud classes, respectively. The improved Radial Basis Function Networks detect fraud by revealing fraudulent patterns and dynamic behaviors in higher dimensional data. The findings will enhance fraud detection and reduce customer attrition.
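The per-class TPR and FPR figures reported in this abstract are linked: in a binary problem, the FPR of one class equals 1 minus the TPR of the other class. The sketch below illustrates that relationship. The confusion-matrix counts are hypothetical, chosen only so that the rates round to the reported percentages; they are not taken from the study.

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FPR) for the class treated as positive."""
    tpr = tp / (tp + fn)  # share of actual positives that were caught
    fpr = fp / (fp + tn)  # share of actual negatives that were flagged
    return tpr, fpr


# Hypothetical counts with fraud as the positive class (not from the study).
tp, fn, fp, tn = 11, 1, 0, 12

fraud_tpr, fraud_fpr = rates(tp, fn, fp, tn)
# Treating non-fraud as positive swaps the roles: TP <-> TN and FP <-> FN.
legit_tpr, legit_fpr = rates(tn, fp, fn, tp)

print(f"fraud:     TPR={fraud_tpr:.1%} FPR={fraud_fpr:.1%}")
print(f"non-fraud: TPR={legit_tpr:.1%} FPR={legit_fpr:.1%}")
```

With these counts the fraud class has TPR 91.7% and FPR 0.0%, and the non-fraud class has TPR 100.0% and FPR 8.3%, which is the same pattern the abstract reports.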
... The CNN model enables effective processing for spontaneously getting features and interpretations of structured data from soil images. In recent years, much research work and studies by researchers around the world have been accomplished on the potential of image recognition by artificial intelligence, including machine and deep learning approaches and analysis logs in cloud systems [7][8][9][10][11][12]. For the objective of expanding the cases of soil classification, AI models were created that were conscious of the practical simplicity of the images used. ...
Article
Full-text available
Soil color recognition through AI approaches is crucial for efficient and rapid soil analysis in geotechnical engineering. It enables improved site characterization and supports informed decision-making in engineering projects. The integration of soil color data with other geospatial information enhances geological mapping and modeling capabilities. The expansion of deep learning can assist different stakeholders in soil classification, which has been a significant area of study in recent years. Therefore, this paper presents an efficient IoT-based hybrid CNN-SVM model to classify five soil classes from soil images. The recommended framework uses a CNN for feature selection. A multiclass SVM is then used as the classifier for the soil classification task, together with new real-time IoT-based portable soil detection devices for geotechnical engineers at geo-sites. The proposed framework is evaluated with different evaluation metrics using a dataset that comprises a total of 252 soil images. The proposed hybrid framework using CNN models such as Squeezenet, Alexnet, and Resnet50 with the multiclass SVM classifier gives accuracies of 86%, 96%, and 95%, respectively. In contrast, the CNN models alone give accuracies of 80%, 89%, and 87% for Squeezenet, Alexnet, and Resnet50, respectively. From the obtained results, we observed that the hybrid Alexnet-SVM gives the highest performance while the Squeezenet-SVM gives the lowest performance. Overall, the results showed that the recommended hybrid framework gives the best performance for soil classification, which can help in building an efficient decision-support system for real-time geotechnical engineering applications.
... Tomek links can be used as undersampling or data cleaning methods. The idea here was to use Tomek links to undersample the majority class by removing Tomek links, although samples from both the majority and minority classes were eliminated rather than only those from the majority class that form Tomek links [22]. ...
Article
Full-text available
Recent developments in the use of credit cards for a range of daily life activities have increased credit card fraud and caused huge financial losses for individuals and financial institutions. Most credit card frauds are conducted online through illegal payment authorizations by data breaches, phishing, or scams. Many solutions have been suggested for this issue, but they all face the major challenge of building an effective detection model using highly imbalanced class data. Most sampling techniques used for class imbalance have limitations, such as overlapping and overfitting, which cause inaccurate learning and are slowed down by noisy features. Herein, a hybrid Tomek links BIRCH clustering borderline SMOTE (BCB-SMOTE) sampling method is proposed to balance a highly skewed credit card transaction dataset. First, Tomek links were used to undersample majority instances and remove noise, and then BIRCH clustering was applied to cluster the data and oversample minority instances using B-SMOTE. The credit card fraud-detection model was run using a random forest (RF) classifier. The proposed method achieved a higher F1-score (85.20%) than the baseline sampling techniques tested for comparison. Because of the enormous number of credit card transactions, there was still a small false-positive rate. The proposed method improves the detection performance owing to the well-organized balancing of the dataset.
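The Tomek-link undersampling step described above can be sketched in plain Python. Two samples form a Tomek link when they are each other's nearest neighbour and carry different class labels. The sketch removes only the majority-class member of each link, which is one common variant; removing both members is the data-cleaning variant. This is an illustration under those assumptions, not the BCB-SMOTE implementation, and the function names and toy data are hypothetical.

```python
import math


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    best, best_dist = None, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_dist:
            best, best_dist = j, d
    return best


def tomek_links(points, labels):
    """Pairs (i, j) that are mutual nearest neighbours with different labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]


def undersample_majority(points, labels, majority=0):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(points, labels)
            for k in pair if labels[k] == majority}
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data: label 0 = legitimate (majority), label 1 = fraud (minority).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0), (3.05, 3.0)]
labels = [0, 0, 0, 0, 0, 1]

links = tomek_links(points, labels)
new_points, new_labels = undersample_majority(points, labels)
print(links)       # [(4, 5)]
print(new_labels)  # [0, 0, 0, 0, 1]
```

The brute-force neighbour search is quadratic in the number of samples, so it is suitable only for small illustrations; a spatial index would be needed for a dataset with hundreds of thousands of transactions.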
... Through the application of supervised machine learning algorithms, this study demonstrates the effectiveness of using a combination of feature engineering, class balancing techniques, and cross-validation to detect credit card fraud with high accuracy [1]. By analyzing and comparing the performance of various algorithms, this paper emphasizes the significance of selecting the appropriate algorithm and fine-tuning its parameters to attain optimal results. ...
Article
Full-text available
The issue of credit card fraud poses a significant concern for users of online transactions, necessitating the implementation of effective fraud detection mechanisms. Fraud detection is often done using machine learning algorithms. Practitioners can compare and analyse algorithms to find the best one for their credit card fraud detection scenario. This article describes a detailed study to find the best credit card fraud forecasting model. The study tests cutting-edge supervised machine learning methods on two datasets. Using eight algorithms improves credit card fraud detection accuracy and efficacy: Logistic Regression, Decision Trees, Random Forests, Multilayer Perceptrons, Naive Bayes, XGBoost, KNN, and SVM. Additionally, Principal Component Analysis (PCA) is used to reduce dimensionality and improve algorithm performance during experimentation. XGBoost has the maximum accuracy of 99.96 percent for the first dataset, while Random Forest has 99.92 percent for the second. Cross-validation with Logistic Regression, Decision Trees, Random Forests, and XGBoost shows that Random Forests are better at credit card fraud detection. Random Forests also perform well with both undersampling and oversampling. Thus, this paper proposes XGBoost and Random Forests as the most reliable credit card fraud detection algorithms.
... Within two days, 284,807 transactions were recorded, among which 492 were classified as frauds. This renders the dataset heavily imbalanced as the frauds constitute only 0.172% of all the transactions [1]. The dataset consists exclusively of numerical input variables, which are the outcome of a PCA transformation. ...
... However, ANN can be computationally expensive and requires careful selection of architecture and hyperparameters to achieve optimal performance. AdaBoost, short for Adaptive Boosting, is a machine learning algorithm primarily designed for binary classification tasks [1]. In AdaBoost, every instance in the training dataset is assigned an initial weight. ...
Article
Full-text available
Customers have become more vulnerable to credit card fraud as card usage has expanded with innovative technologies and communication patterns. This article presents a review and critical analysis of credit card fraud detection and prediction of fraudulent transactions based on cutting-edge research. The study provides a limited investigation into deep machine learning to address the effects of data issues on credit card fraud detection through the design of robust solutions. This study aims to develop a mechanism with classifiers, such as Artificial Neural Network (ANN), Support Vector Machine (SVM), and Naive Bayes, that contain vectors of information sequence properties, structure and mechanisms. Simultaneously, diverse experiments are developed to analyze the proposed approach on datasets. The framework offers a comprehensive and diverse financial security approach to suspicious financial activities affecting stakeholders' assets. It emphasizes the significance of mitigation and detection capabilities for potential threats to safeguard financial transactions. The outcome of this research demonstrates a robust solution for real scenarios of credit card fraud detection, considering model abilities with high accuracy rates that address the limitations of integrated factors.
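The citing excerpt above notes that AdaBoost assigns every training instance an initial weight. A minimal sketch of one boosting round follows, using the textbook discrete AdaBoost update with labels in {-1, +1}. It is an assumption that the cited article uses this exact variant; the function name and the toy labels are hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One discrete AdaBoost round: return (alpha, re-normalised weights).

    Assumes labels in {-1, +1} and a weighted error strictly between 0 and 1.
    """
    # Weighted error: total weight of the misclassified instances.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Vote strength of this weak learner.
    alpha = 0.5 * math.log((1 - err) / err)
    # Shrink weights of correct instances, grow weights of mistakes.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


y_true = [1, 1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1]               # the weak learner misses instance 1
weights = [1 / len(y_true)] * len(y_true)  # uniform initial weights

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(round(alpha, 3))
print([round(w, 3) for w in new_weights])
```

After the round, the single misclassified instance holds half of the total weight, so the next weak learner is pushed to classify it correctly. This is the mechanism that lets boosting focus on hard cases such as rare fraud samples.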
... As part of this study, we considered the problem of imbalanced datasets, where traditional binary or multi-class classification results in a bias toward the classes with a high number of instances. It has been shown that biased distributions can lead to problems in various domains including credit financial fraud detection [5], software defect prediction, network intrusion detection [6], pattern recognition [7], and medical diagnosis [8]. In the aforementioned domains, it is important to correctly classify minority instances otherwise the imbalance class distribution can present a number of challenges, including a biased and poor generalization, which can negatively impact the effectiveness and performance of machine learning algorithms, particularly in real-world settings. ...
Article
Full-text available
Deep learning has played an important role in many real-life applications, especially in image classification. It is often found that some domain data are highly skewed, i.e., most of the data belongs to a handful of majority classes, and the minority classes only contain small amounts of information. It is important to acknowledge that skewed class distribution poses a significant challenge to machine learning algorithms. Consequently, most machine and deep learning algorithms are ineffective or may fail when the data distribution is highly imbalanced. In this study, a comprehensive analysis of imbalanced datasets is performed using well-known deep learning based models. In particular, the best feature extractor model is identified and the current trend in the latest feature extraction models is investigated. Moreover, to survey the global scientific research on image classification of an imbalanced mushroom dataset, a bibliometric analysis is conducted covering 1991 to 2022. In summary, our findings may offer researchers a quick benchmarking reference and an alternative approach to assessing trends in imbalanced data distributions in image classification research.
... To efficiently address the issue of fraudulent card usage, [32] propose a methodology for its mitigation. The incidents of credit card scams have had a significant impact on the established economic structure of the market, leading to a disruption in its functioning. ...
Article
Full-text available
The increasing dependence on data analytics and artificial intelligence (AI) methodologies across various domains has prompted the emergence of apprehensions over data security and integrity. There exists a consensus among scholars and experts that the identification and mitigation of Multi-step attacks pose significant challenges due to the intricate nature of the diverse approaches utilized. This study aims to address the issue of imbalanced datasets within the domain of Multi-step attack detection. To achieve this objective, the research explores three distinct re-sampling strategies, namely over-sampling, under-sampling, and hybrid re-sampling techniques. The study offers a comprehensive assessment of several re-sampling techniques utilized in the detection of Multi-step attacks on deep learning (DL) models. The efficacy of the solution is evaluated using a Multi-step cyber attack dataset that emulates attacks across six attack classes. Furthermore, the performance of several re-sampling approaches with numerous traditional machine learning (ML) and deep learning (DL) models are compared, based on performance metrics such as accuracy, precision, recall, F-1 score, and G-mean. In contrast to preliminary studies, the research focuses on Multi-step attack detection. The results indicate that the combination of Convolutional Neural Networks (CNN) with Deep Belief Networks (DBN), Long Short-Term Memory (LSTM), and Recurrent Neural Networks (RNN) provides optimal results as compared to standalone ML/DL models. Moreover, the results also depict that SMOTEENN, a hybrid re-sampling technique, demonstrates superior effectiveness in enhancing detection performance across various models and evaluation metrics. The findings indicate the significance of appropriate re-sampling techniques to improve the efficacy of Multi-step attack detection on DL models.
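The performance metrics named in this abstract (accuracy, precision, recall, F-1 score, and G-mean) can be computed from a confusion matrix as sketched below. The sketch covers the binary case, where G-mean is the square root of recall times specificity; the study itself evaluates six attack classes, so this is a simplification. The function name and the counts are hypothetical.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "g_mean": g_mean}


# Hypothetical case: 1000 samples with 10 positives, and a classifier
# that predicts the majority (negative) class for every sample.
m = imbalance_metrics(tp=0, fn=10, fp=0, tn=990)
print(m)
```

In this hypothetical case accuracy is 0.99 while recall, F-1, and G-mean are all 0, which is why accuracy alone is a poor guide on skewed data and why re-sampling studies report the other metrics alongside it.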