The effectiveness of the Random Forest algorithm in monitoring abnormal withdrawals to detect credit card fraud
Keywords: Credit Card; Fraud Detection; Transactions; Random Forest; Logistic Regression; Gradient Boosted; Support Vector Machine; Fuzzy Rule; Multilayer Perceptron; Decision Tree.
The reach of the Internet and the broad range of services it enables, such as e-commerce and online shopping, have attracted a great deal of attention. On the other hand, customers suffer the negative consequences of fraudulent activity. Credit card fraud refers to the physical loss of a credit card or the loss of sensitive credit card information. A credit card fraud detection system is a method for identifying the fraudulent transactions that occur from time to time among legitimate ones. Classification techniques are the most commonly used approach to this kind of predictive analysis, and predicting credit card fraud is therefore the main objective of this work. We apply recent data mining techniques, commonly called machine learning techniques, which allow card owners and service providers to determine whether a purchase is fraudulent or legitimate. Our aim is to identify fraudulent transactions while reducing incorrect classifications of fraud. The project is built mainly around four major algorithms and uses anomaly detection to classify fraudulent transactions; in total, we propose seven classifiers for detecting fraudulent credit card transactions. These were chosen by evaluating different methods: Random Forest, Logistic Regression, Gradient Boosted, Multilayer Perceptron, Support Vector Machine, Decision Tree and Fuzzy Rule. We worked with the European credit card fraud dataset, which is divided into training and test sets. Normal and fraudulent transactions are predicted on the basis of these sets, and the performance of each algorithm is measured by precision, recall and F-measure. Comparing the proposed algorithms under two feature selection methods, we find that Random Forest is the best algorithm and the most effective in terms of F-measure, at 86%.
Genetic Algorithms proved to be the best technique for selecting features from the dataset, and the Random Forest algorithm outperformed the other proposed algorithms. Among all algorithms, Random Forest obtained average recall scores of 86.15% with Genetic Algorithms and 84.87% with Feature Elimination as the feature selection method. This indicates that Random Forest correctly detects more than 86% of suspicious credit card transactions, with a low false-negative percentage, under Genetic Algorithms, and more than 84% under Feature Elimination.
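The three evaluation measures named above can be illustrated with a minimal sketch. It assumes the standard definitions of precision, recall and F-measure for the fraud (positive) class. The class name `ConfusionCounts`, the helper functions and the counts are hypothetical and are not taken from the paper's experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for the fraud (positive) class."""
    tp: int  # fraudulent transactions correctly flagged
    fp: int  # legitimate transactions wrongly flagged as fraud
    fn: int  # fraudulent transactions that were missed
    tn: int  # legitimate transactions correctly passed


def precision(c: ConfusionCounts) -> float:
    """Share of flagged transactions that are really fraudulent."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of fraudulent transactions that were detected."""
    frauds = c.tp + c.fn
    return c.tp / frauds if frauds else 0.0


def f_measure(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def false_negative_rate(c: ConfusionCounts) -> float:
    """Share of fraudulent transactions that were missed (1 - recall)."""
    frauds = c.tp + c.fn
    return c.fn / frauds if frauds else 0.0


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
counts = ConfusionCounts(tp=80, fp=20, fn=10, tn=9890)

print(f"precision:           {precision(counts):.2%}")            # 80.00%
print(f"recall:              {recall(counts):.2%}")               # 88.89%
print(f"F-measure:           {f_measure(counts):.2%}")            # 84.21%
print(f"false-negative rate: {false_negative_rate(counts):.2%}")  # 11.11%
```

Because the false-negative rate is the complement of recall, the reported recall of 86.15% corresponds to 13.85% of fraudulent transactions being missed.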