Authors: Bayu Nur Pambudi, Silmi Fauziati, Indriana Hidayah Pages: 1 - 9 Abstract: The main challenge for data mining approaches to fraud detection in financial transaction data is class imbalance in the available datasets, where the fraud class is much smaller than the non-fraud class. This imbalance lowers the f1-score because precision and recall become unbalanced: the model predicts one class well but not the other. The long training time and high computational resource requirements of data mining are a further concern, so handling imbalanced data alone is insufficient to produce the expected performance. Reducing data dimensionality can speed up processing, but it tends to reduce the classifier's performance. This study therefore aims to improve the performance of a data mining approach based on the Support Vector Machine (SVM) classifier for detecting fraudulent financial transactions. SVM performance was refined by tuning the kernel and hyperparameters, integrated with Random Under Sampling (RUS) and our Minimum error-based Principal Component Analysis (MebPCA). RUS was used to handle the imbalanced data, while MebPCA modifies the dimensionality reduction technique based on classification error to shorten computation time without degrading SVM performance. This combination improves the classifier's fraud detection performance, with a precision improvement of 29.31% and an f1-score improvement of 19.8%, and reduces training time by 36.39% compared to previous research on SVM-based fraud detection. PubDate: 2022-06-27 DOI: 10.15294/jte.v14i1.35787 Issue No: Vol. 14, No. 1 (2022)
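The Random Under Sampling step named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_under_sample`, the toy data, and the choice to shrink every class to the size of the smallest one are assumptions made for illustration only. The MebPCA and SVM tuning steps are not shown.

```python
import random
from collections import defaultdict


def random_under_sample(rows, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is reduced to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        # Sample without replacement; the minority class is kept whole.
        for row in rng.sample(group, target):
            balanced.append((row, label))

    # Shuffle so the classes are not stored in contiguous runs.
    rng.shuffle(balanced)
    out_rows = [row for row, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_rows, out_labels


# Toy data (assumed): 95 non-fraud samples (label 0), 5 fraud samples (label 1).
rows = [[float(i)] for i in range(100)]
labels = [0] * 95 + [1] * 5

X_bal, y_bal = random_under_sample(rows, labels)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

Undersampling discards most of the majority class, which shrinks the training set and so can also shorten training, at the cost of throwing away non-fraud examples.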
Authors: I Kadek Dendy Senapartha, Gabriel Indra Widi Tamtama Pages: 10 - 17 Abstract: Face anti-spoofing systems are needed in facial recognition systems to ward off presentation attacks, in which fake faces are presented in front of the camera or image capture sensor. Building such a system requires a data set for training a classification model that distinguishes whether the face in an input image is genuine. In the past decade, face anti-spoofing research has produced many public data sets, but researchers often need time to build a data set or to identify the right public one for building face anti-spoofing models. This article surveys public data sets using a systematic literature review to identify the types of attacks on face anti-spoofing systems and the development process, evolution, and availability of face anti-spoofing data sets. Search and selection based on the specified criteria yielded 42 primary research manuscripts from the period 2010 to 2021. The review found three trends in the development of face anti-spoofing data sets: 1) very large data sets, 2) data sets with different types of facial samples, and 3) data sets constructed with various devices and sensors. These public data sets can be accessed freely but are subject to special rules, such as agreeing to an end user license agreement from the researcher or institution that owns the data set. However, some data sets cannot be accessed because of invalid URLs or special rules imposed by the cloud storage service provider where they are stored. PubDate: 2022-06-27 DOI: 10.15294/jte.v14i1.36108 Issue No: Vol. 14, No. 1 (2022)