Enhancing blockchain transaction classification with ensemble learning approaches

Amrutanshu Panigrahi; Abhilash Pati; Bibhuprasad Sahu; Rourab Paul; Ajit Kumar Nayak; Subrata Chowdhury; Ramya Govindaraj; J Shreyas

doi:10.1038/s41598-025-04072-7

Sci Rep. 2025 Jul 1;15(1):22068. doi: 10.1038/s41598-025-04072-7.

Authors

Amrutanshu Panigrahi¹, Abhilash Pati¹, Bibhuprasad Sahu², Rourab Paul³, Ajit Kumar Nayak⁴, Subrata Chowdhury⁵, Ramya Govindaraj⁶, J Shreyas⁷

Affiliations

¹ Department of CSE, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India.
² Department of Information Technology, Vardhaman College of Engineering (Autonomous), Hyderabad, Telangana, India.
³ Department of Computer Science and Engineering, Shiv Nadar University, Chennai, Tamil Nadu, India.
⁴ Department of CS&IT, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India.
⁵ Department of Computer Science and Engineering, Sri Venkateswara College of Engineering and Technology (Autonomous), Chittoor, AP, India.
⁶ School of Computer Science Engineering Systems, Vellore Institute of Technology, Vellore, India.
⁷ Department of Information Technology, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal, India. Shreyas.j@manipal.edu.

Abstract

Since blockchain emerged with Bitcoin, it has developed rapidly and attracted researchers in academia and industry. Blockchain technology is increasingly used as a secure and effective way to share information in industries including finance, supply chain management (SCM), and the Internet of Things (IoT). As the number of blockchain users grows, malicious transactions must be distinguished from non-malicious ones to maintain trust in the blockchain. This research aims to develop a machine learning (ML) based model that classifies blockchain transactions as risky or non-risky. The model uses four feature selection approaches: Correlation-based Feature Selection (CFS), Recursive Feature Elimination (RFE), Random Forest (RF), and Information Gain (IG). Two ensemble feature selection methods, rank averaging and rank aggregation, then combine the features selected by these four methods. Various ML classification algorithms serve as base learners, making initial predictions from the features chosen by each of the two ensemble feature selection methods. Finally, three ensemble classifiers (hard voting, soft voting, and weighted averaging) combine these initial predictions into the final prediction. The proposed ensemble-based model is evaluated on three blockchain transaction datasets. The empirical analysis shows a maximum accuracy of 99.24% with the rank averaging ensemble feature selection technique and 98.73% with the rank aggregation technique.
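The abstract does not spell out how the rankings or the votes are combined, so the following is only a minimal Python sketch of the general ideas, not the authors' implementation. It assumes that each feature selector yields one relevance score per feature (higher is better), that rank averaging keeps the features with the lowest mean rank, and that weighted averaging is a weighted mean of class probabilities. The feature names, scores, probabilities, and weights are all hypothetical.

```python
from collections import Counter

def scores_to_ranks(scores):
    """Turn {feature: score} into {feature: rank}, where rank 1 is the highest score."""
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {f: i + 1 for i, f in enumerate(ordered)}

def rank_average(score_tables, top_k):
    """Average each feature's rank across selectors and keep the top_k best mean ranks."""
    rank_tables = [scores_to_ranks(t) for t in score_tables]
    features = list(rank_tables[0])
    mean_rank = {f: sum(r[f] for r in rank_tables) / len(rank_tables) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))[:top_k]

def hard_vote(labels):
    """Majority label among the base learners (ties go to the smallest label)."""
    counts = Counter(labels)
    best = max(counts.values())
    return min(label for label, c in counts.items() if c == best)

def soft_vote(probas, weights=None):
    """Average per-class probabilities and return the index of the largest one.

    With weights=None every learner counts equally (soft voting);
    with weights given this becomes weighted averaging.
    """
    if weights is None:
        weights = [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    avg = [sum(w * p[c] for w, p in zip(weights, probas)) / total
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])

# Hypothetical scores from four selectors (stand-ins for CFS, RFE, RF, IG).
cfs = {"amount": 0.9, "fee": 0.4, "n_inputs": 0.7, "age": 0.1}
rfe = {"amount": 0.8, "fee": 0.6, "n_inputs": 0.5, "age": 0.2}
rf = {"amount": 0.5, "fee": 0.3, "n_inputs": 0.6, "age": 0.1}
ig = {"amount": 0.7, "fee": 0.2, "n_inputs": 0.6, "age": 0.3}
selected = rank_average([cfs, rfe, rf, ig], top_k=2)

# Hypothetical outputs of three base learners for one transaction
# (class 0 = non-risky, class 1 = risky).
probas = [[0.6, 0.4], [0.4, 0.6], [0.45, 0.55]]
hard = hard_vote([1, 0, 1])
soft = soft_vote(probas)
weighted = soft_vote(probas, weights=[3, 1, 1])
```

In this toy case, hard and soft voting both label the transaction risky, while giving the first learner three times the weight flips the weighted average to non-risky, which illustrates why the choice of combination rule can change the final prediction.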

Keywords: Blockchain; Ensemble feature selection; Machine learning; Rank aggregation; Rank averaging.