CLASSIFICATION OF PHISHING URL ATTACKS USING RANDOM FOREST ALGORITHM BASED ON FEATURE IMPORTANCE

MELYANA HASIBUAN (2026) CLASSIFICATION OF PHISHING URL ATTACKS USING RANDOM FOREST ALGORITHM BASED ON FEATURE IMPORTANCE. Bit-Tech (Binary Digital - Technology), 8 (2). pp. 2984-2993. Print ISSN 2622-271X, Online ISSN 2622-2728

Text (ARTICLE)
Repositori melyana - MELYANA HASIBUAN Teknik Informatika.pdf - Published Version

Text (PUBLICATION STATEMENT LETTER)
Surat publikasi sendiri - MELYANA HASIBUAN Teknik Informatika.pdf - Published Version


Abstract

The development of information technology and the growth of digital activity have made URL-based phishing threats more complex and harder to detect. Phishing attacks target organizations as well as individuals, so detection systems must be accurate, efficient, and capable of handling high-dimensional data. Machine learning approaches, particularly Random Forest, have been widely applied to phishing detection; however, the role of feature selection in improving efficiency without reducing performance needs further evaluation. This study evaluates the performance of the Random Forest algorithm for phishing URL detection and analyzes the impact of feature selection based on feature importance. The research adopts the Knowledge Discovery in Databases (KDD) framework, comprising data selection, preprocessing, feature selection, modeling, and evaluation stages. The PhiUSIIL-2024 dataset is used in two modeling scenarios: Random Forest using all features (RF Full) and Random Forest using the top 30 features ranked by feature importance (RF Top-30). Model performance is evaluated using accuracy, precision, recall, and F1-score under different data split ratios. The experimental results show that both models achieve very high and stable classification performance, with evaluation metrics close to or reaching 100%. The RF Top-30 model maintains performance comparable to the RF Full model despite using fewer features. The study concludes that feature selection based on feature importance effectively simplifies the Random Forest model without sacrificing performance, making it suitable for efficient phishing URL detection systems.
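
The two steps the abstract names, ranking features by importance to keep the top 30 and scoring predictions with accuracy, precision, recall, and F1-score, can be illustrated with a minimal sketch. It is not the authors' code: the function names `top_k_features` and `binary_metrics`, the toy feature names, and the choice of label 1 as the positive class are all assumptions made for illustration, and the importance scores would in practice come from a trained Random Forest rather than a hand-written dictionary.

```python
from typing import Dict, List, Sequence


def top_k_features(importances: Dict[str, float], k: int = 30) -> List[str]:
    """Return the k feature names with the highest importance scores.

    `importances` is a hypothetical mapping of feature name -> score,
    e.g. the per-feature importances reported by a trained Random Forest.
    """
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels in {0, 1}.

    Label 1 is treated as the positive class (an assumption of this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy importance scores; the feature names are placeholders.
    toy_importances = {"feat_a": 0.1, "feat_b": 0.5, "feat_c": 0.3}
    print(top_k_features(toy_importances, k=2))

    # Toy labels, unrelated to the dataset used in the study.
    print(binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]))
```

Under this sketch, an RF Top-30 style comparison would retrain the model on only the columns returned by `top_k_features` and compare the resulting metrics with those of the model trained on all features.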

Item Type: Article
Contributors:
Thesis advisor: RAHMAD ABDILLAH (NIDN/NIDK: 2030088701, Email: rahmad.abdillah@uin-suska.ac.id)
Subjects: 000 General Works
Divisions: Faculty of Science and Technology > Informatics Engineering
Depositing User: Mr Eko Syahputra
Date Deposited: 14 Jan 2026 00:52
Last Modified: 14 Jan 2026 00:52
URI: http://repository.uin-suska.ac.id/id/eprint/92202
