A Comparative Analysis on Stacked Hybrid Intelligence: A Multi-Paradigm Machine Learning Framework for Robust Phishing URL Detection

Magdah Othman Mohammed Osman; Llahm Omar Faraj Ben Dalla; Tarik A. Rashid; Magda Juma Shuayb Albaraesi; Mohamed Ali Mohamed EL-sseid; Abdussalam Ali Ahmed; Yasser Fathi Nassar; Abdulgader Alsharif

doi:10.65405/gcfgsj10

Authors

Magdah Othman Mohammed Osman, Systems Analysis and Programming Department, Higher Institute of Science and Technology, Ajdabiya, Libya (Author)
Llahm Omar Faraj Ben Dalla, Computer Engineering, College of Technical Science, Sebha, Libya (Author)
Tarik A. Rashid, Artificial Intelligence and Innovation Centre, University of Kurdistan Hewler, Erbil, Iraq (Author)
Magda Juma Shuayb Albaraesi, Business Administration Department, Kambut Higher Institute for Administrative and Financial Sciences, Tobruk, Libya (Author)
Mohamed Ali Mohamed EL-sseid, Department of Software Engineering, Ankara Bilim University, Türkiye (Author)
Abdussalam Ali Ahmed, Mechanical and Industrial Engineering Department, Bani Waleed University, Libya (Author)
Yasser Fathi Nassar, Mechanical and Renewable Energy Engineering Dept., Faculty of Engineering, Wadi Alshatti University, Brack, Libya (Author)
Abdulgader Alsharif, Department of Electric and Electronic Engineering, College of Technical Sciences, Sebha, Libya (Author)

DOI:

https://doi.org/10.65405/gcfgsj10

Keywords:

Phishing URL detection; Hybrid ensemble learning; PhiUSIIL dataset; Character-level CNN; Bi-LSTM; Stacked generalization; Cybersecurity

Abstract

The growing sophistication of phishing campaigns demands detection frameworks that go beyond traditional single-model classifiers. This study presents a comprehensive evaluation of six distinct machine learning architectures, spanning traditional ensemble methods, kernel-based classifiers, and deep neural network architectures, applied to the PhiUSIIL dataset of 235,795 URLs, which is rich in lexical and host-based features. The work implements Random Forest (RF), XGBoost, character-level convolutional neural networks (CNN), bidirectional long short-term memory networks (Bi-LSTM), and support vector machines (SVM) with a radial basis function kernel, together with a novel stacked hybrid ensemble model that combines RF, XGBoost, and SVM through meta-learning. Rigorous five-fold cross-validation shows that the proposed hybrid model achieves superior performance, with 98.73% accuracy, a 98.91% F1 score, and a 99.04% area under the ROC curve, outperforming the individual models by 1.8-3.4% on critical metrics while remaining robust to class imbalance (a legitimate-to-phishing ratio of 1.34:1). Feature analysis reveals that structural URL features, such as the character continuation rate and the URL character probability, contribute disproportionately to detection effectiveness compared with host reputation indicators. Computational analysis further shows that gradient boosting methods offer the best balance between accuracy and latency for real-time deployment, whereas deep architectures excel at capturing complex sequential patterns but incur 4.7 times higher inference latency. This work gives practitioners empirically validated guidance for model selection under diverse operational constraints and sets a new benchmark for hybrid intelligence in cyber threat detection.
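The abstract singles out structural URL features such as the character continuation rate as strong predictors. As a rough illustration of what such a lexical feature can look like, the sketch below computes one plausible version of it: the lengths of the longest run of letters, the longest run of digits, and the longest run of other characters are summed and divided by the URL length. This definition is an assumption made for illustration, and the function names `char_class` and `char_continuation_rate` are hypothetical; the dataset's own feature may be computed differently.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Bucket a character as a letter, a digit, or 'special' (anything else)."""
    if ch.isalpha():
        return "alpha"
    if ch.isdigit():
        return "digit"
    return "special"


def char_continuation_rate(url: str) -> float:
    """Assumed definition: (longest letter run + longest digit run
    + longest special-character run) / total URL length."""
    if not url:
        return 0.0
    longest = {"alpha": 0, "digit": 0, "special": 0}
    # groupby yields maximal runs of consecutive characters of the same class.
    for cls, run in groupby(url, key=char_class):
        run_len = sum(1 for _ in run)
        longest[cls] = max(longest[cls], run_len)
    return sum(longest.values()) / len(url)


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the dataset.
    for example in ("example.com", "a1-b2-c3-d4.example"):
        print(example, round(char_continuation_rate(example), 3))
```

Under this reading, a URL made of a few long, uniform runs scores close to 1, while a URL that keeps switching between letters, digits, and symbols scores lower, which is the kind of structural signal a tree-based model can split on.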
