Phishing URL Detection Using Deep Learning: A CNN-Based Approach
Abstract
Phishing attacks remain a dangerous cybersecurity threat: attackers disguise malicious sites behind deceptive URLs and trick users into unknowingly exposing sensitive data. This research developed a Convolutional Neural Network (CNN) model that classifies URLs as either phishing or non-phishing. The model uses fewer than five CNN layers and was built from a source dataset of 548,098 web pages (nearly 70% phishing and 30% legitimate). Each input is tokenized and embedded into a matrix; convolutional layers then extract features, and multiple fully connected layers perform the classification. The model achieved 98% accuracy in testing. It generalizes well and remains effective despite class imbalance and the risk of overfitting, two common but difficult challenges. Dropout regularization and validation splits were used to reach this performance. Future work may explore attention mechanisms or transfer learning from pre-trained models. The model provides an accessible and scalable solution for detecting phishing URLs that can strengthen cybersecurity defenses.
Keywords: Phishing URL Detection, Machine Learning, Deep Learning, Convolutional Neural Network, CNN, AI
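The pipeline summarized in the abstract (tokenization, embedding into a matrix, convolutional feature extraction, fully connected classification) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: character-level tokenization, a single convolutional layer, global max pooling, and every size and name below (`MAX_LEN`, `EMBED_DIM`, `NUM_FILTERS`, `KERNEL_SIZE`, `tokenize`, `init_params`, `forward`) are assumptions made for illustration. The weights are random and untrained, so the output shows only how data flows through the layers, not a meaningful prediction. Dropout is omitted because it applies during training only.

```python
import math
import random

# All sizes below are illustrative assumptions, not values from the paper.
MAX_LEN = 64      # URLs are truncated or zero-padded to this many tokens
EMBED_DIM = 8     # width of each token's embedding vector
NUM_FILTERS = 4   # number of convolutional filters
KERNEL_SIZE = 3   # number of consecutive tokens covered by each filter
VOCAB = {ch: i + 1 for i, ch in enumerate(
    "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%")}
UNK = len(VOCAB) + 1  # id for characters outside the vocabulary; 0 is padding


def tokenize(url):
    """Map a URL to a fixed-length list of integer character ids."""
    ids = [VOCAB.get(ch, UNK) for ch in url.lower()[:MAX_LEN]]
    return ids + [0] * (MAX_LEN - len(ids))


def init_params(seed=0):
    """Create random, untrained weights; a real model would learn these."""
    rng = random.Random(seed)

    def rand(*shape):
        if len(shape) == 1:
            return [rng.uniform(-0.5, 0.5) for _ in range(shape[0])]
        return [rand(*shape[1:]) for _ in range(shape[0])]

    return {
        "embed": rand(UNK + 1, EMBED_DIM),
        "conv": rand(NUM_FILTERS, KERNEL_SIZE, EMBED_DIM),
        "dense": rand(NUM_FILTERS),
        "bias": 0.0,
    }


def forward(url, params):
    """Return an (untrained) phishing probability for one URL."""
    # Embedding: token ids -> MAX_LEN x EMBED_DIM matrix.
    matrix = [params["embed"][i] for i in tokenize(url)]
    # 1D convolution + ReLU, followed by global max pooling per filter.
    pooled = []
    for filt in params["conv"]:
        best = 0.0  # starting the maximum at 0 applies the ReLU
        for start in range(MAX_LEN - KERNEL_SIZE + 1):
            act = sum(filt[k][d] * matrix[start + k][d]
                      for k in range(KERNEL_SIZE) for d in range(EMBED_DIM))
            best = max(best, act)
        pooled.append(best)
    # Fully connected layer with a sigmoid output.
    logit = sum(w * x for w, x in zip(params["dense"], pooled)) + params["bias"]
    return 1.0 / (1.0 + math.exp(-logit))
```

Calling `forward("http://example.com/login", init_params())` returns a value strictly between 0 and 1. In a trained model that value would be thresholded (for example at 0.5) to label the URL as phishing or non-phishing.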







