Abstract:
Phishing, a well-known cyber-attack practice, has gained significant research attention in the
cyber-security domain over the last two decades due to its dynamic attacking strategies. Although various
solutions have been deployed against phishing, phishing attacks have increased dramatically in the past
few years. Recent studies have shown that machine learning has become prominent in the present anti-phishing
context, and techniques such as deep learning have substantially improved the detection ability of
anti-phishing tools. This paper proposes PhishDet, a new way of detecting phishing websites through a Long-term
Recurrent Convolutional Network and a Graph Convolutional Network using URL and HTML features.
PhishDet is the first of its kind to use the powerful analysis and processing capabilities of Graph Neural
Networks in the anti-phishing domain, and it recorded 96.42% detection accuracy with a 0.036 false-negative
rate. It is effective against zero-day attacks, and its average detection time of 1.8 seconds could also be
considered realistic. The feature selection of PhishDet is automatic and occurs inside the system, as PhishDet
gradually learns URL and HTML content features to handle constantly changing phishing attacks. It
outperformed similar solutions by achieving a 99.53% F1-score on a public benchmark dataset. However,
PhishDet requires periodic retraining to maintain its performance over time. If such retraining could be
facilitated, PhishDet could fight against phishers for a longer period to safeguard Internet users
from this threat.
Citation:
Ariyadasa, S., Fernando, S., & Fernando, S. (2022). Combining long-term recurrent convolutional and graph convolutional networks to detect phishing sites using URL and HTML. IEEE Access, 10, 82355–82375. https://doi.org/10.1109/ACCESS.2022.3196018