Artificial Intelligence-Driven Supervised Classification Algorithm for Website Vulnerability Detection Using MITRE NVD CVE Scores

Posted
Server: Preprints.org
DOI: 10.20944/preprints202602.0475.v1

As cyber threats continue to evolve, traditional security measures often fail to detect emerging vulnerabilities in real time, particularly for small and medium-sized enterprises with limited resources. This study develops an AI-driven supervised classification algorithm for website vulnerability detection that integrates insights from the National Vulnerability Database (NVD) and Common Vulnerability Scoring System (CVSS) scores. A dataset of 40,000 vulnerability entries was curated using reconnaissance tools including Nmap and Nessus, with HTML code snippets labeled according to severity level. The methodology used CodeBERT transformer models to convert raw HTML into numerical embeddings, followed by Random Forest classification trained on AWS SageMaker. A Chrome browser extension was developed to extract live webpage content and communicate with a Flask-based API hosted on Amazon EC2 for real-time inference. Following optimization through TF-IDF vectorization and hyperparameter tuning, the model achieved 66.3% accuracy, with ROC-AUC values ranging from 0.60 to 0.70 across severity classes. The system classifies websites into Low-, Medium-, or High-risk categories in real time. The authors conclude that supervised machine learning offers a practical, cost-effective, and auditable alternative to computationally intensive deep learning approaches, providing accessible vulnerability detection while maintaining compliance with emerging AI governance frameworks such as ISO 42001 and the NIST AI Risk Management Framework.
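
The abstract says snippets were labeled by severity level and that the system outputs Low, Medium, or High, but it does not state the score thresholds used. As a minimal sketch of what such a labeling step could look like, the function below maps a CVSS v3.x base score to a label using the standard CVSS v3.x qualitative bands (Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0). The function name `cvss_to_risk` is hypothetical, and folding Critical into High to obtain three classes is an assumption, not something the preprint confirms.

```python
from typing import Optional


def cvss_to_risk(score: float) -> Optional[str]:
    """Map a CVSS v3.x base score to a coarse three-class risk label.

    Hypothetical helper: the thresholds follow the standard CVSS v3.x
    qualitative severity bands, not thresholds stated in the preprint.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score must be in [0.0, 10.0], got {score}")
    if score == 0.0:
        # CVSS v3.x rates 0.0 as "None", so there is no risk label to assign.
        return None
    if score < 4.0:
        return "Low"       # 0.1 - 3.9
    if score < 7.0:
        return "Medium"    # 4.0 - 6.9
    # Assumption: Critical (9.0 - 10.0) is folded into High to get three classes.
    return "High"          # 7.0 - 10.0


if __name__ == "__main__":
    for s in (2.5, 5.3, 7.5, 9.8):
        print(s, cvss_to_risk(s))
```

Whether Critical entries are merged into High or dropped changes the class balance, which in turn affects the per-class ROC-AUC figures, so a reviewer may want the preprint to state the mapping explicitly.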
