Generative AI for cybersecurity: crafting network intrusion datasets

Rathakrishnan, M; Gayan, S

Files

Paper 22 - ADScAI 2025.pdf (116.82 KB)

Date

2025

Authors

Rathakrishnan, M

Gayan, S

Publisher

Department of Computer Science and Engineering

Abstract

Rapid advances in networking technology and the increasing complexity of cyberattacks have created a need for Artificial Intelligence (AI)-powered Intrusion Detection Systems (IDSs). However, the performance of AI-based IDSs is limited by the scarcity of labeled intrusion data, class imbalance, and restrictions on sharing intrusion data under the General Data Protection Regulation (GDPR). In addition, simulating attacks to collect intrusion data is expensive and risky. Recently, Generative Adversarial Networks (GANs) have emerged as a promising solution for synthesizing data. However, most current GAN architectures are designed for data with Gaussian-like distributions, such as images, and are difficult to adapt to tabular data, which is characterized by non-Gaussian, mixed-type, and multi-modal features. To address these limitations, we propose a multi-model framework based on Conditional Tabular GAN (CTGAN) [1] that can be used to synthesize network intrusion data. The framework incorporates rigorous testing and validation to ensure the accuracy and real-world applicability of the synthesized intrusion data.
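
As a rough illustration of how a conditional tabular generator can counter class imbalance, the sketch below shows log-frequency sampling of the conditioning category, an idea described in the original CTGAN work. It is not taken from this paper's framework. The label names, the counts, and the helper functions `log_frequency_probs` and `sample_conditions` are hypothetical, and the `log(count + 1)` weighting is an assumption chosen so that a category seen only once still receives non-zero weight.

```python
import math
import random
from collections import Counter


def log_frequency_probs(labels):
    """Map each category to a sampling probability proportional to log(count + 1)."""
    counts = Counter(labels)
    # Assumption: log(count + 1) rather than log(count), so singletons keep weight > 0.
    weights = {cat: math.log(n + 1) for cat, n in counts.items()}
    total = sum(weights.values())
    return {cat: w / total for cat, w in weights.items()}


def sample_conditions(labels, k, seed=0):
    """Draw k conditioning categories using the log-frequency probabilities."""
    rng = random.Random(seed)
    probs = log_frequency_probs(labels)
    cats = sorted(probs)
    return rng.choices(cats, weights=[probs[c] for c in cats], k=k)


# Hypothetical, heavily imbalanced label column (illustrative only, not from the paper).
labels = ["benign"] * 9900 + ["dos"] * 90 + ["probe"] * 10
probs = log_frequency_probs(labels)
conditions = sample_conditions(labels, k=5)
```

Under this weighting, the rare "probe" class is picked as a condition far more often than its raw share of the data (10 of 10,000 rows), so a generator trained this way sees minority attack classes more evenly.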