Generative AI for cybersecurity: crafting network intrusion datasets
Date
2025
Publisher
Department of Computer Science and Engineering
Abstract
Rapid advances in networking technology and the increasing complexity of cyberattacks have created the need for Artificial Intelligence (AI)-powered Intrusion Detection Systems (IDSs). However, the performance of AI-based IDSs is limited by the lack of labeled intrusion data, class imbalance, and restrictions on sharing intrusion data imposed by the General Data Protection Regulation (GDPR). In addition, simulating attacks and collecting intrusion data is expensive and risky. Recently, Generative Adversarial Networks (GANs) have emerged as a promising solution for synthesizing data. However, most current GAN architectures are designed with Gaussian-like data distributions in mind (such as images) and are difficult to adapt to tabular data characterized by non-Gaussian, mixed-type, and multi-modal features. To address these limitations, we propose a multi-model framework based on Conditional Tabular GAN (CTGAN) [1] that can be used to synthesize network intrusion data. The framework incorporates rigorous testing and validation to ensure the accuracy and real-world applicability of the synthesized intrusion data.
