Seoyeon Choi¹, Songhye Kim¹, Jihyeon Ryu¹,*
¹ Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 01897, Korea
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Time series, Anomaly detection, Deep learning, Computer network
1. Introduction
In today’s digital environment, the Internet has become indispensable for data sharing
and communication. However, cyber threats continue to grow in both sophistication
and diversity. Attackers frequently target digital assets to steal sensitive information,
underscoring the need for robust network security mechanisms capable of protecting
both data and core infrastructure. Network anomaly detection, which aims to identify
abnormal traffic patterns indicative of security breaches or malicious activity, is
therefore a critical component of modern cybersecurity strategies. Despite its importance,
anomaly detection faces several challenges. First, detection models often become outdated
as new types of attacks rapidly evolve. Second, the ever-increasing volume of network
traffic complicates the real-time analysis of large-scale datasets. Third, the overwhelming
dominance of normal traffic relative to attack traffic leads to severe class imbalance,
which significantly limits detection accuracy. Collectively, these challenges highlight
the limitations of traditional detection techniques and motivate the adoption of more
advanced machine learning (ML) and deep learning (DL) approaches [1]. Previous studies have developed intrusion detection systems using representative
benchmark datasets such as NSL-KDD, UNSW-NB15, and CICIDS2017, initially applying
methods such as SVM, kNN, logistic regression, and decision trees. However, these
datasets exhibit limitations, as their traffic characteristics and attack patterns
no longer sufficiently reflect the complexity of modern network environments due to
the period in which they were generated [2]. To address this gap, the present study additionally investigates recently collected
real-world time-series network datasets such as NF-UQ-NIDS-V2 and CESNET-TimeSeries24.
NF-UQ-NIDS-V2 reflects contemporary protocol configurations and incorporates diverse
attack scenarios, while CESNET-TimeSeries24 consists primarily of normal traffic and
enables the analysis of anomaly detection from a long-term temporal dependency perspective.
In early network anomaly detection research, traditional ML models such as SVM, kNN,
logistic regression, and decision trees were predominantly utilized. However, these
approaches demonstrated limited performance in high-dimensional network environments
and under severe class imbalance, resulting in high false positive rates and low recall.
To alleviate these challenges, DL-based methods such as CNN, LSTM, and GAN have been
introduced, achieving improvements in local pattern extraction, temporal dependency
modeling, and data sparsity reduction. Nevertheless, CNN-based models remain constrained
in capturing long-range sequential dependencies, LSTM-based models incur high computational
overhead when processing long sequences, and GAN-based methods often suffer from training
instability and mode collapse. These limitations have motivated the exploration of
more advanced architectures, such as Transformer-based models, contrastive representation
learning, and large-language-model (LLM)-driven multimodal anomaly detection. The
main contributions of this study are as follows:
- First, we provide a systematic review of existing anomaly detection research centered
on widely used benchmark datasets, including NSL-KDD, UNSW-NB15, and CICIDS2017.
- Second, instead of merely cataloging CNN-, LSTM-, and GAN-based models by architecture
and performance, we analyze the relationship between model characteristics and real-world
network attack patterns, clarifying the strengths and limitations inherent to each
approach.
- Third, we discuss emerging research trends, including Transformer-based sequence learning,
contrastive learning for robust representation under class imbalance, and LLM-based
multimodal learning, and we outline promising future directions for next-generation
network anomaly detection.
2. Related Work
Traditional intrusion detection systems relied mainly on signature-based or statistical
approaches, which required predefined rules and exhibited poor adaptability to emerging
and unknown attacks. To address these limitations, machine learning and deep learning
techniques have been introduced for network anomaly detection, particularly focusing
on temporal dependencies and multivariate correlations in network traffic.
Early research efforts explored autoencoder-based and recurrent structures for general
time series anomaly detection. Yu et al. [3] proposed a filter-augmented autoencoder with learnable normalization to improve the
robustness of multivariate anomaly detection. Yu et al. [4] introduced DTAAD, a dual TCN attention architecture that captures both local and
global temporal dependencies. Liu et al. [5] proposed an adversarial reconstruction framework for unsupervised anomaly detection,
demonstrating improved detection accuracy through GAN-based learning. Similarly, Munir
et al. [6] presented DeepAnT, a predictive CNN model that detects anomalies by forecasting normal
behavior and computing deviation errors. These approaches achieved notable improvements
in reconstructive and predictive accuracy; however, these methods were mainly evaluated
on general industrial and sensor time-series datasets, with limited evaluation on
network traffic corpora.
To improve interpretability and temporal modeling, Marino et al. [7] proposed the Network Transformer (NeT), a self-supervised and interpretable anomaly detection
model for industrial control system traffic. Leveraging the self-attention mechanism,
NeT simultaneously captures long-range temporal dependencies and inter-feature correlations
while enabling explainable decision-making. Xu et al. [8] further introduced TGAN-AD, a hybrid Transformer–GAN model, combining the sequence
modeling strength of Transformers with GAN-based data reconstruction. These studies
demonstrate that attention-based architectures can outperform conventional CNNs and
LSTMs in handling heterogeneous and unlabeled data environments.
Other works attempted to generalize deep learning frameworks for foundation-level
modeling and hybrid training. González et al. [9] proposed Foundation Autoencoders to establish a unified representation model for
multivariate time-series anomaly detection, while Golchin and Rekabdar [10] combined reinforcement learning, variational autoencoders, and active learning for
adaptive detection under dynamic data conditions. These approaches highlight the movement
toward self-adaptive and reinforcement-driven anomaly detection models.
In contrast, ML-based intrusion detection for industrial and network data was explored
by Anton et al. [11, 12]. Their studies used SVM and Random Forest classifiers to detect anomalies in Modbus
and OPC UA traffic within industrial control networks. The results demonstrated that
statistical feature-based learning could effectively identify abnormal behaviors but
struggled to capture temporal dependencies and contextual relationships between packets.
Finally, Tscharke et al. [13] introduced a quantum autoencoder model designed for multivariate time-series anomaly
detection, presenting a quantum-inspired approach aimed at improving scalability and
computational efficiency when processing large volumes of traffic data. The proposed
method showed promising results for high-dimensional enterprise telemetry similar
to SAP system environments, yet its effectiveness on standard network intrusion detection
datasets has not been fully examined.
Table 1 summarizes studies using these time-series datasets. The reviewed studies were categorized
according to datasets, domains, input forms, models (backbone architectures), preprocessing
methods, learning methods, extracted time-series features, and evaluation indicators.
Table 1. Categorization of time-series anomaly detection studies.
| Category | Reference title | Dataset | Domain | Model (backbone) | Preprocessing | Learning | Evaluation metrics |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Deep Learning | A filter-augmented auto-encoder with learnable normalization for robust multivariate time series anomaly detection [3] | SWaT, SMD, PSM, MSL, SMAP | ICS, Server, Satellite | NormFAAE (Auto-Encoder) | Learnable normalization; sliding window | Unsupervised | AUC, F-score, PA%K |
| | DTAAD: Dual TCN-attention networks for anomaly detection in multivariate time series data [4] | MSDS, SMD, WADI, SWaT, MSL, SMAP, MBA, UCR, NAB | ICS, Healthcare, General TS | Transformer (TCN+Attention) | Min–max normalization; noise injection; sliding window; POT threshold | Unsupervised | AUC, F1-score |
| | Time series anomaly detection with adversarial reconstruction networks [5] | MIT-BIH ECG, SWaT | Healthcare, ICS | BeatGAN (Auto-Encoder+GAN) | R-peak segmentation; length standardization; normalization | Unsupervised | AUC, F1-score |
| | DeepAnT: A Deep Learning Approach for Unsupervised Anomaly Detection in Time Series [6] | Yahoo, Ionosphere | Web traffic, Signal | CNN | Sliding-window transform; normalization | Unsupervised | F1-score, Precision, Recall |
| | Self-Supervised and Interpretable Anomaly Detection Using Network Transformers [7] | INL ICS PCAP | ICS | Transformer (NeT) | Packet parsing; rolling window; statistical feature extraction | Self-supervised | FPR, ADR |
| | TGAN-AD: Transformer-based GAN for anomaly detection of time series data [8] | Yahoo / general TS | General TS domains | Transformer + GAN | Contextual feature extraction; normalization | Unsupervised | Strong benchmark performance |
| | Towards Foundation Auto-Encoders for Time-Series Anomaly Detection [9] | Mobile ISP operational logs, KDD21 anomaly data | General TS | VAE + DCNN | Normalization pretrain | Unsupervised | Zero-shot anomaly detection |
| | Anomaly Detection in Time Series Data Using Reinforcement Learning, Variational Autoencoder, and Active Learning [10] | Yahoo, Data centers, sensor networks, finance | General TS | RLVAL | Normalization + sampling | Semi-supervised | Improved detection with few labels |
| Machine Learning | Anomaly-based Intrusion Detection in Industrial Data with SVM and Random Forests [11] | Modbus, OPC UA | ICS/OT | SVM, Random Forest | Feature selection (PCA); interpolation; missing values | Supervised | Accuracy, Precision, Recall, F1-score |
| Statistical / Probabilistic | Time is of the Essence: ML-based Intrusion Detection in Industrial Time Series Data [12] | Modbus/TCP ICS Simulation | ICS | MatrixProfile, SARIMA (+LSTM comparison) | One-hot encoding; preliminary feature selection | Unsupervised | Accuracy, F1-score |
| | Quantum Autoencoder for Multivariate Time Series Anomaly Detection [13] | Standard TS benchmarks | General TS | Quantum Autoencoder | Normalization | Unsupervised | Outperforms classical autoencoders |
Table 2. Standardized comparison of representative time-series anomaly detection models
across benchmark datasets. Best values for each dataset are highlighted in bold.
| Model (dataset) | AUC | F1 | Precision | Recall | Accuracy | PA%K | FPR | ADR | Observations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NormFAAE (Avg) [3] | 0.7059 | 0.5261 | – | – | – | 0.4927 | – | – | – |
| DTAAD (NAB) [4] | 0.9330 | 0.9057 | – | – | – | – | – | – | – |
| BeatGAN [5] | 0.9215$\pm$0.0003 | 0.7841$\pm$0.0009 | – | – | – | – | – | – | – |
| DeepAnT (Traffic) [6] | – | 0.87 | 0.50 | 0.004 | – | – | – | – | – |
| NeT-Transformer [7] | – | – | – | – | – | – | 0.0907 | 0.8471 | – |
| TGAN-AD (SWaT) [8] | 0.896 | 0.953 | 0.918 | 0.99 | – | – | – | – | – |
| VAE (KDD2021) [9] | – | – | – | – | – | – | – | – | Tracking, zero-shot generalization, latent structure |
| RLVAL (Yahoo) [10] | – | 0.921 | 0.894 | 0.950 | – | – | – | – | – |
| SVM (DS1) [11] | – | 0.852 | 0.782 | 0.936 | 0.925 | – | – | – | – |
| RF (DS1) [11] | – | – | – | – | 0.9984 | – | – | – | – |
| SARIMA+LSTM (Industrial TS) [12] | – | 0.90 | – | – | 0.985 | – | – | – | – |
| QAE [13] | – | – | – | – | 0.74 | – | – | – | – |
3. Background
3.1. Time Series
A time series is defined as a sequence of data points recorded at regular intervals.
Depending on the observation method, time series can be categorized as discrete or
continuous. Unlike ordinary data, time-series observations exhibit autocorrelation,
meaning past values are statistically related to present ones; thus, it is inappropriate
to assume independence between time points. Common examples include stock prices,
temperature records, and network traffic, where previous observations directly influence
subsequent fluctuations. To facilitate modeling, a time series is typically decomposed
into its intrinsic components. These include a trend, which captures long-term directional
changes; seasonality, which reflects recurring periodic patterns; and noise, which
represents irregular or random fluctuations. Autocorrelation primarily arises from
the trend and seasonal components, whereas noise is treated as the independent part.
By removing structural dependencies through decomposition, time-series modeling becomes
more robust and stable [14].
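The decomposition described above can be sketched in a few lines. This is a minimal illustration that assumes an additive model (series = trend + seasonality + noise) and a known, odd period; the function names and the toy series are hypothetical and are not taken from the cited work.

```python
from statistics import mean

def moving_average(values, window):
    """Centered moving average; positions without a full window are None."""
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        out[i] = mean(values[i - half:i + half + 1])
    return out

def decompose_additive(values, period):
    """Split a series into trend, seasonal, and residual parts (additive model).

    Assumes an odd `period` so that the centered window is symmetric.
    """
    trend = moving_average(values, period)
    # Remove the trend wherever it is defined.
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # Average the detrended values observed at each phase of the cycle.
    phase_means = []
    for phase in range(period):
        samples = [d for i, d in enumerate(detrended)
                   if d is not None and i % period == phase]
        phase_means.append(mean(samples))
    # Center the seasonal pattern so it averages to zero over one period.
    offset = mean(phase_means)
    seasonal = [phase_means[i % period] - offset for i in range(len(values))]
    residual = [v - t - s if t is not None else None
                for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual

# Toy series: a linear trend plus a repeating pattern of period 3, no noise.
pattern = [1.0, 0.0, -1.0]
series = [0.5 * i + pattern[i % 3] for i in range(12)]
trend, seasonal, residual = decompose_additive(series, period=3)
```

On this noise-free toy series the residual is zero wherever the trend is defined, so any large residual in real data would point to behavior that neither trend nor seasonality explains.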
3.2. Types of Time Series
3.2.1 Univariate time series (UTS)
A univariate time series (UTS) refers to a sequence of observations of a single variable
recorded over time. Examples include hourly temperature measurements, daily precipitation
levels in a given region, or fluctuations in stock closing prices. Formally, a UTS
can be defined as follows:
$$X = (x_1, x_2, \dots, x_t)$$
where $x_i$ represents the observation at time point $i \in T$, and $T = \{1, 2, \dots,
t\}$ represents the entire time interval. Because a UTS includes only a single variable,
its structure is relatively simple; however, the data may still contain trends, seasonality,
and noise. UTS analysis is applied in various domains, including forecasting, anomaly
detection, and signal processing. Traditional statistical methods such as autoregressive
integrated moving average and exponential smoothing have been widely used. More recently,
deep learning techniques such as LSTM and transformer-based models have enabled accurate
modeling of long-term dependencies and nonlinear patterns, even in single-variable
data. Therefore, UTS is considered the most basic and important subject in time-series
analysis [15, 16].
3.2.2 Multivariate Time Series (MTS)
A multivariate time series (MTS) consists of multiple variables recorded simultaneously
over time. Each variable is influenced not only by its own historical values (time
dependence) but also by its relationships with other variables, often referred to
as spatial or inter-variable dependence. For instance, recording air pressure, temperature,
and humidity every hour in a given region represents a typical MTS. Formally, an MTS
can be expressed as
$$X = (x^1, x^2, \dots, x^M)$$
where $M$ denotes the number of variables and $x^m = (x^m_1, x^m_2, \dots, x^m_t)$ is the vector
of observations of the $m$-th variable over $T$. Alternatively, an MTS can be expressed as
$$X = (x_1, x_2, \dots, x_t)$$
where each $x_i \in \mathbb{R}^M$ holds the $M$ features observed at time point $i$. In other words,
a multivariate time series can be understood as a sequence of vectors recorded along
the time axis. Compared to univariate time series, MTSs are more complex because they
simultaneously account for inter-variable correlations and temporal dependencies [17].
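The two views of a multivariate series (one series per variable, or one feature vector per time point) can be illustrated with plain Python containers. The sketch below is illustrative only: the variable names, the sample values, and the helper `to_time_major` are assumptions made for the example.

```python
# Univariate series: one observation per time point (e.g. hourly temperature).
uts = [21.5, 21.7, 22.0, 21.9]

# Multivariate series, variable-major view: one series of equal length per variable.
mts_by_variable = {
    "pressure":    [1012.0, 1011.5, 1011.0, 1010.8],
    "temperature": [21.5, 21.7, 22.0, 21.9],
    "humidity":    [0.40, 0.42, 0.45, 0.44],
}

def to_time_major(by_variable):
    """Convert M series of equal length into one M-feature vector per time point."""
    names = list(by_variable)
    lengths = {len(by_variable[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all variables must cover the same time points")
    steps = lengths.pop()
    return [tuple(by_variable[name][i] for name in names) for i in range(steps)]

# Time-major view: a sequence of vectors recorded along the time axis.
mts_by_time = to_time_major(mts_by_variable)
```

Most sequence models consume the time-major view, whereas per-variable statistics such as the mean or variance are easier to compute from the variable-major view.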
3.3. Time-Series Anomaly Detection (TSAD)
Previously, we defined the various types of anomalies that can occur in time-series
data. We now consider methods designed to effectively detect such anomalies. Time-series
anomaly detection (TSAD) aims to distinguish between normal and abnormal patterns
by considering the distributional characteristics of the data, temporal dependencies,
and complex multivariate relationships. In previous studies, detection approaches
were generally categorized into statistical, machine learning-based, and deep learning-based
methods, each offering distinct advantages depending on the nature of the data and
analysis objectives [14, 18].
3.3.1 Statistical-based models
Statistics-based anomaly detection techniques use the distributional characteristics
of data to distinguish between normal and abnormal observations. These models can
be broadly divided into parametric and nonparametric categories. The former assumes
that the data follow a specific distribution, whereas the latter makes no prior distributional
assumptions [19].
1. Parametric models: A parametric model assumes that data are generated from a known probability distribution
and operates by estimating the parameters of that distribution from training data.
This approach has the advantage that the model complexity remains constant regardless
of data size, making it suitable for large datasets, and verification can be performed
efficiently. However, the most important factor in this type of model is the assumption
of the correct distribution, which requires prior knowledge of the data distribution.
Detection performance may degrade significantly if the actual data deviate from this
assumption. Typically, a Gaussian distribution is assumed for continuous data, with
mean and covariance estimated through maximum likelihood estimation (MLE). For categorical
data, a multinomial distribution is often used, and for sequential data, a Markov
model is generally considered suitable. In practice, however, many datasets exhibit
complex structures that cannot be explained by a single distribution. In such cases,
mixture models are applied. A representative example is the Gaussian mixture model
(GMM), which combines several Gaussian distributions to more precisely estimate the
underlying data distribution.
2. Non-parametric models: A non-parametric model estimates the distribution directly from observed data without
assuming that the data follow a specific distribution. This approach has the advantages
of greater flexibility, higher autonomy, and effectiveness in capturing complex data
structures because it is not constrained by prior distributional assumptions. However,
as the size and dimensionality of the data increase, computational costs increase
rapidly, and prior knowledge of distance-based similarity measures may be required.
Representative techniques include histogram analysis, kernel density estimation (KDE),
and the Parzen Window method. Although nonparametric models are highly flexible, they
face challenges in computational efficiency when processing high-dimensional or large-scale
datasets.
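The contrast between the two families can be made concrete with a small sketch. It assumes one-dimensional data, fits a single Gaussian by maximum likelihood for the parametric case, and uses a Gaussian-kernel density estimate (Parzen window) for the non-parametric case. The training values, the bandwidth, and all function names are hypothetical choices for illustration.

```python
import math

def gaussian_mle(train):
    """Maximum likelihood estimates of mean and standard deviation (1-D Gaussian)."""
    n = len(train)
    mu = sum(train) / n
    var = sum((x - mu) ** 2 for x in train) / n  # MLE divides by n, not n - 1
    return mu, math.sqrt(var)

def gaussian_score(x, mu, sigma):
    """Parametric anomaly score: absolute z-score under the fitted Gaussian."""
    return abs(x - mu) / sigma

def kde_density(x, train, bandwidth):
    """Non-parametric density estimate with a Gaussian kernel (Parzen window)."""
    norm = 1.0 / (len(train) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in train)

# Hypothetical training values forming two clusters of normal behavior.
train = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0, 5.1, 4.9, 5.05, 4.95]
mu, sigma = gaussian_mle(train)

# A point between the clusters sits at the mean of the single Gaussian,
# yet lies in a low-density region under the kernel estimate.
between = 3.0
z_between = gaussian_score(between, mu, sigma)
d_between = kde_density(between, train, bandwidth=0.2)
d_cluster = kde_density(1.0, train, bandwidth=0.2)
```

Here the single-Gaussian score treats the point between the clusters as perfectly typical, while the kernel estimate assigns it almost no density. This is the kind of mismatch that motivates mixture models or non-parametric estimators, at the cost of evaluating every training point for each query.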
3.3.2 Machine learning models
1. Unsupervised learning: Unsupervised learning is applied when only input data are available and no output
or labels are provided. The model identifies the inherent structure of the data and
extracts hidden patterns. This method has the advantage of automatically detecting
new types of abnormal behavior without requiring labeled anomalies. Therefore, it
is often used to build a collection of abnormal behaviors through system state monitoring
or past time-series analysis and can subsequently be extended to the training data
of supervised learning models.
2. Supervised learning: Supervised learning is used when both input data and corresponding output labels
are available. The model is trained on datasets in which normal and abnormal cases are clearly labeled,
enabling it to learn the boundary between them and classify new input data.
This approach generally achieves high accuracy and is particularly effective in detecting
precursors of anomalies that appear before specific events. However, its application
is limited because it requires a sufficient amount of labeled abnormal data.
3. Semi-supervised learning: Semi-supervised learning is applied when only normal data are labeled and provided.
After the model learns normal patterns, it identifies data that deviate from these
patterns as abnormal. This is the most widely used approach to time-series anomaly detection
in the literature and is suitable for realistic scenarios where normal data are abundant
but abnormal data are scarce. Although often grouped with unsupervised learning, it
is distinguished by its reliance on prior knowledge of normal data.
4. Reinforcement learning: Reinforcement learning is a method in which a system interacts with a dynamic environment
under a reward-and-punishment mechanism and learns the optimal behavioral strategy.
In this process, goal achievement is evaluated using a value function, and the model
gradually learns the optimal policy based on this feedback. Unlike traditional supervised
or unsupervised learning, reinforcement learning improves performance through continuous
interaction with the environment. For time-series data, it is mainly applied to dynamic decision-making
problems such as autonomous control, resource management, and learning response strategies
in the presence of anomalies [20].
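The semi-supervised setting, in which only normal data are available for training, can be sketched as follows. The example learns a centroid and a distance threshold from normal samples and flags anything outside that radius. The feature layout, the sample values, and the `slack` factor are assumptions made for illustration, not a method from the surveyed papers.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fit_normal_profile(normal_samples, slack=1.5):
    """Learn a centroid and a distance threshold from normal-only samples."""
    n = len(normal_samples)
    dim = len(normal_samples[0])
    centroid = [sum(s[j] for s in normal_samples) / n for j in range(dim)]
    radius = max(distance(s, centroid) for s in normal_samples)
    # `slack` widens the boundary so unseen normal samples are not flagged too eagerly.
    return centroid, slack * radius

def is_anomalous(sample, centroid, threshold):
    """Flag samples that fall outside the learned normal region."""
    return distance(sample, centroid) > threshold

# Hypothetical flow features: (packets per second, mean packet size in hundreds of bytes).
normal_samples = [(10.0, 5.0), (12.0, 5.2), (11.0, 4.8), (9.0, 5.1)]
centroid, threshold = fit_normal_profile(normal_samples)

# No attack examples were used during fitting; the second flow is flagged by deviation alone.
flags = [is_anomalous(s, centroid, threshold) for s in [(10.5, 5.0), (90.0, 0.6)]]
```

A supervised model would instead need labeled attack samples to learn a boundary, which is the requirement that limits its use when abnormal data are scarce.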
3.3.3 Deep learning models
Deep-learning-based time-series anomaly detection techniques aim to learn normal patterns
and then identify data that deviate from those patterns as anomalies. This approach
is generally divided into prediction-based (forecasting) and reconstruction-based
techniques. More recently, encoding- and distance-based methods have also been proposed,
expanding the scope of application. Prediction-based techniques forecast future values
from past data and use the difference between predicted and actual values as an index
to determine abnormality. In contrast, reconstruction-based techniques compress normal
data into a latent space and attempt to restore it, detecting anomalies when the reconstruction
error is large. Encoding-based techniques convert data into low-dimensional representations
and calculate anomaly scores directly from the latent representation without a restoration
process. Distance-based techniques compute the similarity or distance between data
points and identify those far from the normal range as anomalies. These methods differ
in several aspects, including learning paradigm (supervised, unsupervised, semi-supervised,
or self-supervised), anomaly score calculation (prediction error, reconstruction error,
latent representation, or distance), and data representation (sliding window, space-time
graph, or embedding). Each approach has distinct advantages and disadvantages. Therefore,
deep learning-based time-series anomaly detection is not limited to a single approach;
instead, models are selected and applied according to the data characteristics and
application environment [21].
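A prediction-based score over sliding windows can be sketched as follows. The forecaster here is a plain window mean, used only as a stand-in for a trained deep model such as an LSTM or CNN. The series, the window width, and the function names are hypothetical.

```python
from statistics import fmean

def sliding_windows(series, width):
    """Yield (history, next_value) pairs over the series."""
    for end in range(width, len(series)):
        yield series[end - width:end], series[end]

def prediction_scores(series, width, predict):
    """Prediction-based anomaly score: absolute error between forecast and observation."""
    return [abs(predict(history) - actual)
            for history, actual in sliding_windows(series, width)]

def mean_forecaster(history):
    """Placeholder for a trained forecaster; predicts the mean of the window."""
    return fmean(history)

# Hypothetical traffic volume with a burst at index 8.
series = [10, 11, 10, 9, 10, 11, 10, 10, 60, 10, 11, 10]
width = 4
scores = prediction_scores(series, width, mean_forecaster)

# scores[k] refers to series[k + width], so shift the index back.
most_anomalous_index = scores.index(max(scores)) + width
```

A reconstruction-based detector follows the same pattern, except that the score compares each window with its own reconstruction rather than comparing a forecast with the next value. Note that the windows immediately after the burst also receive elevated scores because the burst contaminates their history.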
4. Datasets for Time-Series NIDS
Representative public datasets are widely used for the performance evaluation of network
intrusion detection systems (NIDS).
4.1. UNSW-NB15
Fig. 1. Correlation matrix between features and the binary class label in the UNSW-NB15
dataset.
The UNSW-NB15 dataset is a benchmark dataset for intrusion detection research, created
in 2015 at the University of New South Wales (UNSW) Cyber Range Lab in Australia to
reflect real-world network conditions. Using the IXIA PerfectStorm tool, raw network
traffic consisting of both normal activity and a variety of contemporary attack types
was generated and subsequently processed using Argus and Bro-IDS to extract 49 network
flow, content-based, and time-based features. Correlation analysis revealed that attack
traffic exhibits distinctive temporal and session-level behavioral patterns. In particular,
features such as $sttl$, $state$, $ct\_dst\_sport\_ltm$, $ct\_src\_dport\_ltm$, $rate$,
and $ct\_state\_ttl$ demonstrated strong positive correlations with attack classes.
This aligns with the observation that malicious traffic typically involves frequent
repetitive connection attempts within short time intervals, abnormal session termination
states characteristic of scanning or DoS activities, and persistent unusual Time-to-Live
(TTL) patterns. Accordingly, these features serve as meaningful indicators for distinguishing
attack behavior. Conversely, features such as $proto\_freq$, $id$, $swin$, $dwin$,
$dload$, $stcpb$, $dtcpb$, $tcprtt$, and $synack$ exhibited negative correlations
with the attack label and are therefore associated with normal traffic. Normal network communications generally display stable and consistent
values for parameters such as TCP window size, data transfer volume, and round-trip
time (RTT), whereas these values tend to fluctuate irregularly during attack events.
Thus, normal traffic is characterized by session stability, while attack traffic manifests
abnormal patterns in connection initiation and session maintenance [22].
4.2. NSL-KDD
Fig. 2. Correlation matrix between features and the binary class label in the NSL-KDD
dataset.
The NSL-KDD dataset is a widely used benchmark in intrusion detection system (IDS)
research. It was reconstructed from the KDD’99 dataset to address the issue of redundant
samples and to ensure a more balanced distribution of data between training and testing
sets. The dataset consists of both normal traffic and various types of network-based
attacks, with each instance represented by 41 network connection features and a corresponding
class label indicating whether the connection is normal or an attack. In this study,
for clarity of analysis, the class labels were simplified into a binary form: normal
and attack. The features of the NSL-KDD dataset can be categorized into four major
groups. First, the Basic Features, such as $protocol\_type$, $service$, $flag$, $src\_bytes$,
and $dst\_bytes$, represent fundamental communication characteristics, including the
protocol used and the number of bytes transmitted in each direction. Second, the Content
Features, including $num\_failed\_logins$ and $logged\_in$, reflect packet content and
authentication-related activities. Third, the Traffic Statistical Features, such as
$count$, $srv\_count$, $dst\_host\_count$, and $dst\_host\_srv\_count$, capture traffic
patterns by measuring the frequency of connections to the same host or service within
a certain time window. Finally, the Error/Reset Rate Features, including $serror\_rate$,
$srv\_serror\_rate$, and $dst\_host\_serror\_rate$, indicate abnormal handshake failures
or connection resets in TCP sessions, which are highly indicative of potential attacks.
Pearson correlation analysis between the class label and each feature revealed that
attack traffic is characterized by abnormal connection terminations and frequent session
initiation attempts. Specifically, $serror\_rate$, $srv\_serror\_rate$, and $dst\_host\_serror\_rate$
showed a strong positive correlation with attacks, as these features typically increase
in attack types such as DoS and port scans, where the TCP three-way handshake fails
to complete normally and the proportion of abnormal SYN packets rises sharply. Similarly,
$count$ and $srv\_count$ exhibited a strong positive correlation with attack traffic,
consistent with the tendency of attacks to generate numerous connection attempts within
a short time period. In contrast, $logged\_in$, $srv\_diff\_host\_rate$, and $dst\_host\_same\_srv\_rate$
displayed negative correlations with attack traffic, as normal users tend to perform
successful authentications and maintain persistent connections to specific services,
whereas attack traffic often demonstrates random host and service probing behavior [23].
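The correlation analysis between each feature and the binary class label can be sketched with a direct implementation of the Pearson coefficient. The records below are invented toy values that only borrow two NSL-KDD feature names; they are not drawn from the dataset, and the helper `pearson` is written for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: correlation is undefined, reported as 0 here
    return cov / (sx * sy)

# Invented toy records (not real NSL-KDD rows); label 0 = normal, 1 = attack.
label       = [0, 0, 0, 0, 1, 1, 1, 1]
serror_rate = [0.0, 0.1, 0.0, 0.05, 0.9, 1.0, 0.95, 0.8]
logged_in   = [1, 1, 1, 0, 0, 0, 0, 0]

features = {"serror_rate": serror_rate, "logged_in": logged_in}
correlations = {name: pearson(column, label) for name, column in features.items()}
```

With a 0/1 label, a positive coefficient means the feature tends to be larger for attack records and a negative coefficient means it tends to be larger for normal records, which is how the signs in this section should be read.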
4.3. CICIDS2017
The CICIDS2017 dataset is an intrusion detection benchmark built by the
Canadian Institute for Cybersecurity (CIC) in 2017 by simulating a realistic internal
network environment. It includes normal user traffic as well as a range of
then-current attack scenarios such as Brute Force, DoS, DDoS, Web Attack, Botnet,
and Infiltration, and each network flow is described by more than 80 features.
Correlation analysis showed that features such as $Bwd Packet Length
Std$, $PSH Flag Count$, $Packet Length Variance$, $Bwd Packet Length Max$, and $Avg
Bwd Segment Size$ exhibited strong positive correlations with attack traffic. This
is attributable to the abnormal transmission behavior commonly observed during attacks,
where packet sizes and segment lengths vary significantly rather than remaining stable.
For example, in DDoS or port scanning attacks, large volumes of irregular packets
are repeatedly transmitted within short time intervals, leading to fluctuating and
unstable response traffic on the server side. Such instability is reflected in packet
length-based metrics through increased variance and elevated mean values. Conversely,
features such as $Min Packet Length$, $Bwd Packet Length Min$, $URG Flag Count$, and
$Fwd Packet Length Std$ showed negative correlations with the attack label and are therefore associated with normal traffic. In typical
network communication, packet lengths tend to remain within stable and predictable
ranges, and session flows are maintained consistently. As a result, minimum packet
size and packet length variance do not fluctuate substantially under normal conditions.
Additionally, control flags such as URG rarely appear in regular traffic, and TCP
session flows usually exhibit orderly progression. Therefore, these features capture
the inherent stability and consistency characteristic of benign network traffic and
function as key indicators for distinguishing it from attack behavior [24].
Fig. 3. Correlation matrix between features and the binary class label in the CICIDS2017
dataset.
5. Model
According to recent surveys, CNNs and LSTMs are the most widely used models for anomaly
detection on benchmark datasets such as NSL-KDD, UNSW-NB15, and CICIDS2017. These
datasets consist of large-scale network traffic, where each flow contains dozens of
continuous and time-series attributes, including packet length, session duration,
transmission speed, and flag state. These data reflect normal traffic patterns and
repetitive behaviors, whereas rapid changes or abnormal distributions may occur at
specific times or intervals. In addition, the datasets are imbalanced, with normal
traffic accounting for the majority of records while attack data exist in various
forms, such as DoS, brute force, and web attacks. Therefore, network traffic exhibits
multidimensional, time-series, and imbalanced characteristics simultaneously, which
are key factors that must be considered when designing models for anomaly detection.
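The effect of class imbalance on reported metrics can be illustrated with a short sketch. It computes accuracy, precision, recall (often reported as detection rate, DR), F1-score, and false alarm rate (FAR) from confusion-matrix counts, using the commonly used definitions with attack as the positive class. The counts are hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Common NIDS metrics from confusion-matrix counts (attack = positive class)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # also called detection rate (DR)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "far": fp / (fp + tn) if fp + tn else 0.0,  # false alarm rate
    }

# Hypothetical imbalanced test set: 990 normal flows and 10 attacks.
# A model that labels every flow as normal:
always_normal = detection_metrics(tp=0, fp=0, tn=990, fn=10)
# A model that catches 8 attacks at the cost of 20 false alarms:
detector = detection_metrics(tp=8, fp=20, tn=970, fn=2)
```

In this toy case the model that never raises an alert reaches higher accuracy than the working detector while detecting no attacks at all. Accuracy on imbalanced traffic is therefore best read together with recall and the false alarm rate.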
Table 3 categorizes various NIDS models by approach, architecture, key techniques, dataset,
learning type, preprocessing or feature selection, and reported metrics. The symbol
“–” indicates that the corresponding information was not specified in the original
paper.
Table 3. Categorization of NIDS models by learning approach and architecture.
| Approach | Main architecture | Model (key techniques) | Dataset | Learning | Preprocessing / feature selection | Metrics (reported) |
|---|---|---|---|---|---|---|
| Classical ML | DT | DT(J48) + BestFirst FS + ANN/KNN/SVM/RF/NB [30] | UNSW-NB15 | Supervised | One-hot encoding, Min-Max normalization | ACC 86.41%, ADR 97.95%, FAR 27.73% |
| | RF | GA-based feature selection (C4.5, RF, NBTree) [31] | UNSW-NB15 | Supervised | Feature selection with GA, transformation | ACC 81.42%, FAR 6.39% |
| | SVM | XGBoost, SVM [32] | CICIDS2017 | Supervised | Dataset aggregation, configuration | ACC 99.11%, F1-score 99.19% |
| Deep learning | CNN | Residual CNN + RNN [33] | UNSW-NB15, NSL-KDD | Supervised | One-hot encoding, standardization | DR 97.75%, ACC 86.64%, FAR 1.30% |
| | CNN | CNN [34] | UNSW-NB15 | Supervised | Value cleaning, MinMaxScaler, RF-based FS | ACC 99%, Precision 89%, Recall 99%, F1-score 94% |
| | CNN+LSTM | CNN + LSTM + Attention [35] | UNSW-NB15 | Supervised | ANOVA F-test, Min-Max, one-hot encoding | F1-score 83%, AUC 88%, Precision 86%, Recall 82% |
| | LSTM/FNN | LSTM, FNN [36] | CICIDS2017, CTU-13 | Supervised | NetFlow feature extraction, missing value removal | F1-score 99.703% |
| | MLP/1D-CNN | MLP, 1D CNN, LOF, OCSVM [37] | CICIDS2017 | Supervised | Data cleaning, standardization | ACC 97.75%, Precision 98.94%, Recall 90.36%, F1-score 94.46% |
| | BiLSTM | Lightweight CNN + BiLSTM + $\chi^2$ FS [38] | UNSW-NB15 | Supervised | Categorical encoding, normalization | ACC 97.90%, Precision 97.91%, Recall 97.90%, F1-score 97.90% |
| | ARN (RNN) | ARN-based IDS [39] | SWaT, UNSW-NB15 | Supervised | Word embedding, normalization | ACC 95.48%, Precision 94.96%, Recall 95.45%, F1-score 95.2% |
| | DBN | Deep Belief Network (DBN) [40] | CICIDS2017 | Supervised | Class balancing, normalization, RBM pretraining | F1-score 87.3% $\rightarrow$ 94% |
| | GRU | Gated Recurrent Unit (GRU) [41] | CICIDS2017 | Supervised | Categorical encoding, standardization | ACC 99.69%, Precision 99.65%, Recall 99.69%, F1-score 99.70% |
| | CapsNet | Capsule Network (CapsNet) [42] | UNSW-NB15 | Supervised | One-hot encoding, normalization | ACC 99%, Precision 98%, Recall 99%, F1-score 98% |
| | DGM | Data Generative Model [43] | CICIDS2017 | Semi-supervised | Oversampling rare attack types | F1-score 99.92% |
| | GAN | Generative Adversarial Network (GAN) for IDS [44] | CICIDS2017 | Semi-supervised | – | – |
| | LSTM | LSTM with categorical embedding [45] | UNSW-NB15 | Supervised | Embedding encoding, normalization | ACC 99.7% |
| | CNN+BiLSTM | Lightweight CNN + BiLSTM [46] | UNSW-NB15 | Supervised | Categorical encoding, normalization | ACC 97% |
| | DNN | Self-supervised contrastive DNN [47] | UNSW-NB15 | Self-supervised | Contrastive pretraining, normalization | ACC 94% |
| Hybrid / ensemble | Multi-stage | Multi-stage ML pipeline (Oversampling + IG/Correlation FS + HPO) [48] | UNSW-NB15 | Supervised | Oversampling, IG, correlation-based FS, HPO | – |
| | MLP+FS | MLP + IG + RF importance $\rightarrow$ RFE Hybrid FS [49] | UNSW-NB15 | Supervised | Duplicate removal, minority resampling | ACC 82.25% $\rightarrow$ 84.24% |
| | Ensemble | Logistic Regression + RF + LSTM + MLP [50] | NSL-KDD, UNSW-NB15 | Supervised | Scaling, SMOTE, feature engineering | ACC 97.7%, Recall 96.9%, Precision 99.3% |
| | LSTM+Opt | LSTM + SGDM + Pruning [51] | UNSW-NB15 | Supervised | Label encoding, Min-Max normalization | ACC 99.0630%, FAR 0.3913%, DR 88.3317%, F1-score 90.1209% |
| | RF+XGB | Random Forest + XGBoost Ensemble [52] | CICIDS2017 | Supervised | PCA-based feature selection | ACC 98.05% |
| | SAE+SVM | Stacked Autoencoder + SVM [53] | UNSW-NB15 | Supervised | Feature normalization | ACC 87%, Precision 82%, Recall 79%, F1-score 81% |
| | CART+RF | CART + Random Forest Hybrid IDS [54] | UNSW-NB15, CICIDS2017 | Supervised | Feature importance, normalization | ACC 96% |
| | GAN+CNN+BiLSTM | GAN-based synthetic sampling + CNN+BiLSTM [55] | CICIDS2017 | Supervised | Oversampling with GAN | Precision 100%, Recall 77%, F1-score 87% |
| | Transformer+CART | Transformer + wrapper-based FS [56] | UNSW-NB15 | Supervised | CART wrapper | ACC 93%, Precision 91%, Recall 92%, F1-score 92% |
| | XGB | XGBoost + Optimized Sequential Neural Network [57] | NSL-KDD, UNSW-NB15, CICIDS2017 | Supervised | Grid Search HPO, filtering, normalization | ACC 99.93%, F1 99.84%, MCC 99.86%, FPR 0.0004% |
| Others | Autoencoder | Autoencoder (AE), baseline: One-Class SVM [58] | CICIDS2017 | Semi / Unsupervised | Train on normal data only | Zero-day ACC 75%–98% |
| | KNN-based | IPCA + Self-Adjusting Memory KNN (SAM-KNN) [59] | UNSW-NB15 | Supervised | SMOTE, normalization, oversampling | ACC 98.91%, Precision 98.97%, Recall 99.50%, F1-score 99.23% |
| | DNN+Tree | ReLU-based DNN + Extra Tree [60] | UNSW-NB15 | Supervised | Dimensionality reduction | ACC 97.93%, Recall 97%, Precision 97%, F1-score 97% |
| | CE-GAN | CE-GAN based data augmentation for IDS [61] | NSL-KDD, UNSW-NB15 | Supervised | Conditional GAN augmentation | PRD 66.1375%, RMSE 0.2243%, MAE 0.1361% |
| | RNN Comparative | Comparative study of RNN models [62] | CICIDS2017, NSL-KDD, UNSW-NB15 | Supervised | Standardization | Accuracy across variants |
| | DNN | Class-wise focal-loss VAE with DNN for IoT IDS [63] | NSL-KDD | Supervised | Class-wise Focal Loss, data augmentation | ACC 88.08%, FPR 3.77%, U2R 79.25%, R2L 67.5% |
| | DNN | Improved Conditional VAE + DNN for DDoS detection [64] | NSL-KDD, UNSW-NB15 | Supervised | Conditional VAE, DNN classifier | ACC 99.47%, Recall 0.994%, Precision 0.995% |
| | Fed-Unsupervised-KMeans | Federated unsupervised clustering IDS using k-means [65] | CICIDS2017, UNSW-NB15 | Unsupervised / Federated | Data normalization, unsupervised variable selection, silhouette-based clustering | Clustering silhouette analysis across datasets |
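To make the metrics reported in Table 3 concrete, the following minimal sketch computes them from a binary confusion matrix. It assumes the usual definitions, with attack as the positive class, detection rate (DR/ADR) equal to recall, and false alarm rate (FAR) equal to the false positive rate; individual papers in the table may define these slightly differently. The function name and the counts are hypothetical and are not taken from any cited study.

```python
def ids_metrics(tp, fp, tn, fn):
    """Common IDS metrics from binary confusion-matrix counts (attack = positive)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0   # detection rate (DR / ADR)
    far = fp / (fp + tn) if fp + tn else 0.0      # false alarm rate (FPR)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "acc": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "far": far,
        "f1": f1,
    }


# Hypothetical test set: 9,900 normal flows and 100 attack flows.
# The detector finds only 20 of the 100 attacks and raises 10 false alarms.
m = ids_metrics(tp=20, fp=10, tn=9890, fn=80)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts the accuracy is 99.1% although only 20% of the attacks are detected, which is why accuracy alone is a weak basis for comparing models on imbalanced datasets.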
5.1. CNN-based Models
CNNs are used in this study mainly because of the multivariate structure and temporal
continuity of network traffic, as well as the severe class imbalance commonly found
in intrusion detection datasets. In datasets such as UNSW-NB15 and NF-UQ-NIDS-v2,
normal traffic accounts for the vast majority of samples, while attack instances are
relatively scarce. Moreover, each flow contains many interdependent continuous and
categorical features, including packet length, transmission rate, session duration,
and TCP flag combinations. Because these features interact with each other, it is
difficult for traditional statistical analysis or single-feature-based detection methods
to accurately identify attack behavior. CNN provides two main advantages in addressing
these challenges. First, 1D convolution allows CNN to capture local temporal patterns
that appear within short time intervals. Many attacks do not change the overall traffic
distribution but instead occur as short bursts, repeated attempts, or sudden changes
in specific feature combinations. For example, port scanning produces repeated connection
attempts within very short intervals; DDoS attacks generate large bursts of packets
with specific flag patterns; and web attacks may cause brief changes in payload composition.
CNN filters can learn these localized patterns and detect subtle anomalies that
are difficult to identify with global statistical metrics. Second, CNN can learn the
relationships between multiple features as a combined spatial pattern. In many cases,
attack traffic is characterized not only by the values of individual attributes but
also by how several attributes shift together. For instance, the simultaneous occurrence
of a particular TCP flag pattern and a sudden collapse in the packet-to-byte ratio
is difficult to detect using a single-feature threshold, whereas a CNN can model it
as one integrated pattern. Through this process, irrelevant features are suppressed
and important discriminative features are emphasized, allowing the model to maintain
high generalization performance even when the dataset is highly imbalanced [66]. However, CNN has limitations. Because convolution mainly focuses on local patterns,
it is poorly suited to modeling long-term dependencies or global changes in traffic
behavior. Therefore, in attack scenarios where abnormal signals accumulate slowly
over a long period, such as low-rate or stealthy scanning, CNN alone may not be sufficient
[25, 33, 34].
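The local pattern matching described in this subsection can be illustrated with a minimal sketch of a single 1D convolution filter sliding over a traffic series. The packets-per-second values are hypothetical, and the kernel is hand-picked to respond to a sharp rise; in an actual CNN the kernel weights would be learned from data, and many filters would run over many feature channels.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the per-filter operation of a CNN layer."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical packets-per-second series with a short burst at indices 5 to 7.
pps = [10, 12, 11, 9, 10, 90, 95, 88, 11, 10, 12]

# Hand-picked "burst onset" kernel (an assumption, not a learned filter):
# negative weights on the two earlier steps, positive on the two later ones.
edge_kernel = [-1, -1, 1, 1]

response = conv1d(pps, edge_kernel)
peak = max(range(len(response)), key=response.__getitem__)
print(response)
print("strongest response at window start index", peak)
```

The filter responds most strongly to the window that straddles the onset of the burst and only weakly elsewhere. Because each output depends on just four consecutive steps, the same sketch also shows why a slow drift spread over hundreds of steps would produce no strong response.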
5.2. LSTM-based Models
Long short-term memory (LSTM) is a recurrent neural network architecture proposed to
mitigate the vanishing gradient problem of the conventional recurrent neural network (RNN).
It preserves long-term dependencies in a time series through a memory cell whose
state is regulated by an input gate, a forget gate, and an output gate. This
structure allows LSTM to retain important temporal information over long spans while
selectively discarding unnecessary information, which makes pattern learning more
stable than that of a general RNN.
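For reference, the gate mechanism described above is commonly written as follows. The notation is an assumption introduced here for illustration: $x_t$ is the input at step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, $\odot$ element-wise multiplication, and $W$, $U$, $b$ learned parameters.

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
$$

The additive update of $c_t$ is what lets information persist: when $f_t$ stays close to 1 and $i_t$ close to 0, the cell state is carried forward almost unchanged across many steps.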
Although signs of an attack can appear abruptly at a single point in time, in real
environments they more commonly change gradually over time or appear as repetitions
of a specific pattern. For example, low-rate port scanning
exhibits repeated connection attempts at regular intervals, and C2 (command and control)
beacon signals show periodic communication patterns within a long session. In addition,
data leakage attacks can be carried out in such a way that the amount of transmitted
data gradually increases or the session becomes abnormally long. These anomalies may
appear normal when only individual packets are examined, but they can be
identified by considering temporal changes throughout the entire session [26, 36].
However, LSTM also has its limitations. First, as the length of the time sequence
increases, the amount of computation accumulates, resulting in scalability problems
such as increased computational cost and processing delay when handling large real-time
traffic streams. In addition, although long-term dependencies can theoretically be learned,
in real complex network environments long-term information is gradually diluted, and
complete preservation of long-term context is not guaranteed. For this reason, recent
studies have actively explored approaches that extend LSTM into GC-LSTM structures
combined with Graph Convolutional Networks, or replace it with transformer-based models that
can learn long-term dependencies more efficiently [67, 68].
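The periodic C2 beacon pattern mentioned in this subsection shows why session-level temporal context matters. The sketch below is a simple hand-crafted statistic, not an LSTM: it measures the regularity of inter-arrival times with the coefficient of variation. The timestamps are hypothetical, and a sequence model would be expected to learn this kind of regularity from data rather than have it specified.

```python
from statistics import mean, pstdev


def interarrival_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive events.

    Values near 0 indicate almost perfectly regular timing.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return pstdev(gaps) / mean(gaps)


# Hypothetical connection timestamps in seconds (assumptions for illustration).
beacon = [0, 60, 120, 181, 240, 300, 361]   # check-in roughly every 60 s
browsing = [0, 3, 4, 50, 52, 130, 131]      # bursty, human-driven traffic

print("beacon CV:", round(interarrival_cv(beacon), 3))
print("browsing CV:", round(interarrival_cv(browsing), 3))
```

Each individual beacon connection is unremarkable; only the sequence of gaps reveals the pattern, which is the kind of information a model operating on single flows cannot see.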
5.3. GAN-based Models
Network intrusion detection datasets share a structural problem. Most of
the traffic generated in a real network environment is normal communication, and
attack behavior accounts for only an extremely small percentage. This scarcity of
attack samples limits the opportunity for the model to fully learn attack patterns, greatly
degrading detection performance on new or modified attacks. It also produces
a class imbalance problem in which the ratio between normal traffic and attack
traffic is extremely asymmetric. In particular, because rare attack
classes such as Botnet and Infiltration have very few samples, supervised
detection models tend to be trained with a bias toward normal traffic, which eventually
lowers the recall of the attack classes [44]. This structural problem is even more evident in the CESNET-TimeSeries24 dataset,
which consists almost entirely of normal traffic collected over a
long period of time, with few samples labeled as attacks. In this case, the model
learns only a narrow distribution of normal patterns, so even if an actual
attack occurs, it is likely to go undetected and be treated
as a temporary fluctuation of the normal pattern. Moreover, the abnormal patterns
appearing in this dataset are not numerical outliers at a single point in time but
continuous temporal pattern changes, such as shifts in average delay time, packet
burst occurrences, collapsed traffic periodicity, and inter-arrival time distortion.
These changes cannot be reproduced with traditional oversampling techniques such as
simple feature replication or SMOTE [27].
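The limitation of interpolation-based oversampling can be seen in a minimal sketch of the core SMOTE step. This is a simplification: full SMOTE selects the second sample from the k nearest minority-class neighbors, whereas here the pair is fixed by hand. The feature names and values are hypothetical.

```python
import random


def smote_like(a, b, rng):
    """Create one synthetic sample on the line segment between samples a and b."""
    lam = rng.random()  # interpolation factor in [0, 1)
    return [x + lam * (y - x) for x, y in zip(a, b)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical per-flow feature vectors: [mean delay (ms), packets per burst].
attack_a = [12.0, 40.0]
attack_b = [20.0, 80.0]

synthetic = smote_like(attack_a, attack_b, rng)
print(synthetic)
```

Every synthetic sample is a convex combination of two existing feature vectors, so it can only fill in the space between observed points. The operation has no notion of ordering across time steps, which is why it cannot create a new periodicity collapse or a new inter-arrival time distortion.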
In this study, a Generative Adversarial Network (GAN) was used to address this problem.
A GAN can probabilistically generate new attack samples while maintaining the distributional
characteristics and intrinsic patterns of real data through adversarial training between
the generator and the discriminator. Unlike simple data replication, GAN-generated samples
are slightly varied forms that still preserve existing attack patterns, helping the
detection model learn more generalized representations of attack behavior. As a result,
stable learning is possible even in rare attack classes, and the recall and F1-score
of the detection model are significantly improved.
However, there are also limitations in applying GANs. First, there is a risk that
the generator will repeatedly create only a few patterns due to the mode collapse
phenomenon. In this case, the diversity of the generated data is low, so it does not sufficiently
reflect the various modified attacks that can occur in a real network environment.
Second, more advanced GAN variants do not always perform better; the outcome depends on
the complexity of the data distribution. In this study, CTGAN was advantageous for
learning complex distributions, but Vanilla GAN and WGAN worked more stably in segments
with simple distributions. Third, the increase in the amount
of generated data was not proportional to the performance improvement. Initially,
the performance improved significantly when the attack data was expanded to four
times its original size, but further expansion to 49 times
and 99 times yielded progressively smaller gains [28, 29].
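One simple way to check generated data for the mode collapse risk described above is to compare the spread of generated samples with that of real samples. The sketch below uses mean pairwise Euclidean distance as a diversity proxy. This metric, the function name, and the sample values are assumptions made for illustration and are not the evaluation procedure of the cited studies.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(samples):
    """Average Euclidean distance over all unordered pairs of samples."""
    pairs = list(combinations(samples, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical 2-feature attack samples (illustrative values only).
real = [[1.0, 5.0], [2.0, 9.0], [8.0, 1.0], [4.0, 4.0]]
# A collapsed generator reproduces near-copies of a single real pattern.
collapsed = [[2.0, 9.0], [2.1, 9.0], [2.0, 9.1], [2.1, 9.1]]

ratio = mean_pairwise_distance(collapsed) / mean_pairwise_distance(real)
print(f"diversity ratio (generated / real): {ratio:.3f}")
```

A ratio far below 1 suggests that the generator covers only a small part of the real attack distribution, in which case producing more samples adds little new information.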
Taken together, GANs are a valid approach to alleviating attack data
scarcity and class imbalance and to improving the generalization ability of detection
models. However, securing the diversity of generated data, selecting models
based on data distribution characteristics, and optimizing the volume of generated data remain
important challenges to be solved [55, 69].
6. Challenges and Future Directions
Traditional deep learning-based intrusion detection studies have primarily employed
CNN, LSTM, and GAN architectures. However, these models fundamentally struggle to
capture the high-dimensional correlation structures, multi-protocol interactions,
and dynamic temporal evolution inherent in network traffic data. CNN-based models
demonstrate strong performance in extracting local features through convolutional
filters, yet they are limited in modeling long-range behavioral patterns that accumulate
progressively across a session. LSTM models can preserve sequential information through
recurrent structures; however, as sequence length increases, they suffer from vanishing
gradients and significant computational overhead, which makes them impractical for
high-speed and large-scale real-time network environments. GAN-based approaches can
help mitigate data imbalance for rare and zero-day attacks, yet they often exhibit
unstable training behavior and frequent mode collapse, hindering their ability to
capture diverse attack patterns accurately.
In contrast, Transformer architectures leverage self-attention mechanisms to directly
model global dependencies within input sequences, effectively mitigating the local-pattern
bias of CNNs and the long-range dependency issues of LSTMs. This structural advantage
enables richer representation of complex feature relationships in network traffic,
such as protocol-field interactions, payload-level flow behaviors, and cross-session
correlations. However, Transformer-only models still experience performance degradation
under extreme class-imbalance conditions, particularly when detecting rare attack
events [73].
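The way self-attention models global dependencies can be shown with a minimal sketch of scaled dot-product attention. To keep it short, the sketch assumes identity projections (queries, keys, and values are all the raw input) and a single head, whereas a real Transformer layer uses learned projection matrices and multiple heads. The input sequence is hypothetical.

```python
from math import exp, sqrt


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X (a simplification)."""
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        scores = [sum(a * b for a, b in zip(q, k)) / sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * X[j][c] for j in range(len(X))) for c in range(d)])
    return outputs, weights


# Hypothetical sequence: 4 time steps, 2 features each.
# Steps 0 and 3 share one pattern; steps 1 and 2 share another.
X = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
out, weights = self_attention(X)
print([round(w, 3) for w in weights[0]])
```

Step 0 assigns the same weight to step 3 as to itself even though they lie at opposite ends of the sequence: attention weights depend on content similarity rather than on distance, unlike a convolution window or a recurrent state.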
To address this limitation, contrastive learning-based representation methods have
gained increasing attention. Contrastive learning enables clear separation between
normal and abnormal traffic instances in the representation space, even with limited
labeled data. By first learning the intrinsic clustering structure of normal flows,
deviations from this structure can be effectively identified as anomalies. As a result,
these methods significantly improve generalization to rare attacks, mutated threats,
and zero-day intrusions [75].
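The separation effect of contrastive learning can be sketched with the InfoNCE loss, one widely used contrastive objective; whether the cited work uses exactly this loss is not stated here, so it should be read as a representative example. The embeddings and the temperature value are hypothetical.

```python
from math import exp, log, sqrt


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: small when the positive is the closest sample."""
    pos = exp(cosine(anchor, positive) / temperature)
    neg = sum(exp(cosine(anchor, n) / temperature) for n in negatives)
    return -log(pos / (pos + neg))


# Hypothetical 2-D embeddings of traffic windows.
anchor = [1.0, 0.1]                  # a normal flow
near = [0.9, 0.2]                    # an augmented view of the same flow
far = [[-1.0, 0.3], [0.0, -1.0]]     # unrelated flows

good = info_nce(anchor, near, far)               # well-separated embedding space
bad = info_nce(anchor, far[0], [near, far[1]])   # positive placed far from the anchor
print(f"loss when positive is close: {good:.5f}")
print(f"loss when positive is far:   {bad:.3f}")
```

Minimizing this loss pulls views of the same flow together and pushes other flows apart, which yields the compact clusters of normal behavior from which deviations can then be flagged.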
Furthermore, recent Large Language Model (LLM)-based approaches provide strong semantic
understanding and behavioral reasoning capabilities derived from large-scale pretraining.
LLMs are highly effective in interpreting unstructured data such as logs, packet text,
and system events. Their zero-shot and few-shot learning capabilities offer robust
adaptability to new or previously unseen attack types, even under limited labeling
conditions. Additionally, LLMs support multimodal integration of packet payloads,
traffic metadata, and log information, providing superior versatility, scalability,
and explainability compared to conventional CNN, LSTM, and GAN-based models [76].
However, advancing these methodological directions requires modern datasets that accurately
reflect today’s heterogeneous, encrypted, distributed, and high-speed network environments.
Existing benchmark datasets such as NSL-KDD, UNSW-NB15, and CICIDS2017 are widely
used but remain limited due to synthetic traffic patterns and outdated attack types.
To overcome these limitations, future work will incorporate more recent real-world
datasets:
- CESNET-TimeSeries24 (2024): A large-scale ISP backbone traffic dataset collected over
  40 weeks, containing traffic from more than 275,000 active IP addresses, 6.6 billion
  flows, and 4 trillion packets. Each sample is aggregated into multiresolution time-series
  with twelve key statistical behavioral features. The dataset supports anomaly detection
  at IP, subnet, and institutional network scales and includes point, contextual, collective,
  and trend anomalies, offering a realistic benchmark for evaluating temporal anomaly
  detection models [71].
- NF-UQ-NIDS-v2 (2023–2024): A unified intrusion detection dataset that integrates diverse
  traffic environments into a single large-scale benchmark. It contains 11,994,893 flow
  records described by 43 statistical features, spanning 10 attack categories including
  DDoS, infiltration, botnet, brute-force, and web-based exploits. The dataset consists
  of 9,208,048 normal flows and 2,786,845 attack flows, providing a realistic distribution
  of benign and malicious traffic. Its diversity and volume enable robust evaluation
  of models under realistic multi-attack, multi-protocol conditions [72].
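As a worked example of the class distribution quoted above for NF-UQ-NIDS-v2, the following sketch derives the attack share and a set of inverse-frequency class weights from the stated flow counts. The weighting formula (total samples divided by the number of classes times the class count) is a common heuristic chosen here for illustration and is not prescribed by the dataset.

```python
# Flow counts as stated in the dataset description above.
normal_flows = 9_208_048
attack_flows = 2_786_845

total = normal_flows + attack_flows
attack_share = attack_flows / total
imbalance_ratio = normal_flows / attack_flows

# Inverse-frequency weights for a binary problem (an illustrative heuristic).
n_classes = 2
weights = {
    "normal": total / (n_classes * normal_flows),
    "attack": total / (n_classes * attack_flows),
}

print(f"total flows: {total:,}")
print(f"attack share: {attack_share:.2%}")
print(f"normal-to-attack ratio: {imbalance_ratio:.2f}")
print({name: round(w, 3) for name, w in weights.items()})
```

The two counts sum to the stated 11,994,893 records, and attacks make up roughly 23% of the flows at the binary level. The imbalance within the attack categories themselves is not given in the text and may be considerably stronger.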
7. Conclusion
This study reviewed research trends in network anomaly detection, focusing
on major benchmark datasets such as NSL-KDD, UNSW-NB15, and CICIDS2017. The
analysis showed that CNN-based models perform relatively stably even in high-dimensional
and imbalanced data environments owing to their strength in learning local features
and hierarchical representations, but they are limited in capturing
the long-term attack behavior patterns that unfold throughout a session. LSTM-based
models can effectively model time-series patterns and long-term dependencies, but
their computational cost rises rapidly with sequence length, which limits
their application in large-scale real-time network environments. In addition, GAN-based
approaches can alleviate the data imbalance problem for rare and zero-day attacks,
but training instability and mode collapse make it difficult to reliably
reflect the actual attack distribution. Furthermore, network traffic itself exhibits
severe class imbalance, multivariate and high-dimensional characteristics, and real-time
detection requirements, which limit the generalization performance of existing models
and their practical applicability. To compensate for these limitations, recent studies
have focused on Transformer-based time-series learning, which enables direct modeling
of global correlations and efficient parallel processing through self-attention. In
addition, by clearly separating the representation space between normal and abnormal
traffic, contrastive learning improves generalization to rare and zero-day attacks
and offers higher training stability than GAN-based augmentation methods. Moreover,
LLM-based multimodal approaches provide integrated understanding of unstructured information
such as logs, packet text, and metadata, demonstrate strong adaptability even in label-scarce
environments through zero-shot and few-shot inference, and offer superior model interpretability
compared to conventional deep learning models. Therefore, future network anomaly detection
research is expected to advance in the following directions:
- Learning structural representations and enhancing data augmentation strategies to
  address data imbalance
- Designing lightweight, real-time detection models based on Transformers and contrastive
  learning
- Establishing an evaluation system that considers model interpretability and real-world
  deployability
Acknowledgement
This paper was supported by the Korea Institute for Advancement of Technology (KIAT)
grant funded by the Korea Government (MOTIE) (No. RS-2021-KI002499, HRD Program for
Industrial Innovation).
References
S. M. Kasongo, Y. Sun, Performance analysis of intrusion detection systems
using a feature selection method on the UNSW-NB15 dataset, Journal of Big Data, Vol.
7, pp. 105, 2020

S. García, M. Grill, J. Stiborek, A. Zunino, An empirical comparison
of botnet detection methods, Computers & Security, Vol. 45, pp. 100-123, 2014

J. Yu, X. Gao, B. Li, F. Zhai, J. Lu, B. Xue, S. Fu, C. Xiao,
A filter-augmented auto-encoder with learnable normalization for robust multivariate
time series anomaly detection, Neural Networks, Vol. 170, pp. 478-493, 2024

L. Yu, Q. Lu, Y. Xue, DTAAD: dual TCN-attention networks for anomaly detection
in multivariate time series data, Knowledge-Based Systems, Vol. 295, No. 111849, 2024

S. Liu, B. Zhou, Q. Ding, B. Hooi, Z. Zhang, H. Shen, X. Cheng,
Time series anomaly detection with adversarial reconstruction networks, IEEE Transactions
on Knowledge and Data Engineering, Vol. 35, No. 4, pp. 4293-4306, 2022

M. Munir, S. A. Siddiqui, A. Dengel, S. Ahmed, DeepAnT: A deep learning
approach for unsupervised anomaly detection in time series, IEEE Access, Vol. 7, pp.
1991-2005, 2019

D. L. Marino, C. S. Wickramasinghe, C. Rieger, M. Manic, Self-supervised
and interpretable anomaly detection using network transformers, IEEE Transactions
on Industrial Informatics, Vol. 21, No. 5, pp. 4252-4261, 2025

L. Xu, K. Xu, Y. Qin, Y. Li, X. Huang, Z. Lin, X. Ji, TGAN-AD:
transformer-based GAN for anomaly detection of time series data, Applied Sciences,
Vol. 12, No. 16, 2022

G. G. González, P. Casas, E. Martínez, A. Fernández, Towards foundation
auto-encoders for time-series anomaly detection, arXiv preprint, 2025

B. Golchin, B. Rekabdar, Anomaly detection in time series data using reinforcement
learning, variational autoencoder, and active learning, Proc. of 2024 Conference on
AI, Science, Engineering, and Technology (AIxSET), 2025

S. D. D. Anton, S. Sinha, H. D. Schotten, Anomaly-based intrusion detection
in industrial data with SVM and random forests, Proc. of the International Conference
on Software, Telecommunications and Computer Networks, 2019

S. D. Anton, L. Ahrens, D. Fraunholz, H. D. Schotten, Time is of the
essence: machine learning-based intrusion detection in industrial time series data,
Extended version of a publication in the 2018 IEEE International Conference on Data
Mining Workshops (ICDMW), pp. 1-6, 2018

K. Tscharke, M. Wendlinger, A. Ahouzi, P. Bhardwaj, K. Amoi-Taleghani,
M. Schrödl-Baumann, P. Debus, Quantum autoencoder for multivariate time series
anomaly detection, Proc. of 2025 IEEE International Conference on Quantum Computing
and Engineering (QCE), 2025

Z. Z. Darban, G. I. Webb, S. Pan, C. Aggarwal, M. Salehi, Deep learning
for time series anomaly detection: A survey, ACM Computing Surveys, Vol. 57, No. 1,
pp. 1-42, 2025

Y. Qin, D. Song, H. Chen, W. Cheng, G. Jiang, G. Cottrell, A dual-stage
attention-based recurrent neural network for time series prediction, arXiv preprint
arXiv:1704.02971, 2017

S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation,
Vol. 9, No. 8, pp. 1735-1780, 1997

X. Xu, H. Wang, Y. Liang, P. S. Yu, Y. Zhao, K. Shu, Can multimodal
LLMs perform time series anomaly detection?, Proc. of the ACM Web Conference, pp.
5392-5403, 2026

V. Chandola, A. Banerjee, V. Kumar, Anomaly detection: A survey, ACM Computing
Surveys, Vol. 41, No. 3, pp. 1-58, 2009

K. Haukat, T. M. Alam, S. Luo, S. Shabbir, I. Hameed, J. Li, S.
Abbas, U. Javed, , Advances in Information and Communication, Vol. 1363, 2021

G. Ciaburro, G. Iannace, Machine learning-based algorithms to knowledge extraction
from time series data: A review, Data, Vol. 6, No. 55, 2021

F. Wang, Y. Jiang, R. Zhang, A. Wei, J. Xie, X. Pang, A survey
of deep anomaly detection in multivariate time series: taxonomy, applications, and
directions, Sensors, Vol. 25, No. 190, 2025

N. Moustafa, J. Slay, UNSW-NB15: a comprehensive data set for network intrusion
detection systems (UNSW-NB15 network data set), Proceedings of the 2015 Military Communications
and Information Systems Conference (MilCIS), pp. 1-6, 2015

M. Tavallaee, E. Bagheri, W. Lu, A. A. Ghorbani, A detailed analysis
of the KDD CUP 99 data set, Proc. of the 2009 IEEE Symposium on Computational Intelligence
for Security and Defense Applications, pp. 1-6, 2009

I. Sharafaldin, A. H. Lashkari, A. A. Ghorbani, Toward generating a new
intrusion detection dataset and intrusion traffic characterization, Proc. of the International
Conference on Information Systems Security and Privacy (ICISSP), Vol. 1, pp. 108-116,
2018

S. Bai, J. Z. Kolter, V. Koltun, An empirical evaluation of generic convolutional
and recurrent networks for sequence modeling, arXiv preprint arXiv:1803.01271, 2018

B. Radford, L. Apolonio, A. Trias, J. Simpson, Network traffic anomaly
detection using recurrent neural networks, arXiv preprint arXiv:1803.10769, 2018

B. Zhou, S. Liu, B. Hooi, X. Cheng, J. Ye, BeatGAN: anomalous rhythm
detection using adversarially generated time series, Proc. of the 28th International
Joint Conference on Artificial Intelligence (IJCAI), pp. 4433-4439, 2019

I. Sharafaldin, A. Gharib, A. H. Lashkari, A. A. Ghorbani, Towards a
reliable intrusion detection benchmark dataset, Software Networking, Vol. 2018, No.
1, pp. 177-200, 2018

A. Thakkar, R. Lohiya, A review on machine learning and deep learning perspectives
of IDS for IoT: recent updates, security issues, and challenges, Archives of Computational
Methods in Engineering, Vol. 28, No. 4, pp. 3211-3243, 2021

M. A. Umar, Z. Chen, Y. Liu, Network intrusion detection using wrapper-based
decision tree for feature selection, Proc. of the 2020 International Conference on
Internet Computing for Science and Engineering, pp. 5-13, 2020

C. Khammassi, S. Krichen, A GA-LR wrapper approach for feature selection in
network intrusion detection, Computers & Security, Vol. 70, pp. 255-277, 2017

S. Farhat, M. Abdelkader, A. Meddeb-Makhlouf, F. Zarai, Evaluation of
DoS/DDoS attack detection with ML techniques on CIC-IDS2017 dataset, Proc. of the
International Conference on Information Systems Security and Privacy (ICISSP), pp.
287-295, 2023

P. Wu, H. Guo, N. Moustafa, Pelican: A deep residual network for network
intrusion detection, Proc. of the 2020 IEEE/IFIP International Conference on Dependable
Systems and Networks Workshops (DSN-W), pp. 55-62, 2020

A. D. Vibhute, M. Khan, C. H. Patil, S. V. Gaikwad, A. V. Mane, K.
K. Patel, Network anomaly detection and performance evaluation of convolutional
neural networks on UNSW-NB15 dataset, Procedia Computer Science, Vol. 235, pp. 2227-2236,
2024

K. Psychogyios, A. Papadakis, S. Bourou, N. Nikolaou, A. Maniatis,
T. Zahariadis, Deep learning for intrusion detection systems (IDSs) in time series
data, Future Internet, Vol. 16, No. 3, 2024

A. Corsini, S. J. Yang, G. Apruzzese, On the evaluation of sequential machine
learning for network intrusion detection, Proc. of the 16th International Conference
on Availability, Reliability and Security (ARES), pp. 1-10, 2021

Z. Xu, Y. Liu, Robust anomaly detection in network traffic: evaluating machine
learning models on CICIDS2017, Proc. of 2025 10th International Conference on Electronic
Technology and Information Science (ICETIS), 2025

M. Jouhari, H. Benaddi, K. Ibrahimi, Efficient intrusion detection: combining
$\chi^2$ feature selection with CNN-BiLSTM on the UNSW-NB15 dataset, Proc. of the 2024 11th
International Conference on Wireless Networks and Mobile Communications (WINCOM),
pp. 1-6, 2024

Z. Liu, D. Ye, C. Yang, Y. Ding, Y. Liu, L. Tang, C. Chen, Simplicity
over complexity: an ARN-based intrusion detection method for industrial control network,
arXiv preprint arXiv:2412.14669, 2024

O. Belarbi, A. Khan, P. Carnelli, T. Spyridopoulos, An intrusion detection
system based on deep belief networks, Proc. of 4th International Conference on Science
of Cyber Security, pp. 377-392, 2022

B. Cao, C. Li, Y. Song, Y. Qin, C. Chen, Network intrusion detection
model based on CNN and GRU, Applied Sciences, Vol. 12, pp. 4184, 2022

M. Khan, A. Rahman, S. Lee, Improving intrusion detection with hybrid deep
learning models: A study on CIC-IDS2017, UNSW-NB15, and KDD CUP 99, Journal of Information
Systems Engineering and Management, Vol. 10, No. 11s, pp. 1-12, 2025

A. S. Barkah, S. R. Selamat, Z. Z. Abidin, R. Wahyudi, Data generative
model to detect the anomalies for IDS imbalance CICIDS2017 dataset, TEM Journal, Vol.
12, No. 1, pp. 1-7, 2023

M. Al-Ajlan, M. Ykhlef, A review of generative adversarial networks for intrusion
detection systems: advances, challenges, and future directions, Computers, 2024

H. Gwon, C. Lee, R. Keum, H. Choi, Network intrusion detection based
on LSTM and feature embedding, arXiv preprint, 2019

M. Jouhari, M. Guizani, Lightweight CNN-BiLSTM based intrusion detection systems
for resource-constrained IoT devices, Proc. of the 2024 International Wireless Communications
and Mobile Computing Conference (IWCMC), pp. 1558-1563, 2024

S. Lotfi, M. Modirrousta, S. Shashaani, M. A. Shoorehdeli, Network intrusion
detection with limited labeled data using self-supervision, arXiv preprint, 2022

M. Injadat, A. Moubayed, A. B. Nassif, A. Shami, Multi-stage optimized
machine learning framework for network intrusion detection, IEEE Transactions on Network
and Service Management, Vol. 18, No. 2, pp. 1803-1816, 2020

Y. Yin, J. Jang-Jaccard, W. Xu, A. Singh, J. Zhu, F. Sabrina, J.
Kwak, IGRF-RFE: a hybrid feature selection method for MLP-based network intrusion
detection on UNSW-NB15 dataset, Journal of Big Data, Vol. 10, No. 1, 2023

B. Tafreshian, S. Zhang, A defensive framework against adversarial attacks
on machine learning-based network intrusion detection systems, Proc. of the 2024 IEEE
23rd International Conference on Trust, Security and Privacy in Computing and Communications
(TrustCom), pp. 2436-2441, 2024

T. T. Huynh , T. Nguyen Hoang , Effective multi-stage training model for edge
computing devices in intrusion detection, International Journal of Computer Networks
Communications, Vol. 16, 2024

C. S. Sampath , P. Anuradha , Intrusion detection using machine learning: A random
forest-based approach, International Journal for Multidisciplinary Research, Vol.
5, No. 3, pp. 1-6, 2023

N. Fathima , A. Pramod , Y. Srivastava , A. M. Thomas , Two-stage deep stacked
autoencoder with shallow learning for network intrusion detection system, arXiv preprint,
2021

R. Mohammad , F. Saeed , A. A. Almazroi , F. S. Alsubaei , A. A. Almazroi
, Enhancing intrusion detection systems using a deep learning and data augmentation
approach, Systems, Vol. 12, No. 3, 2024

X. Zhao , K. W. Fok , V. L. Thing , Enhancing network intrusion detection performance
using generative adversarial networks, Computers Security, Vol. 145, 2024

M. Umer , M. Tahir , M. Sardaraz , M. Sharif , H. Elmannai , A. D. Algarni
, Network intrusion detection model using wrapper based feature selection and multi
head attention transformers, Scientific Reports, Vol. 15, No. 1, 2025

F. S. Alsubaei , Smart deep learning model for enhanced IoT intrusion detection,
Scientific Reports, Vol. 15, No. 1, 2025

H. Hindy , R. Atkinson , C. Tachtatzis , J. N. Colin , E. Bayne , X. Bellekens
, Utilising deep learning techniques for effective zero-day attack detection, Electronics,
Vol. 9, No. 10, 2020

P. R. Agbedanu , R. Musabe , J. Rwigema , I. Gatare , Y. Pavlidis , IPCA-SAMKNN:
A novel network IDS for resource constrained devices, Proc. of the 2022 2nd International
Seminar on Machine Learning, Optimization, and Data Science (ISMODE), pp. 540-545,
2022

M. Farhan , H. Waheed Ud Din , S. Ullah , M. S. Hussain , M. A. Khan ,
T. Mazhar , Network-based intrusion detection using deep learning technique, Scientific
Reports, Vol. 15, No. 1, 2025

Y. Yang , X. Liu , D. Wang , Q. Sui , C. Yang , H. Li , A CE-GAN-based
approach to address data imbalance in network intrusion detection systems, Scientific
Reports, Vol. 15, No. 1, 2025

M. Tayebi, S. El Kafhali, Performance analysis of recurrent neural networks
for intrusion detection systems in Industrial Internet of Things, Franklin Open, Vol.
12, 2025

S. Khanam, I. Ahmedy, M. Y. I. Idris, M. H. Jaward, Towards an effective
intrusion detection model using focal loss variational autoencoder for internet of
things (IoT), Sensors, Vol. 22, No. 15, 2022

C. Haripriya, M. P. Jagadeesh, An efficient autoencoder-based deep learning
technique to detect network intrusions, International Transaction Journal of Engineering,
Management, & Applied Sciences & Technologies, Vol. 13, No. 7, pp. 1-10, 2022

M. Gourceyraud, R. B. Salem, C. Neal, F. Cuppens, N. B. Cuppens, Federated
intrusion detection system based on unsupervised machine learning, arXiv preprint,
2025

H. Chen, G.-R. You, Y.-R. Shiue, Hybrid intrusion detection system based
on data resampling and deep learning, International Journal of Advanced Computer Science
and Applications, Vol. 15, No. 2, 2024

W. Choukri, H. Lamaazi, N. Benamar, Abnormal network traffic detection using
deep learning models in IoT environment, Proc. of the 2021 3rd IEEE Middle East and
North Africa Communications Conference (MENACOMM), pp. 98-103, 2021

T. Sharma, S. Gandage, Network traffic classification using long-short term
memory algorithm on UNSW-NB15 and KDD-CUP99 dataset, Mathematical Statistician and
Engineering Applications, Vol. 71, No. 4, pp. 10166-10181, 2022

L. Xu, M. Skoularidou, A. Cuesta-Infante, K. Veeramachaneni, Modeling
tabular data using conditional GAN, Proc. of the 33rd International Conference on
Neural Information Processing Systems (NeurIPS), pp. 7335-7345, 2019

X. Zhao, K. W. Fok, V. L. L. Thing, Enhancing network intrusion detection
performance using generative adversarial networks, Computers & Security, Vol. 145,
pp. 104005, 2024

J. Koumar, K. Hynek, T. Cejka, P. Šiška, CESNET-TimeSeries24: time series
dataset for network traffic anomaly detection and forecasting, Scientific Data, Vol.
12, No. 1, pp. 338, 2025

J. Krupski, M. Iwanowski, W. Graniszewski, Extraction of minimal set of
traffic features using ensemble of classifiers and rank aggregation for network intrusion
detection systems, Applied Sciences, Vol. 14, No. 16, pp. 6995, 2024

S.-M. Tseng, Y.-Q. Wang, Y.-C. Wang, Multi-class intrusion detection based
on transformer for IoT networks using CIC-IoT-2023 dataset, Future Internet, Vol.
16, No. 8, pp. 284, 2024

W.-S. Park, G.-N. Kim, S. Lee, Intrusion detection system based on packet
payload analysis using transformer, Journal of The Korea Society of Computer and Information,
Vol. 28, No. 11, pp. 81-87, 2023

X. Tan, J. Cheng, H. Li, Y. Yang, Contrastive learning for network intrusion
detection: A comprehensive survey, Proc. of the 2024 2nd International Conference
on Computing, Internet of Things and Smart City (CIoTSC), pp. 160-166, 2025

M. A. Rahman, A survey on security and privacy of multimodal LLMs: connected healthcare
perspective, Proc. of the 2023 IEEE Globecom Workshops (GC Wkshps), pp. 1807-1812,
2023

Seoyeon Choi is currently an undergraduate student at Kwangwoon University, majoring
in computer information engineering. Her research interests include cryptography,
network security, and cyber security.
Songhye Kim is currently pursuing a B.S. degree in computer information engineering
at Kwangwoon University, Seoul, Republic of Korea. Her research interests include
cryptography, cybersecurity, and vulnerability analysis.
Jihyeon Ryu is an assistant professor with the School of Computer and Information
Engineering, Kwangwoon University. She received her B.S. degree in mathematics and
computer science and her Ph.D. degree in cyber security, both from Sungkyunkwan
University, Korea. Her research interests include cyber security, machine learning,
and user authentication.