ML-Based NIDS Datasets

Machine Learning-Based NIDS Datasets

The datasets on this page are designed for machine learning-based Network Intrusion Detection Systems (NIDS) and are organised into the following high-level collections:

NetFlow V3 Datasets: This collection consists of four datasets in NetFlow format and is the only version that incorporates temporal features. It builds on the 43 features of V2 by adding 10 temporal NetFlow features, for a total of 53 features. The datasets are provided in CSV format.
NetFlow V2 Datasets: Comprising five datasets in NetFlow format, this collection extends the V1 datasets with 43 enhanced NetFlow features. They are available in CSV format.
NetFlow V1 Datasets: This collection includes five datasets converted from four diverse formats into NetFlow format, featuring 12 basic NetFlow features. They are provided in CSV format.
CICFlowMeter Datasets: Consisting of five datasets, this collection is generated using the CICFlowMeter tool by converting three datasets from various formats into CICFlowMeter format, with 80 features. They are available in CSV format.

For full details and the recommended citation for these datasets, please refer to the corresponding sections below.

NetFlow v3 Datasets

Version 3 of the NetFlow datasets is made up of 53 extended NetFlow features. The details of the datasets are published in:

Majed Luay, Siamak Layeghy, Seyedehfaezeh Hosseininoorbin, Mohanad Sarhan, Nour Moustafa, and Marius Portmann, Temporal Analysis of NetFlow Datasets for Network Intrusion Detection Systems, arXiv:2503.04404, 2025.

Please use the following citation to reference these datasets:

@misc{luay2025NetFlowDatasetsV3,
title = {Temporal Analysis of NetFlow Datasets for Network Intrusion Detection Systems},
author = {Majed Luay and Siamak Layeghy and Seyedehfaezeh Hosseininoorbin and Mohanad Sarhan and Nour Moustafa and Marius Portmann},
year = {2025},
eprint = {2503.04404},
archivePrefix = {arXiv},
primaryClass = {cs.LG},
url = {https://arxiv.org/abs/2503.04404}
}

NF-UNSW-NB15-v3

Please click here to download the dataset.

The NF-UNSW-NB15-v3 dataset is a NetFlow-based version of the well-known UNSW-NB15 dataset, enhanced with additional NetFlow features and labelled according to its respective attack categories. It consists of a total of 2,365,424 data flows, of which 127,693 (5.4%) are attack samples and 2,237,731 (94.6%) are benign. The attack flows are categorised into nine classes, each representing a distinct cyber threat. The table below provides a detailed distribution of the dataset:

Class	Count	Description
Benign	2,237,731	Normal unmalicious flows
Fuzzers	33,816	An attack in which the attacker sends large amounts of random data which cause a system to crash and also aim to discover security vulnerabilities in a system.
Analysis	2,381	A group that presents a variety of threats that target web applications through ports, emails and scripts.
Backdoor	1,226	A technique that aims to bypass security mechanisms by replying to specific constructed client applications.
DoS	5,980	Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Exploits	42,748	Sequences of commands that control the behaviour of a host through a known vulnerability.
Generic	19,651	A method that targets cryptography and causes a collision with each block-cipher.
Reconnaissance	17,074	A technique for gathering information about a network host and is also known as a probe.
Shellcode	4,659	A malware that penetrates a code to control a victim's host.
Worms	158	Attacks that replicate themselves and spread to other computers.
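
The totals and percentages quoted for a dataset can be re-derived from its class table. The following is a minimal sketch using the NF-UNSW-NB15-v3 counts listed above; the helper name `summarise` is illustrative and not part of any published tooling.

```python
# Class counts copied from the NF-UNSW-NB15-v3 table above.
counts = {
    "Benign": 2_237_731,
    "Fuzzers": 33_816,
    "Analysis": 2_381,
    "Backdoor": 1_226,
    "DoS": 5_980,
    "Exploits": 42_748,
    "Generic": 19_651,
    "Reconnaissance": 17_074,
    "Shellcode": 4_659,
    "Worms": 158,
}


def summarise(class_counts, benign_label="Benign"):
    """Return (total, attack, benign, attack share in percent).

    `summarise` is a hypothetical helper: every class other than
    `benign_label` is treated as an attack class.
    """
    total = sum(class_counts.values())
    benign = class_counts[benign_label]
    attack = total - benign
    return total, attack, benign, 100.0 * attack / total


total, attack, benign, attack_pct = summarise(counts)
print(f"total={total:,} attack={attack:,} ({attack_pct:.1f}%) benign={benign:,}")
```

The same helper can be pointed at any of the other class tables on this page to cross-check the figures quoted in the accompanying text.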

NF-ToN-IoT-v3

Please click here to download the dataset.

The NF-ToN-IoT-v3 dataset is a NetFlow-based version of the well-known ToN-IoT dataset, enhanced with additional NetFlow features and labelled according to its respective attack categories. The total number of data flows is 27,520,260 out of which 10,728,046 (38.98%) are attack samples and 16,792,214 (61.02%) are benign ones. The table below lists and defines the distribution of the NF-ToN-IoT-v3 classes.

Class	Count	Description
Benign	16,792,214	Normal unmalicious flows
Backdoor	203,384	A technique that aims to attack remote-access computers by replying to specific constructed client applications.
DoS	203,456	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	4,141,256	An attempt similar to DoS but has multiple different distributed sources.
Injection	381,777	A variety of attacks that supply untrusted inputs that aim to alter the course of execution, with SQL and Code injections two of the main ones.
MITM	6,013	Man In The Middle is a method that places an attacker between a victim and host with which the victim is trying to communicate, with the aim of intercepting traffic and communications.
Password	1,594,777	Covers a variety of attacks aimed at retrieving passwords by either brute force or sniffing.
Ransomware	3,971	An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique/key.
Scanning	1,358,977	A group that consists of a variety of techniques that aim to discover information about networks and hosts, and is also known as probing.
XSS	2,834,435	Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users.

NF-BoT-IoT-v3

Please click here to download the dataset.

NF-BoT-IoT-v3 is an IoT NetFlow-based dataset generated by expanding the NF-BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 16,933,808, out of which 16,881,819 (99.7%) are attack samples and 51,989 (0.3%) are benign. There are four attack categories in the dataset. The table below shows the class distribution of all flows.

Class	Count	Description
Benign	51,989	Normal unmalicious flows
Reconnaissance	1,695,132	A technique for gathering information about a network host and is also known as a probe.
DDoS	7,150,882	Distributed Denial of Service is an attempt similar to DoS but has multiple different distributed sources.
DoS	8,034,190	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Theft	1,651	A group of attacks that aims to obtain sensitive data such as data theft and keylogging

NF-CSE-CIC-IDS2018-v3

Please click here to download the dataset.

The original pcap files of the CSE-CIC-IDS2018 dataset are utilised to generate a NetFlow-based dataset called NF-CSE-CIC-IDS2018-v3. The total number of flows is 20,115,529, out of which 2,600,903 (12.93%) are attack samples and 17,514,626 (87.07%) are benign. The table below shows the dataset's distribution.

Class	Count	Description
Benign	17,514,626	Normal unmalicious flows
BruteForce	575,194	A technique that aims to obtain usernames and password credentials by accessing a list of predefined possibilities
Bot	207,703	An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities.
DoS	302,966	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	1,324,350	An attempt similar to DoS but has multiple different distributed sources.
Infiltration	188,152	An inside attack that sends a malicious file via an email to exploit an application and is followed by a backdoor that scans the network for other vulnerabilities
Web Attacks	2,538	A group that includes SQL injections, command injections and unrestricted file uploads

NetFlow V2 Datasets

Version 2 of the NetFlow datasets is made up of 43 extended NetFlow features. The details of the datasets are published in:

Mohanad Sarhan, Siamak Layeghy, and Marius Portmann, Towards a Standard Feature Set for Network Intrusion Detection System Datasets, Mobile Networks and Applications, 103, 108379, 2022.

Please use the following citation to reference these datasets:

@article{sarhan2022towards,
title = {Towards a Standard Feature Set for Network Intrusion Detection System Datasets},
author = {Mohanad Sarhan and Siamak Layeghy and Marius Portmann},
year = {2022},
journal = {Mobile networks and applications},
pages = {1--14},
publisher = {Springer US},
url = {https://doi.org/10.1007/s11036-021-01843-0}
}

NF-UNSW-NB15-v2

Please click here to download the dataset.

The NetFlow-based format of the UNSW-NB15 dataset, named NF-UNSW-NB15, has been expanded with additional NetFlow features and labelled with its respective attack categories. The total number of data flows is 2,390,275, out of which 95,053 (3.98%) are attack samples and 2,295,222 (96.02%) are benign. The attack samples are further classified into nine subcategories. The table below shows the NF-UNSW-NB15-v2 dataset's distribution of all flows.

Class	Count	Description
Benign	2295222	Normal unmalicious flows
Fuzzers	22310	An attack in which the attacker sends large amounts of random data which cause a system to crash and also aim to discover security vulnerabilities in a system.
Analysis	2299	A group that presents a variety of threats that target web applications through ports, emails and scripts.
Backdoor	2169	A technique that aims to bypass security mechanisms by replying to specific constructed client applications.
DoS	5794	Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Exploits	31551	Sequences of commands that control the behaviour of a host through a known vulnerability.
Generic	16560	A method that targets cryptography and causes a collision with each block-cipher.
Reconnaissance	12779	A technique for gathering information about a network host and is also known as a probe.
Shellcode	1427	A malware that penetrates a code to control a victim's host.
Worms	164	Attacks that replicate themselves and spread to other computers.

NF-ToN-IoT-v2

Please click here to download the dataset.

The publicly available pcaps of the ToN-IoT dataset are utilised to generate its NetFlow records, leading to a NetFlow-based IoT network dataset called NF-ToN-IoT-v2. The total number of data flows is 16,940,496, out of which 10,841,027 (63.99%) are attack samples and 6,099,469 (36.01%) are benign. The table below lists and defines the distribution of the NF-ToN-IoT-v2 dataset.

Class	Count	Description
Benign	6099469	Normal unmalicious flows
Backdoor	16809	A technique that aims to attack remote-access computers by replying to specific constructed client applications.
DoS	712609	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	2026234	An attempt similar to DoS but has multiple different distributed sources.
Injection	684465	A variety of attacks that supply untrusted inputs that aim to alter the course of execution, with SQL and Code injections two of the main ones.
MITM	7723	Man In The Middle is a method that places an attacker between a victim and host with which the victim is trying to communicate, with the aim of intercepting traffic and communications.
Password	1153323	Covers a variety of attacks aimed at retrieving passwords by either brute force or sniffing.
Ransomware	3425	An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique/key.
Scanning	3781419	A group that consists of a variety of techniques that aim to discover information about networks and hosts, and is also known as probing.
XSS	2455020	Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users.

NF-BoT-IoT-v2

Please click here to download the dataset.

NF-BoT-IoT-v2 is an IoT NetFlow-based dataset generated by expanding the NF-BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 37,763,497, out of which 37,628,460 (99.64%) are attack samples and 135,037 (0.36%) are benign. There are four attack categories in the dataset. The table below shows the NF-BoT-IoT-v2 distribution of all flows.

Class	Count	Description
Benign	135037	Normal unmalicious flows
Reconnaissance	2620999	A technique for gathering information about a network host and is also known as a probe.
DDoS	18331847	Distributed Denial of Service is an attempt similar to DoS but has multiple different distributed sources.
DoS	16673183	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Theft	2431	A group of attacks that aims to obtain sensitive data such as data theft and keylogging

NF-CSE-CIC-IDS2018-v2

Please click here to download the dataset.

The original pcap files of the CSE-CIC-IDS2018 dataset are utilised to generate a NetFlow-based dataset called NF-CSE-CIC-IDS2018-v2. The total number of flows is 18,893,708, out of which 2,258,141 (11.95%) are attack samples and 16,635,567 (88.05%) are benign. The table below shows the dataset's distribution.

Class	Count	Description
Benign	16635567	Normal unmalicious flows
BruteForce	120912	A technique that aims to obtain usernames and password credentials by accessing a list of predefined possibilities
Bot	143097	An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities.
DoS	483999	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	1390270	An attempt similar to DoS but has multiple different distributed sources.
Infiltration	116361	An inside attack that sends a malicious file via an email to exploit an application and is followed by a backdoor that scans the network for other vulnerabilities
Web Attacks	3502	A group that includes SQL injections, command injections and unrestricted file uploads

NF-UQ-NIDS-v2

Please click here to download the dataset.

NF-UQ-NIDS-v2 is a comprehensive dataset that merges all of the aforementioned V2 datasets. It demonstrates the benefit of a shared feature set: multiple smaller datasets can be merged, which will eventually lead to a bigger, more universal NIDS dataset containing flows from multiple network setups and different attack settings. It includes an additional label feature identifying the original dataset of each flow, which can be used to compare the same attack scenarios conducted over two or more different testbed networks.

The attack categories have been modified to combine all parent categories. Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDoS attack-LOIC-UDP, DDoS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined as a brute-force category. Finally, SQL Injection attacks have been included in the injection attacks category. The NF-UQ-NIDS-v2 dataset has a total of 75,987,976 records, out of which 25,165,295 (33.12%) are benign flows and 50,822,681 (66.88%) are attacks. The table below lists the distribution of the final attack categories.

Class	Count
Benign	25165295
DDoS	21748351
Reconnaissance	2633778
Injection	684897
DoS	17875585
Brute Force	123982
Password	1153323
XSS	2455020
Infilteration	116361
Exploits	31551
Scanning	3781419
Fuzzers	22310
Backdoor	18978
Bot	143097
Generic	16560
Analysis	2299
Theft	2431
Shellcode	1427
MITM	7723
Worms	164
Ransomware	3425
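
The renaming of sub-categories to parent categories and the addition of a source-dataset label, as described above, can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to build NF-UQ-NIDS-v2: the sub-category names come from the description above, while the field names `Attack` and `Dataset` and the helper functions are hypothetical.

```python
# Sub-category names are taken from the description above.
# Matching is case-sensitive, so spelling variants would need their own keys.
PARENT_CATEGORY = {
    "DoS attacks-Hulk": "DoS",
    "DoS attacks-SlowHTTPTest": "DoS",
    "DoS attacks-GoldenEye": "DoS",
    "DoS attacks-Slowloris": "DoS",
    "DDoS attack-LOIC-UDP": "DDoS",
    "DDoS attack-HOIC": "DDoS",
    "DDoS attacks-LOIC-HTTP": "DDoS",
    "FTP-BruteForce": "Brute Force",
    "SSH-Bruteforce": "Brute Force",
    "Brute Force -Web": "Brute Force",
    "Brute Force -XSS": "Brute Force",
    "SQL Injection": "Injection",
}


def to_parent(label):
    """Map a sub-category to its parent; other labels pass through unchanged."""
    return PARENT_CATEGORY.get(label, label)


def merge(datasets):
    """Merge flows from several datasets into one list.

    `datasets` maps a dataset name to a list of flow records (dicts).
    The "Attack" and "Dataset" field names are assumptions for this sketch.
    """
    merged = []
    for name, flows in datasets.items():
        for flow in flows:
            row = dict(flow)                      # copy, leave the input untouched
            row["Attack"] = to_parent(row["Attack"])
            row["Dataset"] = name                 # records the flow's origin
            merged.append(row)
    return merged


# Toy records for illustration only; these are not real flows.
example = {
    "NF-CSE-CIC-IDS2018-v2": [{"Attack": "DoS attacks-Hulk"}, {"Attack": "SQL Injection"}],
    "NF-BoT-IoT-v2": [{"Attack": "Theft"}],
}
merged = merge(example)
```

Keeping the mapping in a single dictionary makes the category decisions easy to audit and to extend if further sub-categories appear.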

NetFlow V1 Datasets

Version 1 of the NetFlow datasets is made up of 12 basic NetFlow features. The details of the datasets are published in:

Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, and Marius Portmann, NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. In: Big Data Technologies and Applications. BDTA 2020, WiCON 2020. Springer, Cham, 2021.

Please use the following citation to reference these datasets:

@inproceedings{sarhan2021netflow,
title = {{Netflow Datasets for Machine Learning-based Network Intrusion Detection Systems}},
author = {Mohanad Sarhan and Siamak Layeghy and Nour Moustafa and Marius Portmann},
year = {2021},
booktitle = {Big Data Technologies and Applications: 10th EAI International Conference, BDTA 2020, and 13th EAI International Conference on Wireless Internet, WiCON 2020, Virtual Event, December 11, 2020, Proceedings 10},
pages = {117--135},
organization = {Springer International Publishing},
url = {https://doi.org/10.1007/978-3-030-72802-1_9}
}

NF-UNSW-NB15

Please click here to download the dataset.

The NetFlow-based format of the UNSW-NB15 dataset, named NF-UNSW-NB15, has been developed and labelled with its respective attack categories. The total number of data flows is 1,623,118, out of which 72,406 (4.46%) are attack samples and 1,550,712 (95.54%) are benign. The attack samples are further classified into nine subcategories. The table below shows the NF-UNSW-NB15 dataset's distribution of all flows.

Class	Count	Description
Benign	1550712	Normal unmalicious flows
Fuzzers	19463	An attack in which the attacker sends large amounts of random data which cause a system to crash and also aim to discover security vulnerabilities in a system.
Analysis	1995	A group that presents a variety of threats that target web applications through ports, emails and scripts.
Backdoor	1782	A technique that aims to bypass security mechanisms by replying to specific constructed client applications.
DoS	5051	Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Exploits	24736	Sequences of commands that control the behaviour of a host through a known vulnerability.
Generic	5570	A method that targets cryptography and causes a collision with each block-cipher.
Reconnaissance	12291	A technique for gathering information about a network host and is also known as a probe.
Shellcode	1365	A malware that penetrates a code to control a victim's host.
Worms	153	Attacks that replicate themselves and spread to other computers.

NF-ToN-IoT

Please click here to download the dataset.

We utilised the publicly available pcaps of the ToN-IoT dataset to generate its NetFlow records, leading to a NetFlow-based IoT network dataset called NF-ToN-IoT. The total number of data flows is 1,379,274, out of which 1,108,995 (80.4%) are attack samples and 270,279 (19.6%) are benign. The table below lists and defines the distribution of the NF-ToN-IoT dataset.

Class	Count	Description
Benign	270279	Normal unmalicious flows
Backdoor	17247	A technique that aims to attack remote-access computers by replying to specific constructed client applications.
DoS	17717	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	326345	An attempt similar to DoS but has multiple different distributed sources.
Injection	468539	A variety of attacks that supply untrusted inputs that aim to alter the course of execution, with SQL and Code injections two of the main ones.
MITM	1295	Man In The Middle is a method that places an attacker between a victim and host with which the victim is trying to communicate, with the aim of intercepting traffic and communications.
Password	156299	Covers a variety of attacks aimed at retrieving passwords by either brute force or sniffing.
Ransomware	142	An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique/key.
Scanning	21467	A group that consists of a variety of techniques that aim to discover information about networks and hosts, and is also known as probing.
XSS	99944	Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users.

NF-BoT-IoT

Please click here to download the dataset.

An IoT NetFlow-based dataset, named NF-BoT-IoT, was generated using the BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 600,100, out of which 586,241 (97.69%) are attack samples and 13,859 (2.31%) are benign. There are four attack categories in the dataset. The table below shows the NF-BoT-IoT distribution of all flows.

Class	Count	Description
Benign	13859	Normal unmalicious flows
Reconnaissance	470655	A technique for gathering information about a network host and is also known as a probe.
DDoS	56844	Distributed Denial of Service is an attempt similar to DoS but has multiple different distributed sources.
DoS	56833	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Theft	1909	A group of attacks that aims to obtain sensitive data such as data theft and keylogging

NF-CSE-CIC-IDS2018

Please click here to download the dataset.

We utilised the original pcap files of the CSE-CIC-IDS2018 dataset to generate a NetFlow-based dataset called NF-CSE-CIC-IDS2018. The total number of flows is 8,392,401, out of which 1,019,203 (12.14%) are attack samples and 7,373,198 (87.86%) are benign. The table below shows the dataset's distribution.

Class	Count	Description
Benign	7373198	Normal unmalicious flows
BruteForce	287597	A technique that aims to obtain usernames and password credentials by accessing a list of predefined possibilities
Bot	15683	An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities.
DoS	269361	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	380096	An attempt similar to DoS but has multiple different distributed sources.
Infiltration	62072	An inside attack that sends a malicious file via an email to exploit an application and is followed by a backdoor that scans the network for other vulnerabilities
Web Attacks	4394	A group that includes SQL injections, command injections and unrestricted file uploads

NF-UQ-NIDS

Please click here to download the dataset.

NF-UQ-NIDS is a comprehensive dataset that merges all of the aforementioned V1 datasets. It demonstrates the benefit of a shared feature set: multiple smaller datasets can be merged, which will eventually lead to a bigger, more universal NIDS dataset containing flows from multiple network setups and different attack settings. An additional label feature identifies the original dataset of each flow, which can be used to compare the same attack scenarios conducted over two or more different testbed networks.

The attack categories have been modified to combine all parent categories. Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDOS attack-LOIC-UDP, DDOS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined as a brute-force category. Finally, SQL Injection attacks have been included in the injection attacks category. The NF-UQ-NIDS dataset has a total of 11,994,893 records, out of which 9,208,048 (76.77%) are benign flows and 2,786,845 (23.23%) are attacks. The table below lists the distribution of the final attack categories.

Class	Count
Benign	9208048
DDoS	763285
Reconnaissance	482946
Injection	468575
DoS	348962
Brute Force	291955
Password	156299
XSS	99944
Infilteration	62072
Exploits	24736
Scanning	21467
Fuzzers	19463
Backdoor	17247
Bot	15683
Generic	5570
Analysis	1995
Theft	1909
Shellcode	1365
MITM	1295
Worms	153
Ransomware	142
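
Because NF-UQ-NIDS is a merge of the four V1 datasets, the merged count for a class that keeps its name should equal the sum of that class across the source tables. The sketch below checks this for four classes using the counts listed in the V1 tables on this page; the helper name `merge_counts` is illustrative.

```python
from collections import Counter

# Per-dataset counts copied from the V1 tables above (selected classes only).
per_dataset = {
    "NF-UNSW-NB15": {"Benign": 1_550_712, "DoS": 5_051, "Reconnaissance": 12_291},
    "NF-ToN-IoT": {"Benign": 270_279, "DoS": 17_717, "DDoS": 326_345},
    "NF-BoT-IoT": {"Benign": 13_859, "DoS": 56_833, "DDoS": 56_844,
                   "Reconnaissance": 470_655},
    "NF-CSE-CIC-IDS2018": {"Benign": 7_373_198, "DoS": 269_361, "DDoS": 380_096},
}

# Counts for the same classes copied from the NF-UQ-NIDS table above.
merged_table = {
    "Benign": 9_208_048,
    "DoS": 348_962,
    "DDoS": 763_285,
    "Reconnaissance": 482_946,
}


def merge_counts(tables):
    """Sum class counts across datasets (hypothetical helper)."""
    total = Counter()
    for class_counts in tables.values():
        total.update(class_counts)  # Counter.update adds counts per key
    return dict(total)


summed = merge_counts(per_dataset)
for label, expected in merged_table.items():
    status = "ok" if summed[label] == expected else "mismatch"
    print(f"{label}: summed={summed[label]:,} table={expected:,} {status}")
```

Classes that are renamed during the merge, such as the brute-force and SQL injection sub-categories, need the parent-category mapping applied before their counts can be compared in this way.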

CICFlowMeter Datasets

The CICFlowMeter datasets are made up of 80 features. The details of the datasets are published in:

Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, and Marius Portmann, Evaluating Standard Feature Sets Towards Increased Generalisability and Explainability of ML-based Network Intrusion Detection, Big Data Research, 30, 100359, 2022.

Please use the following citation to reference these datasets:

@article{sarhan2022evaluating,
title = {{Evaluating Standard Feature Sets Towards Increased Generalisability and Explainability of ML-based Network Intrusion Detection}},
author = {Mohanad Sarhan and Siamak Layeghy and Marius Portmann},
year = {2022},
journal = {Big Data Research},
volume = {30},
pages = {100359},
publisher = {Elsevier},
url = {https://doi.org/10.1016/j.bdr.2022.100359}
}

CIC-UNSW-NB15

Please click here to download the dataset.

The CICFlowMeter-based format of the UNSW-NB15 dataset, named CIC-UNSW-NB15, has been developed and labelled with its respective attack categories. The total number of data flows is 2,540,044 out of which 321,283 (12.65%) are attack samples and 2,218,761 (87.35%) are benign. The attack samples are further classified into nine subcategories. The table below represents the CIC-UNSW-NB15 dataset's distribution of all flows.

Class	Count	Description
Benign	2218761	Normal unmalicious flows
Fuzzers	24246	An attack in which the attacker sends large amounts of random data which cause a system to crash and also aim to discover security vulnerabilities in a system.
Analysis	2677	A group that presents a variety of threats that target web applications through ports, emails and scripts.
Backdoor	2329	A technique that aims to bypass security mechanisms by replying to specific constructed client applications.
DoS	16353	Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Exploits	44525	Sequences of commands that control the behaviour of a host through a known vulnerability.
Generic	215481	A method that targets cryptography and causes a collision with each block-cipher.
Reconnaissance	13987	A technique for gathering information about a network host and is also known as a probe.
Shellcode	1511	A malware that penetrates a code to control a victim's host.
Worms	174	Attacks that replicate themselves and spread to other computers.

CIC-ToN-IoT

Please click here to download the dataset.

We utilised the publicly available pcaps of the ToN-IoT dataset to generate its CICFlowMeter records, leading to a CICFlowMeter-based IoT network dataset called CIC-ToN-IoT. The total number of data flows is 1,846,373, out of which 461,934 (25.02%) are attack samples and 1,384,439 (74.98%) are benign. The table below lists and defines the distribution of the CIC-ToN-IoT dataset.

Class	Count	Description
Benign	1384439	Normal unmalicious flows
Backdoor	19126	A technique that aims to attack remote-access computers by replying to specific constructed client applications.
DoS	19243	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	193252	An attempt similar to DoS but has multiple different distributed sources.
Injection	72534	A variety of attacks that supply untrusted inputs that aim to alter the course of execution, with SQL and Code injections two of the main ones.
MITM	1348	Man In The Middle is a method that places an attacker between a victim and host with which the victim is trying to communicate, with the aim of intercepting traffic and communications.
Password	15277	Covers a variety of attacks aimed at retrieving passwords by either brute force or sniffing.
Ransomware	149	An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique/key.
Scanning	105699	A group that consists of a variety of techniques that aim to discover information about networks and hosts, and is also known as probing.
XSS	25306	Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users.

CIC-BoT-IoT

Please click here to download the dataset.

An IoT CICFlowMeter-based dataset, named CIC-BoT-IoT, was generated using the BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 3,668,045, out of which 3,661,535 (99.82%) are attack samples and 6,510 (0.18%) are benign. There are four attack categories in the dataset. The table below shows the CIC-BoT-IoT distribution of all flows.

Class	Count	Description
Benign	6510	Normal unmalicious flows
Reconnaissance	733197	A technique for gathering information about a network host and is also known as a probe.
DDoS	1465678	Distributed Denial of Service is an attempt similar to DoS but has multiple different distributed sources.
DoS	1462659	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
Theft	1	A group of attacks that aims to obtain sensitive data such as data theft and keylogging

CIC-CSE-CIC-IDS2018

Please click here to download the dataset.

We utilised the original pcap files of the CSE-CIC-IDS2018 dataset to generate a CICFlowMeter-based dataset called CIC-CSE-CIC-IDS2018. The total number of flows is 19,177,873, out of which 2,708,742 (14.13%) are attack samples and 16,469,131 (85.87%) are benign. The table below shows the dataset's distribution.

Class	Count	Description
Benign	16469131	Normal unmalicious flows
BruteForce	380374	A technique that aims to obtain usernames and password credentials by accessing a list of predefined possibilities
Bot	286191	An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities.
DoS	653077	An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data.
DDoS	1387098	An attempt similar to DoS but has multiple different distributed sources.
Infiltration	161	An inside attack that sends a malicious file via an email to exploit an application and is followed by a backdoor that scans the network for other vulnerabilities
Web Attacks	1841	A group that includes SQL injections, command injections and unrestricted file uploads

CIC-UQ-NIDS

Please click here to download the dataset.

CIC-UQ-NIDS is a comprehensive dataset that merges all of the aforementioned CICFlowMeter datasets. It demonstrates the benefit of a shared feature set: multiple smaller datasets can be merged, which will eventually lead to a bigger, more universal NIDS dataset containing flows from multiple network setups and different attack settings. An additional label feature identifies the original dataset of each flow, which can be used to compare the same attack scenarios conducted over two or more different testbed networks.

The attack categories have been modified to combine all parent categories. Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDOS attack-LOIC-UDP, DDOS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined as a brute-force category. Finally, SQL Injection attacks have been included in the injection attacks category. The CIC-UQ-NIDS dataset has a total of 27,232,335 records, out of which 19,040,124 (69.92%) are benign flows and 8,192,211 (30.08%) are attacks. The table below lists the distribution of the final attack categories.

Class	Count
Benign	19040124
DDoS	3046028
Reconnaissance	747184
Injection	72534
DoS	2131032
Brute Force	382215
Password	15277
XSS	25306
Infilteration	161
Exploits	44525
Scanning	105699
Fuzzers	24246
Backdoor	19126
Bot	286191
Generic	215481
Analysis	2677
Theft	1
Shellcode	1511
MITM	1348
Worms	174
Ransomware	149
Web Attacks	1841
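
The source-dataset label described above makes it possible to compare one attack class across testbeds. The following is a minimal sketch of such a comparison on a handful of toy records; the field names `Dataset` and `Attack` and the function name are assumptions made for illustration, and the records are not real flows.

```python
from collections import Counter


def attack_counts_by_dataset(flows, attack,
                             dataset_field="Dataset", attack_field="Attack"):
    """Count flows of one attack class per source dataset.

    The field names are hypothetical; adjust them to the columns
    actually present in the downloaded CSV file.
    """
    return Counter(
        flow[dataset_field] for flow in flows if flow[attack_field] == attack
    )


# Toy records for illustration only.
flows = [
    {"Dataset": "CIC-ToN-IoT", "Attack": "DoS"},
    {"Dataset": "CIC-BoT-IoT", "Attack": "DoS"},
    {"Dataset": "CIC-BoT-IoT", "Attack": "DoS"},
    {"Dataset": "CIC-BoT-IoT", "Attack": "DDoS"},
    {"Dataset": "CIC-UNSW-NB15", "Attack": "Benign"},
]

dos_by_dataset = attack_counts_by_dataset(flows, "DoS")
for dataset, count in sorted(dos_by_dataset.items()):
    print(f"{dataset}: {count} DoS flows")
```

For the full CSV files, the same grouping would more likely be done in a streaming or dataframe-based fashion, since the merged datasets contain tens of millions of records.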
