Version 3 of the NetFlow datasets is made up of 53 extended NetFlow features. The details of the datasets are published in:
Majed Luay, Siamak Layeghy, Seyedehfaezeh Hosseininoorbin, Mohanad Sarhan, Nour Moustafa, and Marius Portmann, "Temporal Analysis of NetFlow Datasets for Network Intrusion Detection Systems". Please cite this paper when referencing these datasets.
Please click here to download the dataset.
The NF-UNSW-NB15-v3 dataset is a NetFlow-based version of the well-known UNSW-NB15 dataset, enhanced with additional NetFlow features and labelled according to its respective attack categories. It consists of 2,365,424 data flows, of which 127,693 (5.4%) are attack samples and 2,237,731 (94.6%) are benign. The attack flows are categorised into nine classes, each representing a distinct cyber threat. The table below gives the class distribution of the dataset:
Class | Count | Description |
---|---|---|
Benign | 2,237,731 | Normal, non-malicious flows. |
Fuzzers | 33,816 | An attack in which the attacker sends large amounts of random data to crash a system or to discover security vulnerabilities in it. |
Analysis | 2,381 | A group of threats that target web applications through ports, emails and scripts. |
Backdoor | 1,226 | A technique that aims to bypass security mechanisms by replying to specifically constructed client applications. |
DoS | 5,980 | Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Exploits | 42,748 | Sequences of commands that control the behaviour of a host through a known vulnerability. |
Generic | 19,651 | A method that targets cryptography and causes a collision with each block-cipher. |
Reconnaissance | 17,074 | A technique for gathering information about a network host, also known as a probe. |
Shellcode | 4,659 | Malware that penetrates a piece of code to control a victim's host. |
Worms | 158 | Attacks that replicate themselves and spread to other computers. |
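The headline figures can be cross-checked against the table. The sketch below is illustrative only: it copies the counts from the table above into a Python dictionary (the names `NF_UNSW_NB15_V3` and `summarise` are hypothetical) and recomputes the total, the number of attack flows, and the attack share.

```python
# Class counts copied from the NF-UNSW-NB15-v3 table above.
NF_UNSW_NB15_V3 = {
    "Benign": 2_237_731,
    "Fuzzers": 33_816,
    "Analysis": 2_381,
    "Backdoor": 1_226,
    "DoS": 5_980,
    "Exploits": 42_748,
    "Generic": 19_651,
    "Reconnaissance": 17_074,
    "Shellcode": 4_659,
    "Worms": 158,
}


def summarise(counts, benign_label="Benign"):
    """Return (total flows, attack flows, attack share in percent)."""
    total = sum(counts.values())
    attacks = total - counts[benign_label]
    return total, attacks, 100.0 * attacks / total


total, attacks, attack_pct = summarise(NF_UNSW_NB15_V3)
print(f"{total:,} flows, {attacks:,} attacks ({attack_pct:.1f}%)")
```

The same helper can be applied to any of the other tables by swapping in their counts.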
Please click here to download the dataset.
The NF-ToN-IoT-v3 dataset is a NetFlow-based version of the well-known ToN-IoT dataset, enhanced with additional NetFlow features and labelled according to its respective attack categories. The total number of data flows is 27,520,260, of which 10,728,046 (38.98%) are attack samples and 16,792,214 (61.02%) are benign. The table below lists and defines the NF-ToN-IoT-v3 classes and their distribution.
Class | Count | Description |
---|---|---|
Benign | 16,792,214 | Normal, non-malicious flows. |
Backdoor | 203,384 | A technique that aims to attack remote-access computers by replying to specifically constructed client applications. |
DoS | 203,456 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 4,141,256 | An attempt similar to DoS but launched from multiple distributed sources. |
Injection | 381,777 | A variety of attacks that supply untrusted inputs with the aim of altering the course of execution; SQL and code injection are two of the main ones. |
MITM | 6,013 | Man in the Middle is a method that places an attacker between a victim and the host with which the victim is trying to communicate, with the aim of intercepting traffic and communications. |
Password | 1,594,777 | A variety of attacks aimed at retrieving passwords by either brute force or sniffing. |
Ransomware | 3,971 | An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique or key. |
Scanning | 1,358,977 | A group of techniques that aim to discover information about networks and hosts, also known as probing. |
XSS | 2,834,435 | Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users. |
Please click here to download the dataset.
An IoT NetFlow-based dataset was generated by expanding the NF-BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 16,933,808, of which 16,881,819 (99.7%) are attack samples and 51,989 (0.3%) are benign. There are four attack categories in the dataset. The table below shows the class distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 51,989 | Normal, non-malicious flows. |
Reconnaissance | 1,695,132 | A technique for gathering information about a network host, also known as a probe. |
DDoS | 7,150,882 | Distributed Denial of Service is an attempt similar to DoS but launched from multiple distributed sources. |
DoS | 8,034,190 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Theft | 1,651 | A group of attacks that aim to obtain sensitive data, such as data theft and keylogging. |
Please click here to download the dataset.
The original pcap files of the CSE-CIC-IDS2018 dataset are utilised to generate a NetFlow-based dataset called NF-CSE-CIC-IDS2018-v3. The total number of flows is 20,115,529, of which 2,600,903 (12.93%) are attack samples and 17,514,626 (87.07%) are benign. The table below shows the dataset's distribution.
Class | Count | Description |
---|---|---|
Benign | 17,514,626 | Normal, non-malicious flows. |
BruteForce | 575,194 | A technique that aims to obtain usernames and passwords by trying a list of predefined possibilities. |
Bot | 207,703 | An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities. |
DoS | 302,966 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 1,324,350 | An attempt similar to DoS but launched from multiple distributed sources. |
Infiltration | 188,152 | An inside attack that sends a malicious file via email to exploit an application, followed by a backdoor that scans the network for other vulnerabilities. |
Web Attacks | 2,538 | A group that includes SQL injections, command injections and unrestricted file uploads. |
Version 2 of the datasets is made up of 43 extended NetFlow features explained here.
Please click here to download the datasets in CSV format. The details of the datasets are published in:
Mohanad Sarhan, Siamak Layeghy, and Marius Portmann, Towards a Standard Feature Set for Network Intrusion Detection System Datasets, Mobile Networks and Applications, 103, 108379, 2022. https://doi.org/10.1007/s11036-021-01843-0
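Once a CSV file has been downloaded, the class distribution reported for each dataset can be recomputed directly from it. The following is a minimal sketch using only the Python standard library. The label column name `Attack`, the feature column names in the sample, and the helper name `class_distribution` are assumptions made for illustration and should be checked against the header of the actual file. A small in-memory sample stands in for a real download.

```python
import csv
import io
from collections import Counter


def class_distribution(csv_file, label_column="Attack"):
    """Count flows per class in an open CSV file.

    `label_column` is an assumed column name; check the file header.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Tiny made-up sample standing in for a downloaded dataset file.
sample = io.StringIO(
    "IN_BYTES,OUT_BYTES,Attack\n"
    "120,60,Benign\n"
    "4000,0,DoS\n"
    "80,40,Benign\n"
)

counts = class_distribution(sample)
for name, count in counts.most_common():
    print(f"{name}: {count:,}")
```

For a real file, the same function can be called on a handle opened with `open(path, newline="")`. Rows are streamed one at a time, so the file does not need to fit in memory.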
Please click here to download the dataset.
The NetFlow-based format of the UNSW-NB15 dataset, named NF-UNSW-NB15, has been expanded with additional NetFlow features and labelled with its respective attack categories. The total number of data flows is 2,390,275, of which 95,053 (3.98%) are attack samples and 2,295,222 (96.02%) are benign. The attack samples are further classified into nine subcategories. The table below shows the NF-UNSW-NB15-v2 dataset's distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 2,295,222 | Normal, non-malicious flows. |
Fuzzers | 22,310 | An attack in which the attacker sends large amounts of random data to crash a system or to discover security vulnerabilities in it. |
Analysis | 2,299 | A group of threats that target web applications through ports, emails and scripts. |
Backdoor | 2,169 | A technique that aims to bypass security mechanisms by replying to specifically constructed client applications. |
DoS | 5,794 | Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Exploits | 31,551 | Sequences of commands that control the behaviour of a host through a known vulnerability. |
Generic | 16,560 | A method that targets cryptography and causes a collision with each block-cipher. |
Reconnaissance | 12,779 | A technique for gathering information about a network host, also known as a probe. |
Shellcode | 1,427 | Malware that penetrates a piece of code to control a victim's host. |
Worms | 164 | Attacks that replicate themselves and spread to other computers. |
Please click here to download the dataset.
The publicly available pcaps of the ToN-IoT dataset are utilised to generate its NetFlow records, leading to a NetFlow-based IoT network dataset called NF-ToN-IoT-v2. The total number of data flows is 16,940,496, of which 10,841,027 (63.99%) are attack samples and 6,099,469 (36.01%) are benign. The table below lists and defines the classes of the NF-ToN-IoT-v2 dataset and their distribution.
Class | Count | Description |
---|---|---|
Benign | 6,099,469 | Normal, non-malicious flows. |
Backdoor | 16,809 | A technique that aims to attack remote-access computers by replying to specifically constructed client applications. |
DoS | 712,609 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 2,026,234 | An attempt similar to DoS but launched from multiple distributed sources. |
Injection | 684,465 | A variety of attacks that supply untrusted inputs with the aim of altering the course of execution; SQL and code injection are two of the main ones. |
MITM | 7,723 | Man in the Middle is a method that places an attacker between a victim and the host with which the victim is trying to communicate, with the aim of intercepting traffic and communications. |
Password | 1,153,323 | A variety of attacks aimed at retrieving passwords by either brute force or sniffing. |
Ransomware | 3,425 | An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique or key. |
Scanning | 3,781,419 | A group of techniques that aim to discover information about networks and hosts, also known as probing. |
XSS | 2,455,020 | Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users. |
Please click here to download the dataset.
An IoT NetFlow-based dataset was generated by expanding the NF-BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 37,763,497, of which 37,628,460 (99.64%) are attack samples and 135,037 (0.36%) are benign. There are four attack categories in the dataset. The table below shows the NF-BoT-IoT-v2 distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 135,037 | Normal, non-malicious flows. |
Reconnaissance | 2,620,999 | A technique for gathering information about a network host, also known as a probe. |
DDoS | 18,331,847 | Distributed Denial of Service is an attempt similar to DoS but launched from multiple distributed sources. |
DoS | 16,673,183 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Theft | 2,431 | A group of attacks that aim to obtain sensitive data, such as data theft and keylogging. |
Please click here to download the dataset.
The original pcap files of the CSE-CIC-IDS2018 dataset are utilised to generate a NetFlow-based dataset called NF-CSE-CIC-IDS2018-v2. The total number of flows is 18,893,708, of which 2,258,141 (11.95%) are attack samples and 16,635,567 (88.05%) are benign. The table below shows the dataset's distribution.
Class | Count | Description |
---|---|---|
Benign | 16,635,567 | Normal, non-malicious flows. |
BruteForce | 120,912 | A technique that aims to obtain usernames and passwords by trying a list of predefined possibilities. |
Bot | 143,097 | An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities. |
DoS | 483,999 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 1,390,270 | An attempt similar to DoS but launched from multiple distributed sources. |
Infiltration | 116,361 | An inside attack that sends a malicious file via email to exploit an application, followed by a backdoor that scans the network for other vulnerabilities. |
Web Attacks | 3,502 | A group that includes SQL injections, command injections and unrestricted file uploads. |
Please click here to download the dataset.
NF-UQ-NIDS-v2 is a comprehensive dataset that merges all of the aforementioned datasets. It demonstrates the benefit of a shared feature set: multiple smaller datasets can be merged into one. This will eventually lead to a bigger and more universal NIDS dataset containing flows from multiple network setups and different attack settings. It includes an additional label feature that identifies the original dataset of each flow, which can be used to compare the same attack scenarios conducted over two or more different testbed networks. The attack categories have been modified to combine all parent categories. Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDoS attack-LOIC-UDP, DDoS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined into a Brute Force category. Finally, SQL Injection attacks have been included in the Injection category. The NF-UQ-NIDS-v2 dataset has a total of 75,987,976 records, of which 25,165,295 (33.12%) are benign flows and 50,822,681 (66.88%) are attacks. The table below lists the distribution of the final attack categories.
Class | Count |
---|---|
Benign | 25,165,295 |
DDoS | 21,748,351 |
Reconnaissance | 2,633,778 |
Injection | 684,897 |
DoS | 17,875,585 |
Brute Force | 123,982 |
Password | 1,153,323 |
XSS | 2,455,020 |
Infilteration | 116,361 |
Exploits | 31,551 |
Scanning | 3,781,419 |
Fuzzers | 22,310 |
Backdoor | 18,978 |
Bot | 143,097 |
Generic | 16,560 |
Analysis | 2,299 |
Theft | 2,431 |
Shellcode | 1,427 |
MITM | 7,723 |
Worms | 164 |
Ransomware | 3,425 |
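The category consolidation described before the table amounts to a lookup from the original labels to their parent classes. The sketch below is one possible way to express it, not the authors' actual preprocessing code. The names `PARENT_CATEGORY` and `to_parent_category` are hypothetical, and the label strings are transcribed from the description above, so their exact spelling and capitalisation in the published files is an assumption.

```python
# Mapping from original attack labels to merged parent categories,
# transcribed from the description above (label spellings are assumptions).
PARENT_CATEGORY = {
    "DoS attacks-Hulk": "DoS",
    "DoS attacks-SlowHTTPTest": "DoS",
    "DoS attacks-GoldenEye": "DoS",
    "DoS attacks-Slowloris": "DoS",
    "DDoS attack-LOIC-UDP": "DDoS",
    "DDoS attack-HOIC": "DDoS",
    "DDoS attacks-LOIC-HTTP": "DDoS",
    "FTP-BruteForce": "Brute Force",
    "SSH-Bruteforce": "Brute Force",
    "Brute Force -Web": "Brute Force",
    "Brute Force -XSS": "Brute Force",
    "SQL Injection": "Injection",
}


def to_parent_category(label):
    """Return the merged category; labels without a rule pass through."""
    return PARENT_CATEGORY.get(label, label)


labels = ["DoS attacks-Hulk", "SSH-Bruteforce", "SQL Injection", "Benign"]
merged = [to_parent_category(label) for label in labels]
print(merged)
```

Labels that are already parent categories, such as `Benign`, are returned unchanged, so the function can be applied to every row regardless of its source dataset.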
Version 1 of the datasets is made up of 8 basic NetFlow features explained here.
Please click here to download the datasets in CSV format. The details of the datasets are published in:
Sarhan M., Layeghy S., Moustafa N., Portmann M. (2021) NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. In: Big Data Technologies and Applications. BDTA 2020, WiCON 2020. Springer, Cham. https://doi.org/10.1007/978-3-030-72802-1_9
Please click here to download the dataset.
The NetFlow-based format of the UNSW-NB15 dataset, named NF-UNSW-NB15, has been developed and labelled with its respective attack categories. The total number of data flows is 1,623,118, of which 72,406 (4.46%) are attack samples and 1,550,712 (95.54%) are benign. The attack samples are further classified into nine subcategories. The table below shows the NF-UNSW-NB15 dataset's distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 1,550,712 | Normal, non-malicious flows. |
Fuzzers | 19,463 | An attack in which the attacker sends large amounts of random data to crash a system or to discover security vulnerabilities in it. |
Analysis | 1,995 | A group of threats that target web applications through ports, emails and scripts. |
Backdoor | 1,782 | A technique that aims to bypass security mechanisms by replying to specifically constructed client applications. |
DoS | 5,051 | Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Exploits | 24,736 | Sequences of commands that control the behaviour of a host through a known vulnerability. |
Generic | 5,570 | A method that targets cryptography and causes a collision with each block-cipher. |
Reconnaissance | 12,291 | A technique for gathering information about a network host, also known as a probe. |
Shellcode | 1,365 | Malware that penetrates a piece of code to control a victim's host. |
Worms | 153 | Attacks that replicate themselves and spread to other computers. |
Please click here to download the dataset.
We utilised the publicly available pcaps of the ToN-IoT dataset to generate its NetFlow records, leading to a NetFlow-based IoT network dataset called NF-ToN-IoT. The total number of data flows is 1,379,274, of which 1,108,995 (80.4%) are attack samples and 270,279 (19.6%) are benign. The table below lists and defines the classes of the NF-ToN-IoT dataset and their distribution.
Class | Count | Description |
---|---|---|
Benign | 270,279 | Normal, non-malicious flows. |
Backdoor | 17,247 | A technique that aims to attack remote-access computers by replying to specifically constructed client applications. |
DoS | 17,717 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 326,345 | An attempt similar to DoS but launched from multiple distributed sources. |
Injection | 468,539 | A variety of attacks that supply untrusted inputs with the aim of altering the course of execution; SQL and code injection are two of the main ones. |
MITM | 1,295 | Man in the Middle is a method that places an attacker between a victim and the host with which the victim is trying to communicate, with the aim of intercepting traffic and communications. |
Password | 156,299 | A variety of attacks aimed at retrieving passwords by either brute force or sniffing. |
Ransomware | 142 | An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique or key. |
Scanning | 21,467 | A group of techniques that aim to discover information about networks and hosts, also known as probing. |
XSS | 99,944 | Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users. |
Please click here to download the dataset.
An IoT NetFlow-based dataset, named NF-BoT-IoT, was generated from the BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 600,100, of which 586,241 (97.69%) are attack samples and 13,859 (2.31%) are benign. There are four attack categories in the dataset. The table below shows the NF-BoT-IoT distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 13,859 | Normal, non-malicious flows. |
Reconnaissance | 470,655 | A technique for gathering information about a network host, also known as a probe. |
DDoS | 56,844 | Distributed Denial of Service is an attempt similar to DoS but launched from multiple distributed sources. |
DoS | 56,833 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Theft | 1,909 | A group of attacks that aim to obtain sensitive data, such as data theft and keylogging. |
Please click here to download the dataset.
We utilised the original pcap files of the CSE-CIC-IDS2018 dataset to generate a NetFlow-based dataset called NF-CSE-CIC-IDS2018. The total number of flows is 8,392,401, of which 1,019,203 (12.14%) are attack samples and 7,373,198 (87.86%) are benign. The table below shows the dataset's distribution.
Class | Count | Description |
---|---|---|
Benign | 7,373,198 | Normal, non-malicious flows. |
BruteForce | 287,597 | A technique that aims to obtain usernames and passwords by trying a list of predefined possibilities. |
Bot | 15,683 | An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities. |
DoS | 269,361 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 380,096 | An attempt similar to DoS but launched from multiple distributed sources. |
Infiltration | 62,072 | An inside attack that sends a malicious file via email to exploit an application, followed by a backdoor that scans the network for other vulnerabilities. |
Web Attacks | 4,394 | A group that includes SQL injections, command injections and unrestricted file uploads. |
Please click here to download the dataset.
NF-UQ-NIDS is a comprehensive dataset that merges all of the aforementioned datasets. It demonstrates the benefit of a shared feature set: multiple smaller datasets can be merged into one. This will eventually lead to a bigger and more universal NIDS dataset containing flows from multiple network setups and different attack settings. An additional label feature identifies the original dataset of each flow, which can be used to compare the same attack scenarios conducted over two or more different test-bed networks. The attack categories have been modified to combine all parent categories. Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDOS attack-LOIC-UDP, DDOS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined into a Brute Force category. Finally, SQL Injection attacks have been included in the Injection category. The NF-UQ-NIDS dataset has a total of 11,994,893 records, of which 9,208,048 (76.77%) are benign flows and 2,786,845 (23.23%) are attacks. The table below lists the distribution of the final attack categories.
Class | Count |
---|---|
Benign | 9,208,048 |
DDoS | 763,285 |
Reconnaissance | 482,946 |
Injection | 468,575 |
DoS | 348,962 |
Brute Force | 291,955 |
Password | 156,299 |
XSS | 99,944 |
Infilteration | 62,072 |
Exploits | 24,736 |
Scanning | 21,467 |
Fuzzers | 19,463 |
Backdoor | 17,247 |
Bot | 15,683 |
Generic | 5,570 |
Analysis | 1,995 |
Theft | 1,909 |
Shellcode | 1,365 |
MITM | 1,295 |
Worms | 153 |
Ransomware | 142 |
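Because the source datasets share one feature set, merging them reduces to concatenating their records and tagging each record with its origin. The sketch below illustrates that idea on made-up rows. The `Dataset` and `Attack` column names, the function name `merge_datasets`, and the sample values are assumptions for illustration rather than the published schema.

```python
def merge_datasets(datasets, origin_column="Dataset"):
    """Concatenate flow records from several datasets.

    `datasets` maps a dataset name to a list of row dictionaries that all
    share the same keys. Each output row gains `origin_column`, recording
    which dataset it came from.
    """
    merged = []
    for name, rows in datasets.items():
        for row in rows:
            merged.append({**row, origin_column: name})
    return merged


# Made-up rows standing in for records from two source datasets.
datasets = {
    "NF-UNSW-NB15": [{"IN_BYTES": 120, "Attack": "Benign"}],
    "NF-BoT-IoT": [
        {"IN_BYTES": 4000, "Attack": "DoS"},
        {"IN_BYTES": 90, "Attack": "Reconnaissance"},
    ],
}

merged = merge_datasets(datasets)

# Use the origin label to see which test beds contributed a given attack.
dos_sources = {row["Dataset"] for row in merged if row["Attack"] == "DoS"}
print(len(merged), dos_sources)
```

Keeping the origin as an ordinary column means the same attack class can later be filtered or grouped by test bed without any further bookkeeping.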
The CICFlowMeter datasets are made up of 80 network traffic features explained here.
Please click here to download the datasets in CSV format. The details of the datasets are published in:
Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, and Marius Portmann, Feature Evaluation for Machine Learning-based Network Intrusion Detection Systems, 2022.
Please click here to download the dataset.
The CICFlowMeter-based format of the UNSW-NB15 dataset, named CIC-UNSW-NB15, has been developed and labelled with its respective attack categories. The total number of data flows is 2,540,044 out of which 321,283 (12.65%) are attack samples and 2,218,761 (87.35%) are benign. The attack samples are further classified into nine subcategories. The table below represents the CIC-UNSW-NB15 dataset's distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 2,218,761 | Normal, non-malicious flows. |
Fuzzers | 24,246 | An attack in which the attacker sends large amounts of random data to crash a system or to discover security vulnerabilities in it. |
Analysis | 2,677 | A group of threats that target web applications through ports, emails and scripts. |
Backdoor | 2,329 | A technique that aims to bypass security mechanisms by replying to specifically constructed client applications. |
DoS | 16,353 | Denial of Service is an attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Exploits | 44,525 | Sequences of commands that control the behaviour of a host through a known vulnerability. |
Generic | 215,481 | A method that targets cryptography and causes a collision with each block-cipher. |
Reconnaissance | 13,987 | A technique for gathering information about a network host, also known as a probe. |
Shellcode | 1,511 | Malware that penetrates a piece of code to control a victim's host. |
Worms | 174 | Attacks that replicate themselves and spread to other computers. |
Please click here to download the dataset.
We utilised the publicly available pcaps of the ToN-IoT dataset to generate its CICFlowMeter records, leading to a CICFlowMeter-based IoT network dataset called CIC-ToN-IoT. The total number of data flows is 1,846,373, of which 461,934 (25.02%) are attack samples and 1,384,439 (74.98%) are benign. The table below lists and defines the classes of the CIC-ToN-IoT dataset and their distribution.
Class | Count | Description |
---|---|---|
Benign | 1,384,439 | Normal, non-malicious flows. |
Backdoor | 19,126 | A technique that aims to attack remote-access computers by replying to specifically constructed client applications. |
DoS | 19,243 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 193,252 | An attempt similar to DoS but launched from multiple distributed sources. |
Injection | 72,534 | A variety of attacks that supply untrusted inputs with the aim of altering the course of execution; SQL and code injection are two of the main ones. |
MITM | 1,348 | Man in the Middle is a method that places an attacker between a victim and the host with which the victim is trying to communicate, with the aim of intercepting traffic and communications. |
Password | 15,277 | A variety of attacks aimed at retrieving passwords by either brute force or sniffing. |
Ransomware | 149 | An attack that encrypts the files stored on a host and asks for compensation in exchange for the decryption technique or key. |
Scanning | 105,699 | A group of techniques that aim to discover information about networks and hosts, also known as probing. |
XSS | 25,306 | Cross-site Scripting is a type of injection in which an attacker uses web applications to send malicious scripts to end-users. |
Please click here to download the dataset.
An IoT CICFlowMeter-based dataset, named CIC-BoT-IoT, was generated from the BoT-IoT dataset. The features were extracted from the publicly available pcap files and the flows were labelled with their respective attack categories. The total number of data flows is 3,668,045, of which 3,661,535 (99.82%) are attack samples and 6,510 (0.18%) are benign. There are four attack categories in the dataset. The table below shows the CIC-BoT-IoT distribution of all flows.
Class | Count | Description |
---|---|---|
Benign | 6,510 | Normal, non-malicious flows. |
Reconnaissance | 733,197 | A technique for gathering information about a network host, also known as a probe. |
DDoS | 1,465,678 | Distributed Denial of Service is an attempt similar to DoS but launched from multiple distributed sources. |
DoS | 1,462,659 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
Theft | 1 | A group of attacks that aim to obtain sensitive data, such as data theft and keylogging. |
Please click here to download the dataset.
We utilised the original pcap files of the CSE-CIC-IDS2018 dataset to generate a CICFlowMeter-based dataset called CIC-CSE-CIC-IDS2018. The total number of flows is 19,177,873, of which 2,708,742 (14.13%) are attack samples and 16,469,131 (85.87%) are benign. The table below shows the dataset's distribution.
Class | Count | Description |
---|---|---|
Benign | 16,469,131 | Normal, non-malicious flows. |
BruteForce | 380,374 | A technique that aims to obtain usernames and passwords by trying a list of predefined possibilities. |
Bot | 286,191 | An attack that enables an attacker to remotely control several hijacked computers to perform malicious activities. |
DoS | 653,077 | An attempt to overload a computer system's resources with the aim of preventing access to or availability of its data. |
DDoS | 1,387,098 | An attempt similar to DoS but launched from multiple distributed sources. |
Infiltration | 161 | An inside attack that sends a malicious file via email to exploit an application, followed by a backdoor that scans the network for other vulnerabilities. |
Web Attacks | 1,841 | A group that includes SQL injections, command injections and unrestricted file uploads. |
Please click here to download the dataset.
CIC-UQ-NIDS is a comprehensive dataset that merges all of the aforementioned datasets. It demonstrates the benefit of a shared feature set: multiple smaller datasets can be merged into one. This will eventually lead to a bigger and more universal NIDS dataset containing flows from multiple network setups and different attack settings. An additional label feature identifies the original dataset of each flow, which can be used to compare the same attack scenarios conducted over two or more different test-bed networks. The attack categories have been modified to combine all parent categories. Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDOS attack-LOIC-UDP, DDOS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined into a Brute Force category. Finally, SQL Injection attacks have been included in the Injection category. The CIC-UQ-NIDS dataset has a total of 27,232,335 records, of which 19,040,124 (69.92%) are benign flows and 8,192,211 (30.08%) are attacks. The table below lists the distribution of the final attack categories.
Class | Count |
---|---|
Benign | 19,040,124 |
DDoS | 3,046,028 |
Reconnaissance | 747,184 |
Injection | 72,534 |
DoS | 2,131,032 |
Brute Force | 382,215 |
Password | 15,277 |
XSS | 25,306 |
Infilteration | 161 |
Exploits | 44,525 |
Scanning | 105,699 |
Fuzzers | 24,246 |
Backdoor | 19,126 |
Bot | 286,191 |
Generic | 215,481 |
Analysis | 2,677 |
Theft | 1 |
Shellcode | 1,511 |
MITM | 1,348 |
Worms | 174 |
Ransomware | 149 |
Web Attacks | 1,841 |