A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT Networks
Imtiaz Ullah and Qusay H. Mahmoud
33rd Canadian Conference on Artificial Intelligence 12-15 May 2020
Agenda: Introduction, Motivation, Problem Statement
[2]
[3]
[4]
[5]
destructive cyber-attacks.
[6]
intrusion-dataset
[7]
[8]
[9]
[10]
Wi-Fi camera to generate the IoTID20 dataset.
IoT victim devices, and all other devices in the testbed are the attacking devices.
[11]
activity detection in IoT networks.
families.
[12]
Table 1. Binary, Category, and Sub-Category of IoTID20 Dataset
Binary: Normal, Anomaly
Category: Normal, DoS, Mirai, MITM, Scan
Subcategory: Normal, Syn Flooding, Brute Force, HTTP Flooding, UDP Flooding, ARP Spoofing, Host Port, OS
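The three label levels in Table 1 nest inside one another. The sketch below shows one way to derive the category and binary label from a sub-category label. It assumes the sub-category labels carry their category as a prefix, as they do in the legend of the sub-category F-Score chart later in the deck (for example "Mirai-UDP Flooding"). The function name `label_hierarchy` is hypothetical and is not taken from the dataset or the slides.

```python
# Categories as listed in Table 1.
CATEGORIES = ("Normal", "DoS", "Mirai", "MITM", "Scan")


def label_hierarchy(sub_category: str) -> tuple:
    """Return (category, binary) for a sub-category label.

    Assumption: the label starts with its category name,
    e.g. "Scan Port" belongs to the "Scan" category.
    """
    for category in CATEGORIES:
        if sub_category.startswith(category):
            binary = "Normal" if category == "Normal" else "Anomaly"
            return category, binary
    raise ValueError(f"unrecognised sub-category label: {sub_category!r}")


print(label_hierarchy("Mirai-UDP Flooding"))
print(label_hierarchy("Normal"))
```

Deriving the coarser labels from the finest one keeps the three label columns consistent with each other by construction.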
[13]
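\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
\[ \text{Precision} = \frac{TP}{TP + FP} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \]
\[ \text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]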
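As a worked illustration of the four metrics, the following minimal sketch computes them from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for the arithmetic; they are not results from the IoTID20 experiments.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F-measure from raw counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(metrics)
```

With these counts the accuracy is 170/200 = 0.85, the precision is 80/90, the recall is 80/100 = 0.80, and the F-measure is 160/190.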
[14]
Table 2. IoTID20 Dataset Correlated Features
Total features: 12
Feature names: Active_Max, Bwd_IAT_Max, Bwd_Seg_Size_Avg, Fwd_IAT_Max, Fwd_Seg_Size_Avg, Idle_Max, PSH_Flag_Cnt, Pkt_Size_Avg, Subflow_Bwd_Byts, Subflow_Bwd_Pkts, Subflow_Fwd_Byts, Subflow_Fwd_Pkts
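The slides list the correlated features but do not say how they were identified. One common approach, sketched here as an assumption rather than as the authors' method, is to compute pairwise Pearson correlation and flag any feature that is strongly correlated with a feature already kept. The 0.9 threshold, the function names, and the toy values are all hypothetical; only the feature names come from the dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def correlated_features(columns, threshold=0.9):
    """Flag features whose |r| with an earlier, unflagged feature >= threshold."""
    names = list(columns)
    flagged = []
    for i, name in enumerate(names):
        for earlier in names[:i]:
            if earlier in flagged:
                continue
            if abs(pearson(columns[earlier], columns[name])) >= threshold:
                flagged.append(name)
                break
    return flagged


# Toy values, for illustration only (not taken from IoTID20).
toy = {
    "Tot_Fwd_Pkts": [1, 2, 3, 4, 5],
    "Subflow_Fwd_Pkts": [1, 2, 3, 4, 5],
    "Flow_Duration": [5, 1, 4, 2, 3],
}
flagged = correlated_features(toy)
print(flagged)
```

In the toy data the two packet-count columns are identical, so the second one is flagged, while Flow_Duration correlates only weakly (r = -0.3) and is kept.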
[15]
[Figure: ranking score (axis 0.1 to 1) for each IoTID20 feature.]
Features: Flow_ID, Src_IP, Src_Port, Dst_IP, Dst_Port, Protocol, Timestamp, Flow_Duration, Tot_Fwd_Pkts, Tot_Bwd_Pkts, TotLen_Fwd_Pkts, TotLen_Bwd_Pkts, Fwd_Pkt_Len_Max, Fwd_Pkt_Len_Min, Fwd_Pkt_Len_Mean, Fwd_Pkt_Len_Std, Bwd_Pkt_Len_Max, Bwd_Pkt_Len_Min, Bwd_Pkt_Len_Mean, Bwd_Pkt_Len_Std, Flow_Byts/s, Flow_Pkts/s, Flow_IAT_Mean, Flow_IAT_Std, Flow_IAT_Max, Flow_IAT_Min, Fwd_IAT_Tot, Fwd_IAT_Mean, Bwd_IAT_Mean, Fwd_IAT_Max, Fwd_IAT_Min, Bwd_IAT_Tot, Bwd_IAT_Mean, Bwd_IAT_Std, Bwd_IAT_Max, Bwd_IAT_Min, Fwd_PSH_Flags, Bwd_PSH_Flags, Fwd_URG_Flags, Bwd_URG_Flags, Fwd_Header_Len, Bwd_Header_Len, Fwd_Pkts/s, Bwd_Pkts/s, Pkt_Len_Min, Pkt_Len_Max, Pkt_Len_Mean, Pkt_Len_Std, Pkt_Len_Var, FIN_Flag_Cnt, SYN_Flag_Cnt, RST_Flag_Cnt, PSH_Flag_Cnt, ACK_Flag_Cnt, URG_Flag_Cnt, CWE_Flag_Count, ECE_Flag_Cnt, Down/Up_Ratio, Pkt_Size_Avg, Fwd_Seg_Size_Avg, Bwd_Seg_Size_Avg, Fwd_Byts/b_Avg, Fwd_Pkts/b_Avg, Fwd_Blk_Rate_Avg, Bwd_Byts/b_Avg, Bwd_Pkts/b_Avg, Bwd_Blk_Rate_Avg, Subflow_Fwd_Pkts, Subflow_Fwd_Byts, Subflow_Bwd_Pkts, Subflow_Bwd_Byts, Init_Fwd_Win_Byts, Init_Bwd_Win_Byts, Fwd_Act_Data_Pkts, Fwd_Seg_Size_Min, Active_Mean, Active_Std, Active_Max, Active_Min, Idle_Mean, Idle_Std, Idle_Max, Idle_Min
[16]
the algorithm.
[Figure: Training-Binary and Testing-Binary curves; axis ticks 50 to 100 and 32000 to 102000.]
[17]
[Figure: Training-Category and Testing-Category curves; axis ticks 50 to 100 and 32000 to 102000.]
[18]
[Figure: Training-Sub-Category and Testing-Sub-Category curves; axis ticks 50 to 100 and 32000 to 102000.]
[19]
[Figure: Training-Binary and Testing-Binary curves; axis ticks 50 to 100 and 1 to 10.]
[20]
[Figure: Training-Category and Testing-Category curves; axis ticks 50 to 100 and 1 to 10.]
[21]
[Figure: Training-Sub-Category and Testing-Sub-Category curves; axis ticks 50 to 100 and 1 to 10.]
[22]
[Figure: F-Score (axis 10 to 100) of SVM, Gaussian NB, LDA, Logistic Regression, Decision Tree, Random Forest, and Ensemble for the Normal and Anomaly classes.]
malicious network traffic.
performed poorly for binary label classification.
performed very well for binary label classification.
[23]
[Figure: F-Score (axis 10 to 100) of SVM, Gaussian NB, LDA, Logistic Regression, Decision Tree, Random Forest, and Ensemble for the Normal, DoS, Mirai, MITM, and Scan categories.]
traffic or any of the following attack categories: DoS, Mirai, MITM, or Scan.
Gaussian NB, and SVM.
[24]
[Figure: F-Score (axis 10 to 100) of SVM, Gaussian NB, LDA, Logistic Regression, Decision Tree, Random Forest, and Ensemble for the sub-categories Normal, DoS-Synflooding, Mirai-Ackflooding, Mirai-Hostbruteforce, Mirai-HTTP Flooding, Mirai-UDP Flooding, MITM ARP Spoofing, Scan Hostport, and Scan Port.]
network traffic or any one of the categories, as shown in Figure 3.
the decision tree classifier for the subcategories.
[25]
Algorithm            Accuracy  Precision  Recall  F Score
SVM                  40        55         37      16
Gaussian NB          73        70         66      62
LDA                  70        71         71      70
Logistic Regression  40        25         39      30
Decision Tree        88        88         88      88
Random Forest        84        85         84      84
Ensemble             87        87         87      87
Table 3. IoTID20 Dataset Performance Results
[26]
[27]
In the future, we plan to develop and evaluate a framework for anomalous activity detection models for IoT networks to improve accuracy.
{imtiaz.ullah, qusay.mahmoud}@ontariotechu.net