Clustering based semi-supervised machine learning for DDoS attack classification

oleh: Muhammad Aamir, Syed Mustafa Ali Zaidi

Format: Article
Diterbitkan: Elsevier 2021-05-01

Deskripsi

Semi-supervised machine learning can be used for obtaining subsets of unlabeled or partially labeled dataset based on the applicable metrics of dissimilarity. At later stage, the data is completely assigned the labels as per the observed differentiation. This paper provides a clustering based approach to distinguish the data representing flows of network traffic which include both normal and Distributed Denial of Service (DDoS) traffic. The features are taken for victim-end identification of attacks and the work is demonstrated with three features which can be monitored at the target machine. The clustering methods include agglomerative and K-means with feature extraction under Principal Component Analysis (PCA). A voting method is also proposed to label the data and obtain classes to distinguish attacks from normal traffic. After labeling, supervised machine learning algorithms of k-Nearest Neighbors (kNN), Support Vector Machine (SVM) and Random Forest (RF) are applied to obtain the trained models for future classification. The kNN, SVM and RF models in experimental results provide 95%, 92% and 96.66% accuracy scores respectively under optimized parameter tuning within given sets of values. In the end, the scheme is also validated using a subset of benchmark dataset with new vectors of attack.