A K-means algorithm based on characteristics of density applied to network intrusion detection
- Shanghai Maritime University, Shanghai, 201306, China
- Providence University, Taichung 43301, Taiwan
- Arkansas State University, Jonesboro, Arkansas 72467, USA
Abstract
K-means algorithms are a group of popular unsupervised algorithms widely used for cluster analysis. However, the results of traditional K-means clustering are strongly affected by the choice of initial cluster centers, which leads to unstable accuracy and low speed and makes it hard for the algorithm to meet the requirements of Big Data. This paper proposes an improved version of the K-means algorithm that selects the initial clustering seeds based on density. First, a Kd-tree is used to partition the hyper-rectangular space during data pre-processing, so that points close to each other are grouped into the same sub-tree and the generalized information is stored in the tree structure. In addition, an improved Kd-tree nearest-neighbor search is used within the K-means algorithm to prune the search space and speed up the computation. The clustering results show that the clusters are stable and accurate when the numbers of clusters and iterations are held constant. Experimental results for the network intrusion detection case show that the improved K-means algorithm performs better in terms of detection rate and false rate.
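The two ideas in the abstract, density-based seeding and Kd-tree pruning of the nearest-center search, can be illustrated with a minimal sketch. It is not the authors' implementation: the density measure (neighbor count within a hypothetical `radius`), the seed score (density times squared distance to the nearest chosen seed), and the choice to build the tree over the current centers instead of over the data are all assumptions made to keep the example short.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def density_seeds(data, k, radius):
    """Assumed heuristic: densest point first, then dense points far from chosen seeds."""
    r2 = radius ** 2
    density = [sum(1 for q in data if sqdist(p, q) <= r2) for p in data]
    seeds = [max(range(len(data)), key=lambda i: density[i])]
    while len(seeds) < k:
        def score(i):
            return density[i] * min(sqdist(data[i], data[s]) for s in seeds)
        seeds.append(max(range(len(data)), key=score))
    return [data[i] for i in seeds]

def build_kdtree(items, depth=0):
    """Build a Kd-tree over (point, label) pairs as nested tuples (item, left, right)."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid],
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def nearest(node, query, depth=0, best=None):
    """Return (squared distance, label) of the stored point closest to query."""
    if node is None:
        return best
    (point, label), left, right = node
    d = sqdist(point, query)
    if best is None or d < best[0]:
        best = (d, label)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Prune: search the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, depth + 1, best)
    return best

def kmeans(data, k, radius, iterations=10):
    """Lloyd iterations with density-based seeds and Kd-tree center lookup."""
    centers = density_seeds(data, k, radius)
    labels = []
    for _ in range(iterations):
        tree = build_kdtree([(c, j) for j, c in enumerate(centers)])
        labels = [nearest(tree, p)[1] for p in data]
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels

# Usage: two well-separated groups of four points each.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
centers, labels = kmeans(data, k=2, radius=0.5)
```

Because the seeds are chosen deterministically from the data, repeated runs with the same number of clusters and iterations return the same clusters, which is the stability property the abstract refers to.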
Key words
Network security; K-means; Kd-tree; Network intrusion detection
Digital Object Identifier (DOI)
https://doi.org/10.2298/CSIS200406014X
Publication information
Volume 17, Issue 2 (June 2020)
Year of Publication: 2020
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium
Full text
Available in PDF
How to cite
Xu, J., Han, D., Li, K., Jiang, H.: A K-means algorithm based on characteristics of density applied to network intrusion detection. Computer Science and Information Systems, Vol. 17, No. 2, 665–687. (2020), https://doi.org/10.2298/CSIS200406014X