Collective Data Anomaly Detection Based on Reverse k-Nearest Neighbor Filtering

by: WU Jin’e, WANG Ruoyu, DUAN Qianqian, LI Guoqiang, JÜ Changjiang

Format: Article
Published: Editorial Office of Journal of Shanghai Jiao Tong University 2021-05-01

Description

To address anomaly detection in group data when no data labels are available, a k-nearest neighbor (kNN) algorithm is proposed that detects group data anomalies in unsupervised mode. To reduce the false negatives and false positives caused by mutual interference between abnormal and normal values, a reverse k-nearest neighbor (RkNN) method is proposed to filter the abnormal group data in reverse. First, the algorithm uses a statistical distance as the similarity measure between different groups of data. Then, the kNN algorithm is used to obtain an anomaly score for each group and an initial set of abnormal groups. Finally, this initial set is filtered using the RkNN method. Experimental results show that the proposed algorithm not only effectively reduces false negatives and false positives, but also achieves a high anomaly detection rate and good stability.
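The three steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the statistical distance, the rule for picking initial anomalies, or the exact RkNN filtering rule. The sketch assumes the 1-D Wasserstein distance between equal-size groups, takes the highest-scoring groups as initial candidates, and keeps a candidate only if few other groups list it among their k nearest neighbors. The names `wasserstein_1d`, `knn_lists`, `detect`, `n_candidates`, and `max_reverse` are hypothetical.

```python
def wasserstein_1d(a, b):
    """W1 distance between two equal-size 1-D samples (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size groups")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def knn_lists(groups, k):
    """For each group, return its k nearest groups as (distance, index) pairs."""
    n = len(groups)
    return [
        sorted(
            (wasserstein_1d(groups[i], groups[j]), j) for j in range(n) if j != i
        )[:k]
        for i in range(n)
    ]


def detect(groups, k=3, n_candidates=4, max_reverse=0):
    """Return (scores, initial candidates, candidates kept after RkNN filtering)."""
    neighbors = knn_lists(groups, k)

    # Step 2: kNN anomaly score = mean distance to the k nearest groups.
    scores = [sum(d for d, _ in nb) / k for nb in neighbors]

    # Initial anomalies: the n_candidates highest scores (assumed rule).
    ranked = sorted(range(len(groups)), key=lambda i: scores[i], reverse=True)
    candidates = ranked[:n_candidates]

    # Step 3: reverse kNN count = how many groups list group j as a neighbor.
    reverse_count = [0] * len(groups)
    for nb in neighbors:
        for _, j in nb:
            reverse_count[j] += 1

    # Keep only candidates that (almost) no other group considers a neighbor.
    anomalies = sorted(i for i in candidates if reverse_count[i] <= max_reverse)
    return scores, candidates, anomalies


# Toy data: ten "normal" groups that differ by small shifts (indices 0-9),
# plus two strongly shifted groups (indices 10 and 11).
base = list(range(10))
offsets = [i / 10 for i in range(10)] + [5.0, -6.0]
groups = [[x + off for x in base] for off in offsets]

scores, candidates, anomalies = detect(groups, k=3, n_candidates=4, max_reverse=0)
print(sorted(candidates))  # [0, 9, 10, 11]
print(anomalies)           # [10, 11]
```

In this toy run the kNN scores alone also flag the two normal groups at the edge of the cluster (indices 0 and 9), while the reverse-neighbor check discards them because other groups still count them as neighbors. The thresholds `n_candidates` and `max_reverse` are illustrative choices and would need tuning on real data.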