Density Peak Anomaly Detection Method Based on Reverse K-Nearest Neighbors

Abstract: To improve detection accuracy on local anomalies, anomaly clusters, and datasets with complex distributions, and to reduce dependence on prior information about the data, a density peak anomaly detection method based on reverse K-nearest neighbors (Rknn-DP) is proposed. First, reverse K-nearest neighbors (Rknn) are used to improve how the density peak algorithm computes local density and relative distance; introducing neighborhood information lets these quantities characterize anomalous points more accurately. Next, according to the resulting feature distribution and an adaptive threshold, points with low local density and high relative distance are selected as a rough set of candidate anomalies. Finally, Rknn is used to compute an anomaly factor for each point in the rough set, and the set is pruned according to the degree of abnormality, which excludes noise points, reduces the associated error effect, and adaptively yields the final anomaly set. Comparative experiments against ABOD, LSCP, HBOS, IForest, and other algorithms on real and synthetic datasets demonstrate the adaptability and effectiveness of Rknn-DP.
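To make the quantities named in the abstract concrete, the following is a minimal sketch, not the Rknn-DP formulas themselves, which the abstract does not give. It assumes that local density can be approximated by a point's reverse k-nearest-neighbor count (how many other points list it among their k nearest neighbors) and uses the classic density-peak relative distance (distance to the nearest point of higher density). The function names, the toy data, and the rough-selection thresholds are all hypothetical choices made for illustration.

```python
import numpy as np

def reverse_knn_counts(X, k):
    """For each point, count how many other points have it among their k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Full pairwise Euclidean distance matrix (fine for small illustrative data).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbor
    knn = np.argsort(dist, axis=1)[:, :k]    # indices of each point's k nearest neighbors
    counts = np.bincount(knn.ravel(), minlength=n)
    return counts, dist

def relative_distance(density, dist):
    """Classic density-peak delta: distance to the nearest point with strictly higher density."""
    n = len(density)
    d = np.where(np.isinf(dist), 0.0, dist)  # restore zero self-distance
    delta = np.empty(n)
    for i in range(n):
        higher = density > density[i]
        # Points with the highest density have no denser neighbor; use their largest distance.
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    return delta

# Toy data (assumption): one tight cluster plus a single far-away point at index 50.
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 0.3, size=(50, 2))
X = np.vstack([cluster, [[5.0, 5.0]]])

k = 5
counts, dist = reverse_knn_counts(X, k)
delta = relative_distance(counts, dist)

# Hypothetical rough selection: low density and high relative distance.
rough = np.flatnonzero((counts < k) & (delta > 2 * np.median(delta)))
print("reverse-kNN count of the isolated point:", counts[-1])
print("rough candidate indices:", rough)
```

In this toy setting no cluster point lists the isolated point among its nearest neighbors, so its reverse count is zero while its relative distance is large, which is the "low local density, high relative distance" signature the abstract describes. The pruning step based on an anomaly factor is not sketched here because the abstract does not define that factor.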

     
