Studies on the Scalability of Classification Algorithms in Data Mining
Abstract: Classification is one of the most important techniques in data mining and is applied in a wide range of fields. However, faced with newly emerging massive data, many existing classification algorithms do not scale well and cannot discover useful knowledge from very large datasets quickly and accurately. To address this problem, this paper studies in depth the methods for making classification algorithms scalable, and analyzes and compares these methods, so that researchers and developers can improve and extend existing algorithms to keep pace with the continuing development of data mining technology.