Research on a Multi-Label Text Classification Algorithm Based on a Cluster Tree
Abstract: A new multi-label classification algorithm, the Multi-Label Cluster Tree, is proposed. The algorithm draws on the traditional single-label cluster tree and uses both the attribute features of the text and its label information. It iteratively invokes a "clustering by label information" algorithm to partition the two spaces (the feature space and the label-information space) as the classification tree grows, until each resulting subspace is sufficiently simple. Experimental results show that the Multi-Label Cluster Tree algorithm outperforms the compared classification algorithms overall, and that its classification ability is stronger than its ranking ability.
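The recursive partitioning described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering step or the stopping rule, so everything here is an assumption: a plain 2-means over the concatenated feature and label vectors stands in for the "clustering by label information" algorithm, a node counts as "sufficiently simple" when it holds few samples or only one distinct label set, and the names `build_tree`, `scores`, and `predict` are hypothetical.

```python
def _dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _mean(rows):
    """Column-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def two_means(points, iters=20):
    """Plain 2-means with a deterministic start; returns 0/1 assignments.
    Assumption: a stand-in for the paper's label-based clustering step."""
    c0 = points[0]
    c1 = max(points, key=lambda p: _dist2(p, c0))  # farthest point from c0
    centers = [c0, c1]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [0 if _dist2(p, centers[0]) <= _dist2(p, centers[1]) else 1
                  for p in points]
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centers[k] = _mean(members)
    return assign

def build_tree(X, Y, min_size=2, max_depth=8, depth=0):
    """Recursively split (features X, binary label vectors Y) into a tree."""
    leaf = {"labels": _mean(Y)}  # per-label relevance scores at this node
    # Assumed stopping rule: small node, one label set, or depth limit.
    if len(X) <= min_size or depth >= max_depth or all(y == Y[0] for y in Y):
        return leaf
    # Cluster in the joint feature + label space (assumption).
    joint = [list(x) + list(y) for x, y in zip(X, Y)]
    assign = two_means(joint)
    parts = [[i for i, a in enumerate(assign) if a == k] for k in (0, 1)]
    if not parts[0] or not parts[1]:
        return leaf  # clustering could not separate the node
    children, centers = [], []
    for idx in parts:
        Xk, Yk = [X[i] for i in idx], [Y[i] for i in idx]
        centers.append(_mean(Xk))  # route by feature centroid at test time
        children.append(build_tree(Xk, Yk, min_size, max_depth, depth + 1))
    return {"centers": centers, "children": children}

def scores(tree, x):
    """Descend to a leaf by nearest feature centroid; return label scores."""
    node = tree
    while "children" in node:
        c = node["centers"]
        node = node["children"][0 if _dist2(x, c[0]) <= _dist2(x, c[1]) else 1]
    return node["labels"]

def predict(tree, x, threshold=0.5):
    """Classification output: labels whose leaf score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores(tree, x)]

# Toy data: two features, two labels (made up for illustration only).
X = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
tree = build_tree(X, Y)
print(predict(tree, [0.05, 0.05]))  # [1, 0]
```

In this sketch `scores` returns the leaf's label frequencies, which can be sorted to obtain a ranking, while `predict` thresholds them to obtain a label set; this mirrors the distinction the abstract draws between classification ability and ranking ability, without claiming to reproduce the authors' evaluation.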