Pruning Optimization Algorithm Based on the IN Algorithm
Abstract: This paper proposes CIN, a pruning optimization algorithm for constructing classifiers based on the IN (information-theoretic network) algorithm. The IN algorithm performs hypothesis tests with the log-likelihood ratio statistic, but the statistical significance of those tests is not clearly established. To address this, CIN computes a sample-count threshold and an attribute-value threshold at each node of a given layer, which ensures that the test is valid. The theoretical basis of the algorithm is given, and the conditions under which the formula for the log-likelihood ratio statistic holds are derived. Experiments show that the algorithm reduces data dimensionality and extracts concise rules from large-scale data sets.
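As an illustration of the kind of check the abstract describes, the following minimal Python sketch computes the log-likelihood ratio statistic G = 2 Σ O ln(O/E) for a contingency table of attribute values against class labels at one node, and reports whether every expected cell count reaches a minimum threshold. The abstract does not state CIN's actual thresholds, so the function name `g_statistic` and the default `min_expected=5.0` (a common rule of thumb for the chi-square approximation) are assumptions made for illustration only, not the paper's method.

```python
import math


def g_statistic(table, min_expected=5.0):
    """Log-likelihood ratio (G) statistic for an r x c contingency table.

    table: list of rows; table[i][j] is the observed count of records with
           attribute value i and class label j at one node.
    min_expected: hypothetical threshold on expected cell counts (assumption,
           not taken from the paper).

    Returns (G, degrees_of_freedom, counts_large_enough).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("contingency table contains no records")

    g = 0.0
    smallest_expected = math.inf
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of attribute and class.
            expected = row_totals[i] * col_totals[j] / n
            smallest_expected = min(smallest_expected, expected)
            # Cells with zero observations contribute nothing to G.
            if observed > 0:
                g += 2.0 * observed * math.log(observed / expected)

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return g, dof, smallest_expected >= min_expected
```

In this sketch, a caller would compare G against a chi-square critical value with `dof` degrees of freedom only when the third return value is true; when it is false, the node holds too few records for the approximation to be trusted, which is the situation the thresholds in the abstract are meant to guard against.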