A DBN-SVM method for gastric cancer survival classification based on a feature selection algorithm

  • Abstract: To reduce the dimensionality of the dataset and select the optimal feature subset, thereby improving the accuracy of prognostic survival classification for gastric cancer, a hybrid deep belief network-support vector machine (DBN-SVM) model combined with a feature selection algorithm is proposed. Building on a filter-based feature selection algorithm, a distance coefficient is introduced to adjust the overall degree of bias and reduce the instability of the computed weights, yielding new sample weight values. The Pearson correlation coefficient is then used to identify the feature subset with a significant influence on gastric cancer survival. The restricted Boltzmann machine modules of the deep belief network extract features from this subset in the hidden layers, and a support vector machine classifies the output of the last layer of the deep belief network to predict gastric cancer survival. By improving the feature selection algorithm and combining the strengths of the deep belief network and the support vector machine, the model outperforms traditional single machine learning methods in the experiments, reaching a classification accuracy, AUC, and F1 value of 81.2%, 83.4%, and 81.5%, respectively.
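The Pearson screening step mentioned in the abstract can be illustrated with a minimal sketch. It covers only a plain Pearson-correlation filter: the distance-coefficient weighting and the DBN-SVM stages are not specified in the abstract and are not reproduced here. The function names `pearson` and `select_features`, the choice of ranking by absolute correlation with the label, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, k):
    """Return indices of the k features with the largest |r| against labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Sort by descending |r|; break ties by feature index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Hypothetical toy data: 6 patients, 4 features, binary survival label.
rows = [
    [1.0, 5.0, 0.2, 7.0],
    [2.0, 5.0, 0.9, 6.0],
    [3.0, 5.0, 0.1, 5.5],
    [4.0, 5.0, 0.8, 3.0],
    [5.0, 5.0, 0.3, 2.0],
    [6.0, 5.0, 0.7, 1.0],
]
labels = [0, 0, 0, 1, 1, 1]

selected = select_features(rows, labels, k=2)
print(selected)  # → [3, 0]
```

In this toy example the constant second column scores zero and is dropped, while the two columns that track the label most closely are kept. In a full pipeline of the kind the abstract describes, the retained columns would then be passed to the feature-extraction and classification stages.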

     
