Abstract:
Semantic relatedness computation is a fundamental problem in data science, with a wide range of applications in information retrieval and natural language processing. To address the limitations of the Explicit Semantic Analysis (ESA) algorithm, a feature selection algorithm is presented that filters the explicit semantic features and constructs a low-dimensional semantic space. On this basis, using the mapping information of feature concepts in Wikipedia, a semantic relatedness computation method is proposed for the low-dimensional explicit semantic space. This method improves on the efficiency of ESA, whose subsequent relatedness computation is carried out in a high-dimensional sparse space. Finally, experimental results demonstrate that the proposed method correlates better with human judgments, as measured by Pearson's (P) and Spearman's (S) correlation coefficients, than related methods.
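
To make the idea of relatedness in a reduced explicit semantic space concrete, the following is a minimal sketch and not the method proposed here. It assumes each term is already represented as a sparse vector of Wikipedia concept weights, and it stands in a simple "keep the k highest-weighted concepts" rule for the feature selection step, which the abstract does not specify. The names `top_k`, `cosine`, and `relatedness`, and the toy concept weights, are all hypothetical.

```python
import math

def top_k(vec, k):
    """Keep the k highest-weighted concepts of a sparse concept vector.

    Assumption: this truncation is only a stand-in for the feature
    selection algorithm, which is not described in the abstract.
    """
    kept = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(kept)

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def relatedness(u, v, k):
    """Relatedness computed after reducing both vectors to k concepts."""
    return cosine(top_k(u, k), top_k(v, k))

# Toy concept vectors (hypothetical weights, for illustration only).
cat = {"Felidae": 0.9, "Pet": 0.6, "Internet meme": 0.1}
dog = {"Canidae": 0.8, "Pet": 0.7, "Police": 0.05}

score = relatedness(cat, dog, k=2)
```

Because only k concepts per term survive, the dot product touches at most k entries per vector, which is where a low-dimensional space saves work relative to the full sparse ESA vectors.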