Research on Domain Meaning Relevance
Authors: WANG Zhongzhen, WANG Tao, DU Xiaoli
Source: original to this site
Updated: 2014/1/14 11:04:00
Full text:

                               (1. College of Computer, National University of Defense Technology, Changsha, Hunan 410000, China;
                                      2. College of Computer, National University of Defense Technology, Changsha, Hunan 410000, China;
                                      3. College of Computer, National University of Defense Technology, Changsha, Hunan 410000, China)

[Abstract]: Domain word-sense relevance is the computation, at the semantic level, of the relatedness of basic language units within a specific domain; as the foundation of relevance computation, it plays a very important role in measuring relations between texts at other levels. Research on domain word-sense relevance involves distinctive linguistic features: Wikipedia and Baidu Baike not only explain each entry in detail but also link the related attribute terms mentioned in the explanation. From these links, the effective linguistic features of a feature word can be inferred accurately, and deep learning algorithms can be used to fully mine and exploit the relevance between word attributes.
[Key words]: word-sense relevance, deep learning, Wikipedia
CLC number: TP391.1     Document code: A        Article ID:

                              Research on Domain Meaning Relevance

                                   WANG Zhongzhen1  WANG Tao2  DU Xiaoli3 
  (1. National University of Defense Technology, Changsha 410000, China. WANG Zhongzhen, 54696661@qq.com
  2. National University of Defense Technology, Changsha 410000, China. WANG Tao, 631570216@qq.com
  3. National University of Defense Technology, Changsha 410000, China. DU Xiaoli, 821979047@qq.com)

Abstract: Domain Meaning Relevance (DMR) refers to the computation of the semantic relevance of basic language units within a specific domain. As the foundation of relevance computation, DMR plays a significant role in measuring the relevance between texts at other levels. Research on DMR involves particular linguistic characteristics. Wikipedia and Baidu Baike not only give a detailed explanation for each entry, but also link to the entries for related attributes. Exploiting this, we can use deep learning algorithms to mine and utilize the relevance between attributes, and to infer the linguistic features of feature words.
Key words: semantic relevance, deep learning, Wikipedia
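The full method is not reproduced on this page, but the core idea in the abstract, namely measuring the relatedness of two terms through the links their encyclopedia entries share and through learned vector representations, can be illustrated with a minimal sketch. All data and function names below are hypothetical, not the authors' implementation:

```python
from math import sqrt

def link_overlap_relatedness(links_a, links_b):
    """Jaccard overlap of the outgoing link sets of two encyclopedia entries."""
    a, b = set(links_a), set(links_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cosine_similarity(u, v):
    """Cosine similarity of two word vectors (e.g. learned by a neural model)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy link sets for two hypothetical entries.
beer_links = ["barley", "brewing", "alcohol", "supermarket"]
diaper_links = ["infant", "hygiene", "supermarket", "retail"]
print(link_overlap_relatedness(beer_links, diaper_links))  # 1 shared link of 7 distinct -> ~0.143
```

A real system would fetch the link sets from the Wikipedia or Baidu Baike entry pages and obtain the word vectors from a trained deep model, then combine the two signals into a single relevance score.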
About the authors:
  WANG Zhongzhen, male, born June 1980 in Changping, Beijing; Master of Engineering in Computer Science and Technology, College of Computer, National University of Defense Technology. Research interests: data mining, natural language processing, and information security.
  WANG Tao, male, born September 1979 in Changge, Henan; Master of Engineering in Computer Science and Technology, College of Computer, National University of Defense Technology. Research interests: data mining, microblog opinion leaders, and public-opinion monitoring.
  DU Xiaoli, female, born December 1989 in Shijiazhuang, Hebei; National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology. Research interests: social networks and mobile computing, and mobile wireless communication.
  

 
 
   
Communication Market (《通信市场》), 49 Fuxing Road, Beijing, China (100036)