Software systems and computational methods

Calculating the measures of semantic connectivity based on wiki projects
Naikhanova Larisa Vladimirovna

Doctor of Technical Science

Professor, Department of Informatics Systems, East-Siberian State University of Technology and Management

670013, Russia, Republic of Buryatia, Ulan-Ude, Klyuchevskaya St., 40V



Naikhanov Nikolai Vladimirovich

Graduate student, Department of Informatics Systems, East-Siberian State University of Technology and Management

670013, Russia, Republic of Buryatia, Ulan-Ude, Klyuchevskaya St., 40V





The object of this study is a measure of the semantic connectivity of texts; the subject is an algorithm for calculating that measure. The article focuses on a method for determining a hybrid measure of the semantic connectivity of two concepts, which underlies the algorithm for computing the similarity of two texts. Two wiki projects, Wikipedia and Wiktionary, serve as sources of knowledge. Using them together covers a much larger number of words than using either project alone. The method builds on the well-known Wikisim measure, which is simple but performs well. Because the classic Wikisim method uses only Wikipedia, it is adapted here for Wiktionary. The methodology is based on modeling the extraction of high-quality knowledge from wiki projects, which are the product of the individual work of independent volunteers. The novelty of the research lies in combining the two knowledge sources, Wikipedia and Wiktionary, and building on them a new hybrid measure of the semantic relatedness of concepts. The main conclusion is that combining formal (Wiktionary) and informal (Wikipedia) sources of knowledge can lead to a better assessment of the semantic connectivity between text units. The described method can be applied in economics, sociology, and politics to clarify people's opinions on issues of interest.

Keywords: text, Wiktionary, Wikipedia, WikiProject, knowledge source, hybrid relatedness, semantic relatedness, word, concept, frequency response





This article is written in Russian; the full text is available in Russian.
