Calculating the measures of semantic connectivity based on wiki projects
Naikhanova Larisa Vladimirovna

Doctor of Technical Sciences

Professor, Department of Informatics Systems, East-Siberian State University of Technology and Management

670013, Russia, Republic of Buryatia, Ulan-Ude, Klyuchevskaya St., 40V



Naikhanov Nikolai Vladimirovich

Graduate student, Department of Informatics Systems, East-Siberian State University of Technology and Management

670013, Russia, Republic of Buryatia, Ulan-Ude, Klyuchevskaya St., 40V





The object of this study is a measure of the semantic connectivity (relatedness) of texts; its subject is an algorithm for calculating that measure. The article focuses on a method for determining a hybrid measure of the semantic connectivity of two concepts, which underlies the algorithm for computing the similarity of two texts. Two wiki projects, Wikipedia and Wiktionary, serve as the sources of knowledge. Using them together covers a much larger number of words than using either project alone. The method builds on the well-known Wikisim measure, which is simple yet performs well. The classic Wikisim method uses only Wikipedia, so it is adapted here for Wiktionary. The methodology is based on modeling the extraction of high-quality knowledge from wiki projects, which are produced by the individual work of independent volunteers. The novelty of the research lies in combining the two knowledge sources, Wikipedia and Wiktionary, and building on them a new hybrid measure of the semantic relatedness of concepts. The main conclusion is that combining a formal source of knowledge (Wiktionary) with an informal one (Wikipedia) can lead to a better assessment of the semantic connectivity between text units. The described method can be applied in economics, sociology and politics to clarify people's opinions on issues of interest.

Keywords: text, Wiktionary, Wikipedia, WikiProject, knowledge source, hybrid relatedness, semantic relatedness, word, concept, frequency response



This article is written in Russian. The full text is available in Russian.

