Software systems and computational methods
Reference:
Calculating the measures of semantic connectivity based on wiki projects
Abstract. The object of study is a measure of semantic connectivity of texts; the subject of study is an algorithm for calculating that measure. The article focuses on a method for determining a hybrid measure of the semantic connectivity of two concepts. This method underlies the algorithm for computing the similarity of two texts. Two wiki projects, Wikipedia and Wiktionary, are used as sources of knowledge. Using them together covers a much larger number of words than using either wiki project alone. The method builds on the well-known Wikisim measure, which is simple yet performs well. The classic Wikisim method uses only Wikipedia, so it is adapted here for Wiktionary. The methodology is based on modeling the extraction of high-quality knowledge from wiki projects, which are produced by the individual work of independent volunteers. The novelty of the research lies in combining the two sources of knowledge, Wikipedia and Wiktionary, and creating on their basis a new hybrid measure of semantic relatedness of concepts. The main conclusion is that combining formal (Wiktionary) and informal (Wikipedia) sources of knowledge can lead to a better assessment of semantic connectivity between text units. The described method can be applied in economics, sociology and politics to clarify people's opinions on issues of interest.
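The idea of combining two knowledge sources to widen word coverage can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the Wikisim formula or the combination rule, so a plain Jaccard overlap of link sets stands in for the per-source measure, and a simple average of the available scores stands in for the hybrid. The link indexes and the function names `overlap_relatedness` and `hybrid_relatedness` are hypothetical toy constructs, not the authors' implementation.

```python
from typing import Dict, Optional, Set

# Hypothetical toy indexes: concept -> set of linked concepts.
LinkIndex = Dict[str, Set[str]]

WIKIPEDIA: LinkIndex = {
    "cat": {"mammal", "pet", "felidae"},
    "dog": {"mammal", "pet", "canidae"},
}
WIKTIONARY: LinkIndex = {
    "cat": {"animal", "pet"},
    "dog": {"animal", "pet"},
    "quickly": {"fast", "rapid"},
    "rapidly": {"fast", "rapid", "speed"},
}


def overlap_relatedness(a: str, b: str, index: LinkIndex) -> Optional[float]:
    """Jaccard overlap of link sets (a stand-in, NOT the Wikisim formula).

    Returns None when the source does not cover one of the concepts.
    """
    if a not in index or b not in index:
        return None
    union = index[a] | index[b]
    if not union:
        return 0.0
    return len(index[a] & index[b]) / len(union)


def hybrid_relatedness(a: str, b: str,
                       wikipedia: LinkIndex,
                       wiktionary: LinkIndex) -> Optional[float]:
    """Average the scores of the sources that cover both concepts.

    Averaging is an assumed combination rule chosen for simplicity.
    """
    candidates = (overlap_relatedness(a, b, wikipedia),
                  overlap_relatedness(a, b, wiktionary))
    scores = [s for s in candidates if s is not None]
    if not scores:
        return None  # neither source covers the pair
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(hybrid_relatedness("cat", "dog", WIKIPEDIA, WIKTIONARY))
    print(hybrid_relatedness("quickly", "rapidly", WIKIPEDIA, WIKTIONARY))
    print(hybrid_relatedness("cat", "xyz", WIKIPEDIA, WIKTIONARY))
```

In this toy data the pair "quickly" and "rapidly" is scored only because the Wiktionary index covers it, which mirrors the coverage argument made in the abstract; a pair covered by neither source yields no score at all.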
Keywords: text, Wiktionary, Wikipedia, WikiProject, knowledge source, relatedness hybrid, semantic relatedness, word, concept, frequency response
Article was received: 01-08-2018
This article is written in Russian. The full text of the article is available in Russian.
The journal allows the author(s) to hold the copyright without restrictions. All authors automatically own full copyright in their work as soon as they create it, and current Russian Federal legislation protects them.
Licence type: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
The journal is an open access journal, which means that everybody can read, download, copy, distribute, print, search, or link to the full texts of these articles in accordance with the Creative Commons Attribution-NonCommercial 4.0 International License.
You are free to:
Share — copy and redistribute the material in any medium or format.
Adapt — remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.