
A study of the applicability of LSTM recurrent networks in the task of searching for social network experts
Banokin Pavel Ivanovich

Assistant, Tomsk Polytechnic University

634028, Russia, Tomsk Oblast, Tomsk, Lenina St. 2, office 103a

pavel805@gmail.com
Efremov Aleksandr Aleksandrovich

Assistant, Tomsk Polytechnic University

634028, Russia, Tomsk Oblast, Tomsk, Lenina St. 2, office 115a

alexyefremov@tpu.ru
Luneva Elena Evgenevna

PhD in Technical Sciences

Associate Professor, Tomsk Polytechnic University

634028, Russia, Tomsk Oblast, Tomsk, Lenina St. 2, office 115a

lee@tpu.ru
Kochegurova Elena Alekseevna

PhD in Technical Sciences

Associate Professor, Tomsk Polytechnic University

634028, Russia, Tomsk Oblast, Tomsk, Lenina St. 2, office 112a

kocheg@mail.ru

Abstract.

The article explores the applicability of long short-term memory (LSTM) recurrent networks to the binary classification of text messages from the social network Twitter. A three-stage classification process is designed that analyzes pictograms separately and checks the text for neutrality. The accuracy of classifying the emotional polarity of text messages using an LSTM network and vector representations of words (word embeddings) is verified. The percentage of overlap between the word embeddings and the training data set that yields acceptable classification accuracy is determined. The training speed and memory usage of the LSTM network are estimated. Natural language processing and supervised machine learning methods are applied to the text classification task. The algorithmic base for processing social network text data, obtained by applying LSTM neural networks, is optimized. The novelty of the proposed approach lies in the preprocessing of messages, which improves classification accuracy, and in a neural network configuration that takes into account the specifics of social network text data.
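The core mechanism the abstract relies on, an LSTM consuming a sequence of word embeddings and producing a hidden state used for binary classification, can be sketched as a single-cell forward pass in NumPy. All dimensions, weights, and the final sigmoid classifier below are hypothetical illustrations of the general technique, not the authors' implementation or their trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    x: input vector (e.g. a word embedding); h_prev, c_prev: previous
    hidden and cell states. W, U, b hold the stacked parameters for the
    input, forget, and output gates and the candidate cell update.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b           # stacked pre-activations, shape (4H,)
    i = sigmoid(z[0:H])                  # input gate
    f = sigmoid(z[H:2*H])                # forget gate
    o = sigmoid(z[2*H:3*H])              # output gate
    g = np.tanh(z[3*H:4*H])              # candidate cell state
    c = f * c_prev + i * g               # new cell state
    h = o * np.tanh(c)                   # new hidden state
    return h, c

# Hypothetical dimensions: 50-dim embeddings, 32 hidden units.
rng = np.random.default_rng(0)
E, H = 50, 32
W = rng.normal(0.0, 0.1, (4 * H, E))
U = rng.normal(0.0, 0.1, (4 * H, H))
b = np.zeros(4 * H)

# Run the cell over a synthetic 10-word "tweet" (random embeddings).
h = np.zeros(H)
c = np.zeros(H)
for word_vec in rng.normal(0.0, 1.0, (10, E)):
    h, c = lstm_step(word_vec, h, c, W, U, b)

# Binary polarity score from the final hidden state (untrained weights).
w_out = rng.normal(0.0, 0.1, H)
p_positive = sigmoid(w_out @ h)
```

In practice the gate weights and the output layer would be trained jointly on labeled tweets (e.g. with TensorFlow or PyTorch); this sketch only shows the data flow from embedding sequence to a probability in (0, 1).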

Keywords: Twitter, word embeddings, social networks, LSTM networks, sentiment analysis, natural language processing, recurrent neural networks, text data preprocessing, recurrent network, binary classification

DOI: 10.7256/2454-0714.2017.4.24655

Article was received: 09-11-2017
Review date: 20-11-2017
Publish date: 11-01-2018


This article is written in Russian. The full text of the article in Russian is available here.
