Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

Persian Slang Text Conversion to Formal and Deep Learning of Persian Short Texts on Social Media for Sentiment Classification

Pages

  27-42

Abstract

Background and Objectives: The lack of a suitable tool for analyzing conversational Persian text has made many kinds of analysis of such text, including sentiment analysis, difficult. In this research, we tried to make these texts easier for machines to understand by providing PSC (Persian Slang Convertor), a tool for converting conversational text into formal text, and to improve machine learning of sentiment in short Persian texts by combining PSC with current deep learning methods.

Methods: More than 10 million unlabeled texts from various social networks and movie subtitles (as conversational texts) and about 10 million news texts (as formal texts) were used to train unsupervised models and to implement the formalization tool. 60,000 Instagram user comments labeled positive, negative, or neutral were used as supervised data for training the sentiment classification model for short texts. Methods such as LSTM, CNN, BERT, and ELMo, together with deep learning techniques such as learning rate decay, regularization, and dropout, were used. The best accuracy was achieved with LSTM.

Results: The formalization tool converted 57% of the words in the conversational corpus. Using the formalizer, a FastText model, and a deep LSTM network, an accuracy of 81.91 was obtained on the test data.

Conclusion: Models were pre-trained on unlabeled data, and in some cases existing pre-trained models such as ParsBERT were used. A model was then implemented to classify the sentiment of short Persian texts using labeled data.
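The abstract does not say how PSC converts conversational text internally. As a rough illustration only, the sketch below assumes a plain token lookup table and shows how a word-level conversion rate (like the 57% figure reported above) could be computed. The table `SLANG_TO_FORMAL`, its entries, and the functions `formalize` and `conversion_rate` are hypothetical and are not taken from the PSC tool.

```python
# Hypothetical lookup table: colloquial Persian token -> formal equivalent.
# Entries are illustrative only and are NOT taken from the PSC tool.
# "\u200c" is the zero-width non-joiner used in formal Persian spelling.
SLANG_TO_FORMAL = {
    "میخوام": "می\u200cخواهم",    # "I want"
    "نمیدونم": "نمی\u200cدانم",   # "I don't know"
    "خونه": "خانه",               # "home"
    "اگه": "اگر",                 # "if"
}


def formalize(text, table=SLANG_TO_FORMAL):
    """Replace known colloquial tokens with their formal form.

    Returns (formal_text, converted_token_count, total_token_count).
    Assumes whitespace tokenization, which is a simplification.
    """
    tokens = text.split()
    out = []
    converted = 0
    for tok in tokens:
        formal = table.get(tok)
        if formal is not None:
            out.append(formal)
            converted += 1
        else:
            out.append(tok)
    return " ".join(out), converted, len(tokens)


def conversion_rate(texts, table=SLANG_TO_FORMAL):
    """Fraction of all tokens in `texts` that the table converted."""
    converted = 0
    total = 0
    for text in texts:
        _, c, n = formalize(text, table)
        converted += c
        total += n
    return converted / total if total else 0.0
```

In a pipeline like the one the abstract describes, the formalized text would then be passed to the embedding model and classifier; a real converter would likely need more than a lookup table (for example, handling of verb inflections and attached pronouns).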
