Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

Author gender identification from text using Bayesian Random Forest

Pages

  143-156

Abstract

The heavy use of virtual environments and the connections users form through social networks such as Facebook, Instagram, and Twitter make it more necessary than ever to discover shared topics in these environments. Several applications benefit from reliable methods for inferring the age and gender of social media users. Such applications span a wide range of fields, from personalized advertising to law enforcement and reputation management. Text posts represent a large portion of user-generated content and contain information that can be relevant to discovering undisclosed user attributes or to checking the honesty of self-reported age and gender. Because most information is exchanged as text, profiling authors from this content by attributes such as age, gender, and political or religious opinion becomes all the more important. Gender identification, which can be useful in security and marketing, answers the following question: given a short text document, can we identify whether the author is male or female? This question is motivated by recent events in which people faked their gender on the Internet. In this paper, author gender identification in blog data is investigated. To this end, four groups of features are employed: syntactic features, word-based features, character-based features, and function words. In addition, character n-gram features are used to improve classification accuracy. To evaluate the proposed method, 3212 texts were collected from Technorati.com and blogger.com. Experimental results demonstrate that these types of features are practical. Furthermore, a new classification method called "Bayesian Random Forest" is introduced, in which each tree is a Bayes tree. The experimental results show that this method achieves notable results in comparison with other classification algorithms such as Naïve Bayes, Naïve Bayes Tree, and Random Forest, and that it increases the accuracy of gender identification to 89.5%.
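The feature groups named in the abstract can be illustrated with a minimal sketch. The abstract does not list the individual features, so everything concrete here is an assumption: the function names `char_ngrams` and `extract_features`, the tiny `FUNCTION_WORDS` list, the choice of trigrams, and the use of a punctuation count as a rough stand-in for syntactic features. It shows only how such a feature dictionary might be built from one blog text, not the authors' actual pipeline or the Bayesian Random Forest classifier itself.

```python
from collections import Counter
import string

# Hypothetical, deliberately tiny function-word list; the abstract does not
# give the list used in the paper.
FUNCTION_WORDS = ("the", "a", "of", "and", "to", "in", "i", "you")


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lower-cased text."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def extract_features(text, n=3):
    """Build one flat feature dictionary for a single document."""
    # Word tokens: split on whitespace, strip surrounding punctuation.
    tokens = [w.strip(string.punctuation) for w in text.lower().split()]
    tokens = [t for t in tokens if t]
    n_chars = len(text)

    features = {
        # Character-based features (assumed examples).
        "char_count": n_chars,
        "digit_ratio": sum(c.isdigit() for c in text) / n_chars if n_chars else 0.0,
        "upper_ratio": sum(c.isupper() for c in text) / n_chars if n_chars else 0.0,
        # Word-based features (assumed examples).
        "word_count": len(tokens),
        "avg_word_len": sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0,
        # Crude stand-in for syntactic features: punctuation usage.
        "punct_count": sum(c in string.punctuation for c in text),
    }

    # Function-word frequencies.
    counts = Counter(tokens)
    for fw in FUNCTION_WORDS:
        features["fw_" + fw] = counts[fw]

    # Character n-gram counts, prefixed to avoid clashing with other keys.
    for gram, count in char_ngrams(text, n).items():
        features["ng_" + gram] = count

    return features


if __name__ == "__main__":
    feats = extract_features("The cat and the dog.")
    print(feats["word_count"], feats["fw_the"], feats["ng_the"])
```

A dictionary like this could then be vectorized and passed to any classifier; which features and which classifier settings yield the reported accuracy is not recoverable from the abstract alone.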

Cite

APA:

SAJEDI, HEDIEH, & TASLIMI, MAHNAZ. (2019). Author gender identification from text using Bayesian Random Forest. SIGNAL AND DATA PROCESSING, 16(1 (39)), 143-156. SID. https://sid.ir/paper/160889/en

Vancouver:

SAJEDI HEDIEH, TASLIMI MAHNAZ. Author gender identification from text using Bayesian Random Forest. SIGNAL AND DATA PROCESSING [Internet]. 2019;16(1 (39)):143-156. Available from: https://sid.ir/paper/160889/en

IEEE:

HEDIEH SAJEDI and MAHNAZ TASLIMI, “Author gender identification from text using Bayesian Random Forest,” SIGNAL AND DATA PROCESSING, vol. 16, no. 1 (39), pp. 143–156, 2019. [Online]. Available: https://sid.ir/paper/160889/en
