Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Seminar Paper

Title

A Modified Language Modeling Method for Authorship Attribution

Abstract

This paper presents an approach to the closed-class authorship attribution (AA) problem. The approach, called modified language modeling, is based on language modeling for classification. It addresses the AA problem by combining bigram word weighting with unigram word weighting. It makes the relation between an unseen text and the training documents clearer by giving an extra reward to training documents that contain the text's bigrams as well as its unigrams. Moreover, instead of removing stop words with a stop-word list, each word's probability is multiplied by its IDF value. We evaluate four approaches (unigram, bigram, trigram, and modified language modeling) on two Persian poem corpora, the WMPR-AA2016-A and WMPR-AA2016-B datasets. The results show that modified language modeling attributes authors better than the other approaches. For all approaches, the results on WMPR-AA2016-B, the larger dataset, are much better than those on WMPR-AA2016-A. This may indicate that, given adequate training data, modified language modeling can be a good solution to the AA problem.
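The abstract does not give the scoring formula, so the following Python sketch is only one possible reading of the idea, not the authors' method. It assumes that an author's score is the sum of IDF-weighted unigram probabilities of the unseen text's words, plus a reward for each bigram of the unseen text that also occurs in that author's training documents. The names `AuthorModel`, `make_idf`, `score`, `attribute`, and the `bigram_reward` parameter are all hypothetical, as is the smoothed IDF formula and the whitespace tokenizer.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain whitespace tokenization; Persian text would need a real tokenizer.
    return text.lower().split()


class AuthorModel:
    """Unigram and bigram counts pooled over one author's training documents."""

    def __init__(self, docs):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for doc in docs:
            tokens = tokenize(doc)
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.total = sum(self.unigrams.values())

    def p_unigram(self, w):
        # Maximum-likelihood P(w | author); 0.0 for unseen words.
        return self.unigrams[w] / self.total if self.total else 0.0

    def p_bigram(self, w1, w2):
        # Maximum-likelihood P(w2 | w1, author); 0.0 if the bigram was never seen.
        seen = self.unigrams[w1]
        return self.bigrams[(w1, w2)] / seen if seen else 0.0


def make_idf(all_docs):
    """Return a function giving a smoothed IDF; words found in every document get 0."""
    n = len(all_docs)
    df = Counter()
    for doc in all_docs:
        df.update(set(tokenize(doc)))
    return lambda w: math.log((n + 1) / (df[w] + 1))


def score(text, model, idf, bigram_reward=1.0):
    # Assumed combination: IDF-weighted unigram term plus an extra bigram reward.
    tokens = tokenize(text)
    unigram_term = sum(idf(w) * model.p_unigram(w) for w in tokens)
    bigram_term = sum(model.p_bigram(a, b) for a, b in zip(tokens, tokens[1:]))
    return unigram_term + bigram_reward * bigram_term


def attribute(text, models, idf):
    """Closed-class attribution: return the candidate author with the highest score."""
    return max(models, key=lambda author: score(text, models[author], idf))


# Toy English data, invented purely for illustration.
training = {
    "A": ["the moon rises over the quiet sea", "the quiet sea sings"],
    "B": ["the market opens at dawn", "the traders shout at dawn"],
}
models = {author: AuthorModel(docs) for author, docs in training.items()}
idf = make_idf([doc for docs in training.values() for doc in docs])
predicted = attribute("the quiet sea at night", models, idf)
```

In this toy setup the word "the" appears in every training document, so its IDF is zero and it contributes nothing to any author's score, which mirrors the abstract's point about using IDF in place of a stop-word list.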

  • Cite

    APA:

    Vazirian, Samane, & Zahedi, Morteza. (2016). A Modified Language Modeling Method for Authorship Attribution. International Conference on Information and Knowledge Technology. SID. https://sid.ir/paper/947368/en

    Vancouver:

    Vazirian Samane, Zahedi Morteza. A Modified Language Modeling Method for Authorship Attribution. 2016. Available from: https://sid.ir/paper/947368/en

    IEEE:

    Samane Vazirian and Morteza Zahedi, “A Modified Language Modeling Method for Authorship Attribution,” presented at the International Conference on Information and Knowledge Technology, 2016. [Online]. Available: https://sid.ir/paper/947368/en
