مرکز اطلاعات علمی Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

Detecting Similarity in Paraphrased Persian Texts using Semantic and Probabilistic Methods

Pages

  1823-1848

Abstract

Plagiarism detection is the process of locating instances of plagiarism within a work or document. The main component of a plagiarism detection system is its text alignment algorithm, which aims to detect paraphrased passages in a suspicious document using a small set of candidate source documents. Because text alignment algorithms are highly language-dependent, the numerous algorithms that exist for languages other than Persian cannot be employed for Persian plagiarism detection. Several text alignment algorithms exist for Persian texts, but most of them can only detect exactly identical passages shared between texts. In many cases of plagiarism detection, however, the problem is to find similar passages that have been paraphrased. In this paper, we propose two new text alignment algorithms that are able to detect paraphrased texts in Persian. The first is a semantic algorithm that employs a dictionary to detect paraphrased sentences, and the second is a probabilistic algorithm that uses statistical information obtained from a large corpus of Persian texts to detect similar texts. Compared to existing semantic text alignment algorithms, the proposed algorithms use different measures to check the similarity between sentences. Furthermore, the probabilistic algorithm is the first probabilistic text alignment algorithm proposed for the Persian language. Moreover, while all existing text alignment algorithms check the similarity between any two sentences separately, the proposed algorithms also consider the similarity of neighboring sentences in the text. The implementation results indicate that both algorithms detect paraphrased texts with sufficiently high and nearly equal quality, while the proposed probabilistic method is more efficient than the proposed semantic algorithm in terms of computation time.
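The abstract does not give the paper's actual similarity measures, so the following is only a rough sketch of the general idea behind a dictionary-based comparison that also looks at neighboring sentences. Everything in it is an assumption made for illustration: the toy English synonym table `SYNONYMS` (standing in for a Persian dictionary), the use of Jaccard overlap, the helper names, and the `weight` parameter. It is not the authors' algorithm.

```python
from typing import Dict, List, Set

# Hypothetical toy synonym table: maps a word to a canonical form.
# A real system would use a Persian lexical resource instead.
SYNONYMS: Dict[str, str] = {
    "automobile": "car",
    "vehicle": "car",
    "purchased": "bought",
    "quick": "fast",
}


def normalize(sentence: str) -> Set[str]:
    """Lowercase, split on whitespace, and map each word to its canonical synonym."""
    return {SYNONYMS.get(word, word) for word in sentence.lower().split()}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two word sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def sentence_similarity(s: str, t: str) -> float:
    """Similarity of two sentences after synonym normalization."""
    return jaccard(normalize(s), normalize(t))


def neighbor_aware_similarity(
    src: List[str], susp: List[str], i: int, j: int, weight: float = 0.25
) -> float:
    """Blend the similarity of (src[i], susp[j]) with that of their adjacent pairs.

    `weight` is an assumed mixing factor, not a value taken from the paper.
    """
    base = sentence_similarity(src[i], susp[j])
    neighbors = []
    for offset in (-1, 1):
        if 0 <= i + offset < len(src) and 0 <= j + offset < len(susp):
            neighbors.append(sentence_similarity(src[i + offset], susp[j + offset]))
    if not neighbors:
        return base
    return (1 - weight) * base + weight * (sum(neighbors) / len(neighbors))


source = ["he purchased an automobile", "it was quick"]
suspicious = ["he bought a car", "it was fast"]
print(sentence_similarity(source[0], suspicious[0]))
print(neighbor_aware_similarity(source, suspicious, 0, 0))
```

In this toy setup the synonym mapping lets reworded sentences share words they would otherwise not share, and a pair whose neighbors also match scores higher than the pair considered in isolation.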

Cite

APA:

PAKNIAT, NASROLLAH, & MOHEBI, AZADEH. (2019). Detecting Similarity in Paraphrased Persian Texts using Semantic and Probabilistic Methods. IRANIAN JOURNAL OF INFORMATION PROCESSING & MANAGEMENT (INFORMATION SCIENCES AND TECHNOLOGY), 34(4), 1823-1848. SID. https://sid.ir/paper/131163/en

Vancouver:

PAKNIAT NASROLLAH, MOHEBI AZADEH. Detecting Similarity in Paraphrased Persian Texts using Semantic and Probabilistic Methods. IRANIAN JOURNAL OF INFORMATION PROCESSING & MANAGEMENT (INFORMATION SCIENCES AND TECHNOLOGY) [Internet]. 2019;34(4):1823-1848. Available from: https://sid.ir/paper/131163/en

IEEE:

NASROLLAH PAKNIAT and AZADEH MOHEBI, “Detecting Similarity in Paraphrased Persian Texts using Semantic and Probabilistic Methods,” IRANIAN JOURNAL OF INFORMATION PROCESSING & MANAGEMENT (INFORMATION SCIENCES AND TECHNOLOGY), vol. 34, no. 4, pp. 1823–1848, 2019, [Online]. Available: https://sid.ir/paper/131163/en
