Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

A New Approach for Extracting Named Entity in Classical Arabic

Pages

  59-74

Keywords

Named entity recognition (NER)

Abstract

In Natural Language Processing (NLP), developing resources and tools extends the reach and effectiveness of research in each language. In recent years, Arabic Named Entity Recognition (ANER) has attracted the attention of NLP researchers. While most of this research is based on Modern Standard Arabic (MSA), this paper focuses on Classical Arabic (CA) literature. We present NoorCorp, a corpus of 200k labeled words for research purposes, annotated manually by human experts. We also collected about 18k proper names from old Hadith books as a gazetteer, called NoorGazet. Using ensemble learning, we develop a new approach for extracting named entities (NEs), including person, location, and organization. The AdaBoost.M2 algorithm, an implementation of the multiclass boosting method, is applied to train the prediction model. Results show that the method performs better than a decision tree, which serves as the base classifier. We use tokenizing, part-of-speech (POS) tagging, and base phrase chunking (BPC) to overcome linguistic obstacles in Arabic. An overall F-measure of 86.85 is obtained. Finally, the proposed approach is applied to ANERCorp, an MSA corpus, and the results are compared with those on NoorCorp.
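The abstract reports an overall F-measure but does not say how it was computed. As a minimal sketch, assuming entity-level scoring over BIO-style tags (B-PER, I-PER, O, and so on), precision, recall, and F-measure could be calculated as below. The tag scheme, the function names `bio_to_spans` and `f_measure`, and the toy tag sequences are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tag scheme)."""
    spans: Set[Span] = set()
    start, kind = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and kind == tag[2:]
        if not continues:
            if kind is not None:
                spans.add((start, i, kind))
                start, kind = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                # a stray I- tag with no matching B- is treated as a new span
                start, kind = i, tag[2:]
    return spans


def f_measure(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching entity spans."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical toy sequences: one person entity matches, the rest do not.
gold_tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred_tags = ["B-PER", "I-PER", "O", "O", "B-ORG"]
print(f_measure(gold_tags, pred_tags))  # (0.5, 0.5, 0.5)
```

Exact span matching is the stricter of the common conventions; a token-level score would generally come out higher on the same predictions, so reported figures are only comparable when the scoring convention is the same.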

Cite

    APA:

    Sajadi, Seyed Mohammad Bagher, Rashidi, Hasan, & Minaei, Behrouz. (2017). A New Approach for Extracting Named Entity in Classical Arabic. Signal and Data Processing, 14(2 (serial 32)), 59-74. SID. https://sid.ir/paper/160751/en

    Vancouver:

    Sajadi Seyed Mohammad Bagher, Rashidi Hasan, Minaei Behrouz. A New Approach for Extracting Named Entity in Classical Arabic. Signal and Data Processing [Internet]. 2017;14(2 (serial 32)):59-74. Available from: https://sid.ir/paper/160751/en

    IEEE:

    Seyed Mohammad Bagher Sajadi, Hasan Rashidi, and Behrouz Minaei, “A New Approach for Extracting Named Entity in Classical Arabic,” Signal and Data Processing, vol. 14, no. 2 (serial 32), pp. 59–74, 2017, [Online]. Available: https://sid.ir/paper/160751/en
