مرکز اطلاعات علمی Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

SUB-WORD IMAGE CLUSTERING IN OLD PRINTED DOCUMENTS USING TEMPLATE MATCHING

Pages

  85-93

Abstract

Due to the rapid growth of digital libraries, digitizing large documents has become an important topic. In a long book, similar characters, sub-words, and words occur many times. In this paper, we propose a sub-word image clustering method for applications that deal with large uniform documents. We assume that the whole document is printed in a single font and that the print quality is poor. To test our method, we created a dataset of all sub-words of a Farsi book. The book has 233 pages and more than 111,000 manually labeled sub-words. We use an incremental clustering algorithm. Four simple features are extracted from each sub-word and compared with the corresponding features of each cluster center. If all feature differences lie within certain thresholds, the sub-word and the winning cluster center are compared more finely using a template matching algorithm. In our experiments, we show that all sub-words of the book are recognized with more than 99.7% accuracy when the label of each cluster center is assigned to all of its members.
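The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the four features, the thresholds, or the template matching measure, so everything below is an assumption made for illustration: the features (height, width, ink count, ink in the upper half), the tolerances `feat_tol` and `match_tol`, the pixel-mismatch distance, and all function names are hypothetical. The sketch also compares a sub-word against every cluster that passes the feature gate and keeps the closest one, which may differ from how the paper selects the winning cluster.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def features(img: Image) -> Tuple[int, int, int, int]:
    """Four cheap features (assumed, not from the paper):
    height, width, total ink pixels, ink pixels in the upper half."""
    h, w = len(img), len(img[0])
    ink = sum(map(sum, img))
    upper = sum(map(sum, img[: h // 2]))
    return (h, w, ink, upper)


def template_distance(a: Image, b: Image) -> float:
    """Fraction of mismatching pixels after top-left alignment
    on a common canvas (an assumed stand-in for template matching)."""
    h = max(len(a), len(b))
    w = max(len(a[0]), len(b[0]))

    def px(img: Image, r: int, c: int) -> int:
        return img[r][c] if r < len(img) and c < len(img[0]) else 0

    diff = sum(px(a, r, c) != px(b, r, c) for r in range(h) for c in range(w))
    return diff / (h * w)


@dataclass
class Cluster:
    center: Image
    feats: Tuple[int, int, int, int]
    members: List[int] = field(default_factory=list)


def incremental_cluster(images, feat_tol=(1, 1, 3, 3), match_tol=0.15):
    """Single pass over the images: join the closest cluster that passes
    both the feature gate and the template check, else start a new one."""
    clusters: List[Cluster] = []
    for idx, img in enumerate(images):
        f = features(img)
        best, best_d = None, match_tol
        for cl in clusters:
            # Coarse gate: every feature difference must be within its threshold.
            if all(abs(x - y) <= t for x, y, t in zip(f, cl.feats, feat_tol)):
                # Fine check: pixel-level comparison with the cluster center.
                d = template_distance(img, cl.center)
                if d <= best_d:
                    best, best_d = cl, d
        if best is None:
            best = Cluster(center=img, feats=f)
            clusters.append(best)
        best.members.append(idx)
    return clusters


def propagate_labels(clusters, center_labels, n):
    """Assign each cluster center's label to all members of that cluster."""
    out = [None] * n
    for cl, label in zip(clusters, center_labels):
        for i in cl.members:
            out[i] = label
    return out
```

In this sketch the feature gate filters out most clusters cheaply, so the more expensive pixel comparison runs only on plausible candidates. Only the cluster centers then need a label, which `propagate_labels` copies to every member.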


    Cite

    APA:

    SOHEILI, M.R., & KABIR, E. (2014). SUB-WORD IMAGE CLUSTERING IN OLD PRINTED DOCUMENTS USING TEMPLATE MATCHING. NASHRIYYAH -I MUHANDISI -I BARQ VA MUHANDISI -I KAMPYUTAR -I IRAN, B- MUHANDISI -I KAMPYUTAR, 11(2), 85-93. SID. https://sid.ir/paper/228474/en

    Vancouver:

    SOHEILI M.R., KABIR E. SUB-WORD IMAGE CLUSTERING IN OLD PRINTED DOCUMENTS USING TEMPLATE MATCHING. NASHRIYYAH -I MUHANDISI -I BARQ VA MUHANDISI -I KAMPYUTAR -I IRAN, B- MUHANDISI -I KAMPYUTAR [Internet]. 2014;11(2):85-93. Available from: https://sid.ir/paper/228474/en

    IEEE:

    M.R. SOHEILI and E. KABIR, “SUB-WORD IMAGE CLUSTERING IN OLD PRINTED DOCUMENTS USING TEMPLATE MATCHING,” NASHRIYYAH -I MUHANDISI -I BARQ VA MUHANDISI -I KAMPYUTAR -I IRAN, B- MUHANDISI -I KAMPYUTAR, vol. 11, no. 2, pp. 85–93, 2014. [Online]. Available: https://sid.ir/paper/228474/en
