Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Information Journal Paper

Title

COMPUTING SEMANTIC SIMILARITY OF DOCUMENTS BASED ON SEMANTIC TENSORS

Pages

  125-134

Abstract

 Exploiting the semantic content of texts has long been an important and challenging problem in natural language processing because of its wide range of applications, such as finding documents related to a query, document classification, and computing the semantic similarity of documents. This paper proposes a novel corpus-based approach for computing the semantic similarity of texts that uses the Wikipedia corpus, organized as a three-dimensional tensor. First, the semantic vectors of the words in a document are obtained from the vector space derived from the words in Wikipedia articles; the semantic vector of the document is then formed from the vectors of its words. The semantic similarity of a pair of documents is computed by comparing their semantic vectors. Because its vectors are high-dimensional, the Wikipedia vector space suffers from the curse of dimensionality: vectors in a high-dimensional space are usually very similar to one another, so it becomes meaningless to try to identify the most appropriate semantic vector for a word. The proposed approach therefore mitigates the curse of dimensionality by reducing the dimensionality of the vector space through random indexing, which also significantly reduces memory consumption. In addition, the structured co-occurrence captured through random indexing allows the approach to handle synonymous and polysemous words.
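The pipeline outlined in the abstract (word vectors built by random indexing, document vectors formed from word vectors, similarity by vector comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional tensor organization is omitted, a two-line toy corpus stands in for Wikipedia, cosine similarity is assumed as the comparison measure, and the names `DIM`, `NONZERO`, `index_vector`, `build_word_vectors`, `document_vector`, and `cosine` are hypothetical.

```python
import math
import random
from collections import defaultdict

DIM = 64      # reduced dimensionality (assumed value, not from the paper)
NONZERO = 4   # number of +1/-1 entries per index vector (assumed value)


def index_vector(rng):
    """Return a sparse ternary random vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def build_word_vectors(corpus, seed=0):
    """Give each article a random index vector and add it to every word it contains."""
    rng = random.Random(seed)
    word_vecs = defaultdict(lambda: [0.0] * DIM)
    for article in corpus:
        ctx = index_vector(rng)
        for word in article.lower().split():
            wv = word_vecs[word]
            for i in range(DIM):
                wv[i] += ctx[i]
    return dict(word_vecs)


def document_vector(text, word_vecs):
    """Sum the semantic vectors of the known words in a document."""
    doc = [0.0] * DIM
    for word in text.lower().split():
        wv = word_vecs.get(word)
        if wv is None:
            continue  # words unseen in the corpus contribute nothing
        for i in range(DIM):
            doc[i] += wv[i]
    return doc


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


# Toy corpus standing in for Wikipedia articles (hypothetical data).
corpus = [
    "cat feline pet",
    "stock market trading",
]
word_vecs = build_word_vectors(corpus)
doc_a = document_vector("cat pet", word_vecs)
doc_b = document_vector("feline", word_vecs)
doc_c = document_vector("stock trading", word_vecs)
print("cat pet vs feline:", cosine(doc_a, doc_b))
print("cat pet vs stock trading:", cosine(doc_a, doc_c))
```

In this sketch the vector length is fixed at `DIM` regardless of how many articles the corpus contains, which is where the memory saving comes from. Words that occur in the same articles, such as "cat" and "feline" here, accumulate the same index vectors, so documents that share no surface words can still be scored as similar.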

    Cite

    APA:

    BAHRAMI, NAVID, JADIDINEJAD, AMIR HOSSEIN, & NAZARI, MOZHDEH. (2015). COMPUTING SEMANTIC SIMILARITY OF DOCUMENTS BASED ON SEMANTIC TENSORS. JOURNAL OF INFORMATION SYSTEMS AND TELECOMMUNICATION (JIST), 3(2), 125-134. SID. https://sid.ir/paper/332672/en

    Vancouver:

    BAHRAMI NAVID, JADIDINEJAD AMIR HOSSEIN, NAZARI MOZHDEH. COMPUTING SEMANTIC SIMILARITY OF DOCUMENTS BASED ON SEMANTIC TENSORS. JOURNAL OF INFORMATION SYSTEMS AND TELECOMMUNICATION (JIST)[Internet]. 2015;3(2):125-134. Available from: https://sid.ir/paper/332672/en

    IEEE:

    NAVID BAHRAMI, AMIR HOSSEIN JADIDINEJAD, and MOZHDEH NAZARI, “COMPUTING SEMANTIC SIMILARITY OF DOCUMENTS BASED ON SEMANTIC TENSORS,” JOURNAL OF INFORMATION SYSTEMS AND TELECOMMUNICATION (JIST), vol. 3, no. 2, pp. 125–134, 2015, [Online]. Available: https://sid.ir/paper/332672/en
