Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

View:

210

Download:

155

Journal Paper Information

Title

Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining

Pages

  79-89

Abstract

This paper presents a data mining application in metabolomics. It aims to build an enhanced machine learning classifier that can be used to diagnose cachexia syndrome and identify the biomarkers involved. To this end, a data-driven analysis is carried out on a public dataset of 1H-NMR metabolite profiles. The dataset suffers from the problem of imbalanced classes, which is known to degrade classifier performance and to undermine the validity and generalizability of the resulting models. The classification models in this study were built using five machine learning algorithms: PLS-DA, MLP, SVM, C4.5 and ID3. The models were built after a series of intensive data preprocessing procedures intended to tackle the class imbalance and improve classifier performance. These procedures involve data transformation, normalization, standardization, re-sampling, and data reduction using a number of variable importance scorers. The best performance was achieved by an MLP model trained and tested with five-fold cross-validation on datasets that were re-sampled using the SMOTE method and then reduced using the SVM variable importance scorer. This model classified samples with excellent accuracy and also identified the potential disease biomarkers. The results confirm the validity of metabolomics data mining for the diagnosis of cachexia. They also emphasize the importance of data preprocessing procedures such as sampling and data reduction for improving data mining results, particularly when the data suffer from imbalanced classes.
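The re-sampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: new minority-class points are interpolated between an existing minority sample and one of its nearest minority neighbours. This is not the paper's implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and the sketch is a simplified variant that picks the base sample at random rather than cycling through every minority sample.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature vectors (one class only).
    k: number of nearest minority neighbours considered per base sample.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # from the base sample to the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature data, made up for illustration (not from the paper).
majority = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.0, 0.4], [0.4, 0.0], [0.2, 0.3]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority samples, the minority class is enlarged without simply duplicating existing rows. In practice a maintained library implementation would normally be preferred over a hand-written one.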

Cite

    APA:

    BaniMustafa, Ahmed. (2019). Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining. THE ISC INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 11(3), 79-89. SID. https://sid.ir/paper/725809/en

    Vancouver:

    BaniMustafa Ahmed. Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining. THE ISC INTERNATIONAL JOURNAL OF INFORMATION SECURITY [Internet]. 2019;11(3):79-89. Available from: https://sid.ir/paper/725809/en

    IEEE:

    Ahmed BaniMustafa, “Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining,” THE ISC INTERNATIONAL JOURNAL OF INFORMATION SECURITY, vol. 11, no. 3, pp. 79–89, 2019, [Online]. Available: https://sid.ir/paper/725809/en
