Optimization of tree-based machine learning algorithms for improving the predictive accuracy of hepatitis C disease
Date
2024
Publisher
Elsevier
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Hepatitis C is a globally prevalent viral infection that can cause significant liver-related complications if not appropriately managed. Timely and precise identification of the condition is imperative for efficient patient care and therapy. Machine learning (ML) algorithms offer a promising and accurate approach to diagnosing hepatitis C. The present investigation focuses on optimizing four tree-based ML algorithms, namely random forest (RF), gradient boosting machines (GBMs), light gradient boosting machines (LGBMs), and extreme gradient boosting (XGBoost), with the aim of enhancing the predictive accuracy of hepatitis C disease. The investigation used a dataset from the University of California, Irvine (UCI) Machine Learning Repository. The research methodology encompasses data preprocessing, feature selection, hyperparameter tuning, and model evaluation. To improve the models' performance, the synthetic minority oversampling technique (SMOTE) was used for data balancing and grid search for hyperparameter tuning. The optimized models were assessed with stratified k-fold cross-validation and the following performance metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic (ROC) curve. The findings indicate that the optimized tree-based algorithms outperform their nonoptimized counterparts. Specifically, LGBM achieved the highest predictive accuracy at 98.91%, followed by XGBoost at 98.70%, GBM at 97.83%, and RF at 97.29%. The LGBM approach has the potential to be applied and extended to diverse medical datasets and use cases, thus advancing ML in the healthcare domain. The study highlights the importance of optimizing tree-based algorithms to improve the accuracy of early prediction of hepatitis C disease and promote patient health. This underscores the capacity of ML to improve healthcare outcomes. © 2024 Elsevier Inc. All rights reserved.
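Three of the building blocks named in the abstract (SMOTE-style oversampling, stratified k-fold splitting, and the accuracy, precision, recall and F1 metrics) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the abstract does not say which libraries or settings were used, and the function names `smote_like`, `stratified_folds` and `binary_metrics`, along with their parameters, are hypothetical. Grid search and ROC AUC are left out for brevity. In practice these steps are commonly done with `SMOTE` from imbalanced-learn and `StratifiedKFold` and `GridSearchCV` from scikit-learn.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Assumes `minority` holds at least two feature vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def stratified_folds(labels, n_splits, seed=0):
    """Assign sample indices to folds so that each fold keeps roughly the
    same class proportions as the full dataset."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % n_splits].append(i)
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 computed from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

When these pieces are combined, oversampling is normally applied only to the training folds of each split, so that synthetic points never leak into the held-out fold used for scoring.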
Keywords
Extreme Gradient Boosting Machines, Gradient Boosting Machines, Hepatitis C Disease Prediction, Hyperparameter Optimization, Light Gradient Boosting Machines, Machine Learning, Random Forest, SMOTE
Source
Decision-Making Models: A Perspective of Fuzzy Logic and Machine Learning
Scopus Q Value
N/A
Citation
Bai, F. J. J. S., & Jasmine, R. A. (2024). Optimization of tree-based machine learning algorithms for improving the predictive accuracy of hepatitis C disease. In Decision-Making Models (pp. 523-545). Academic Press.