Enhancing generalization in Sickle Cell Disease diagnosis through ensemble methods and feature importance analysis

dc.contributor.author Nataša Petrović
dc.contributor.author Gabriel Moyà-Alcover
dc.contributor.author Antoni Jaume-i-Capó
dc.contributor.author Jose Maria Buades Rubio
dc.date.accessioned 2025-01-29T11:33:44Z
dc.date.available 2025-01-29T11:33:44Z
dc.identifier.citation Petrović, N., Moyà-Alcover, G., Jaume-i-Capó, A., & Rubio, J. M. B. (2025). Enhancing generalization in Sickle Cell Disease diagnosis through ensemble methods and feature importance analysis. Engineering Applications of Artificial Intelligence, 142(109875). https://doi.org/10.1016/j.engappai.2024.109875 ca
dc.identifier.uri http://hdl.handle.net/11201/168138
dc.description.abstract [eng] This work presents a novel approach for selecting the optimal ensemble-based classification method and features, with a primary focus on achieving generalization and building on the state of the art, to provide diagnostic support for Sickle Cell Disease using peripheral blood smear images of red blood cells. We pre-processed and segmented the microscopic images to ensure the extraction of high-quality features. To ensure the reliability of our proposed system, we conducted an in-depth analysis of interpretability. Leveraging techniques established in the literature, we extracted features from blood cells and employed ensemble machine learning methods to classify their morphology. Furthermore, we devised a methodology to identify the most critical features for classification, aimed at reducing complexity and training time and enhancing interpretability in opaque models. Lastly, we validated our results using a new dataset, where our model outperformed state-of-the-art models in terms of generalization. An ensemble of Random Forest and Extra Trees classifiers achieved a harmonic mean of precision and recall (F1-score) of 90.71% and a Sickle Cell Disease diagnosis support score (SDS-score) of 93.33%. These results are a notable improvement over the previous ones obtained with a Gradient Boosting classifier (F1-score 87.32% and SDS-score 89.51%). To foster scientific progress, we have made available the parameters for each model, the implemented code library, and the confusion matrices with the raw data. en
dc.format application/pdf
dc.publisher Elsevier
dc.relation.ispartof Engineering Applications of Artificial Intelligence, 2025, vol. 142, num. 109875
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.classification 61 - Medicina
dc.subject.classification 57 - Biologia
dc.subject.other 61 - Medical sciences
dc.subject.other 57 - Biological sciences in general
dc.title Enhancing generalization in Sickle Cell Disease diagnosis through ensemble methods and feature importance analysis
dc.type info:eu-repo/semantics/article
dc.type info:eu-repo/semantics/acceptedVersion
dc.type Article
dc.date.updated 2025-01-29T11:33:44Z
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.identifier.doi https://doi.org/10.1016/j.engappai.2024.109875


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International
