dc.contributor.author |
Nataša Petrović |
|
dc.contributor.author |
Gabriel Moyà-Alcover |
|
dc.contributor.author |
Antoni Jaume-i-Capó |
|
dc.contributor.author |
Jose Maria Buades Rubio |
|
dc.date.accessioned |
2025-01-29T11:33:44Z |
|
dc.date.available |
2025-01-29T11:33:44Z |
|
dc.identifier.citation |
Petrović, N., Moyà-Alcover, G., Jaume-i-Capó, A., & Rubio, J. M. B. (2025). Enhancing generalization in Sickle Cell Disease diagnosis through ensemble methods and feature importance analysis. Engineering Applications of Artificial Intelligence, 142(109875). https://doi.org/10.1016/j.engappai.2024.109875 |
ca |
dc.identifier.uri |
http://hdl.handle.net/11201/168138 |
|
dc.description.abstract |
[eng] This work presents a novel approach, grounded in the state of the art, for selecting the optimal ensemble-based classification method and features, with a primary focus on generalization, to provide diagnostic support for Sickle Cell Disease using peripheral blood smear images of red blood cells. We pre-processed and segmented the microscopic images to ensure the extraction of high-quality features. To ensure the reliability of our proposed system, we conducted an in-depth analysis of interpretability. Leveraging techniques established in the literature, we extracted features from blood cells and employed ensemble machine learning methods to classify their morphology. Furthermore, we devised a methodology to identify the most critical features for classification, aimed at reducing complexity and training time and enhancing interpretability in opaque models. Lastly, we validated our results using a new dataset, where our model outperformed state-of-the-art models in terms of generalization. An ensemble of Random Forest and Extra Trees classifiers achieved a harmonic mean of precision and recall (F1-score) of 90.71% and a Sickle Cell Disease diagnosis support score (SDS-score) of 93.33%. These results are a notable improvement over previous ones obtained with a Gradient Boosting classifier (F1-score 87.32% and SDS-score 89.51%). To foster scientific progress, we have made available the parameters for each model, the implemented code library, and the confusion matrices with the raw data. |
en |
dc.format |
application/pdf |
|
dc.publisher |
Elsevier |
|
dc.relation.ispartof |
Engineering Applications of Artificial Intelligence, 2025, vol. 142, num. 109875 |
|
dc.rights |
Attribution-NonCommercial-NoDerivatives 4.0 International |
|
dc.rights.uri |
https://creativecommons.org/licenses/by-nc-nd/4.0/ |
|
dc.subject.classification |
61 - Medicina |
|
dc.subject.classification |
57 - Biologia |
|
dc.subject.other |
61 - Medical sciences |
|
dc.subject.other |
57 - Biological sciences in general |
|
dc.title |
Enhancing generalization in Sickle Cell Disease diagnosis through ensemble methods and feature importance analysis |
|
dc.type |
info:eu-repo/semantics/article |
|
dc.type |
info:eu-repo/semantics/acceptedVersion |
|
dc.type |
Article |
|
dc.date.updated |
2025-01-29T11:33:44Z |
|
dc.rights.accessRights |
info:eu-repo/semantics/openAccess |
|
dc.identifier.doi |
https://doi.org/10.1016/j.engappai.2024.109875 |
|
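
The abstract reports the F1-score as the harmonic mean of precision and recall. As a reading aid only, here is a minimal sketch of that calculation from confusion-matrix counts. The function name `f1_score` and the counts used are hypothetical and are not taken from the paper or its code library. The SDS-score is not defined in this record, so it is not reproduced here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are true-positive, false-positive and false-negative counts.
    Returns 0.0 when precision and recall are both zero or undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
print(f"F1 = {f1_score(tp=90, fp=10, fn=10):.4f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a classifier cannot reach a high F1-score by maximizing precision or recall alone.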