Explainable Machine Learning Models for Credit Rating in Colombian Solidarity Sector Entities
- Published
- Server: Preprints.org
- DOI: 10.20944/preprints202507.0756.v1
This study proposes an alternative to the standardized credit risk assessment model currently mandated by Colombia's Superintendence of the Solidarity Economy (SES). To address two limitations of the regulatory model, namely its reliance on binary variables and its limited accounting for institutional heterogeneity, we develop and evaluate explainable machine learning (ML) models aligned with the Internal Ratings-Based (IRB) approach under Basel II. The proposed framework combines continuous behavioral and financial variables with model-agnostic interpretability via SHAP (SHapley Additive exPlanations). Our proposal represents the first empirical application of these techniques within the Colombian solidarity finance sector. The dataset comprises over 17,000 individual credit histories, segmented into debit and non-debit loans. Among the regression models evaluated, including Ridge, Decision Tree, Random Forest, XGBoost, and LightGBM, LightGBM achieved the highest accuracy. Beyond predictive gains, the model enables granular interpretability and supports compliance with governance and accountability standards. The results highlight the potential of adaptive ML models to complement regulatory frameworks, enhance credit decision-making, and optimize capital allocation strategies in solidarity-based financial institutions.
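To illustrate what SHAP-style attribution means for a single credit score, the following minimal sketch computes exact Shapley values for a toy scoring function by enumerating feature coalitions. Everything in it is an assumption made for illustration: the function `toy_score`, the feature names (`days_past_due`, `debt_to_income`, `tenure_years`), and the borrower and reference values are hypothetical and are not taken from the study's data or models. Practical SHAP implementations use faster estimators or model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        # Evaluate the model with coalition features taken from x
        # and all other features taken from the baseline.
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(point)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_score(p):
    # Hypothetical risk score with an interaction term; not the paper's model.
    return (0.02 * p["days_past_due"]
            + 0.5 * p["debt_to_income"]
            + 0.01 * p["days_past_due"] * p["debt_to_income"])


# Hypothetical borrower and reference profile.
borrower = {"days_past_due": 30.0, "debt_to_income": 0.6, "tenure_years": 4.0}
reference = {"days_past_due": 0.0, "debt_to_income": 0.2, "tenure_years": 5.0}

phi = shapley_values(toy_score, borrower, reference)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the borrower's score and the reference score, and a feature the score ignores (here `tenure_years`) receives zero attribution. These two properties are what make per-borrower explanations of this kind auditable.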