Machine learning early warning for financial distress in health plan operators
- Server: SciELO Preprints
- DOI: 10.1590/scielopreprints.15181
This study develops the first machine learning–based early warning system for financial distress among Brazilian health plan operators. Using 24,440 operator-quarter observations from public regulatory data (2018–2025), we train and temporally validate LASSO logistic regression, random forest, and XGBoost to predict distress two to four quarters ahead. Random forest achieved the highest discrimination (AUC = 0.847), but LASSO exhibited the smallest generalization gap (0.014), revealing a tension between accuracy and deployment reliability that carries direct implications for regulatory design. The extended combined ratio was the only consensus predictor across all methods, and temporal features added predictive value beyond static ratio levels. A retrospective case study on Brazil's largest operator demonstrated early detection capacity but also revealed model habituation under prolonged stress. The study extends early warning theory from banking to health insurance and demonstrates that open regulatory data can support proactive, transparent surveillance where operator failure directly affects healthcare access.
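The trade-off between discrimination and generalization gap can be illustrated with a minimal sketch. It assumes, since the abstract does not define the term, that the generalization gap is the training-period AUC minus the AUC on later, held-out quarters. The function names, the quarter labels, and the toy scores are hypothetical and are not taken from the study's data or code.

```python
from typing import List, Sequence, Tuple

# One row per operator-quarter: (quarter, distress label, model score).
# Hypothetical layout; the scores stand in for any fitted model's output.
Row = Tuple[str, int, float]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a distressed case outranks a healthy one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def temporal_split(rows: List[Row], cutoff: str) -> Tuple[List[Row], List[Row]]:
    """Split by time: quarters up to the cutoff train, later quarters test.

    Quarter strings such as '2022Q4' sort correctly as plain text.
    """
    train = [r for r in rows if r[0] <= cutoff]
    test = [r for r in rows if r[0] > cutoff]
    return train, test


def generalization_gap(rows: List[Row], cutoff: str) -> float:
    """Assumed definition: AUC on the training period minus AUC on later quarters."""
    train, test = temporal_split(rows, cutoff)
    auc_train = auc([r[1] for r in train], [r[2] for r in train])
    auc_test = auc([r[1] for r in test], [r[2] for r in test])
    return auc_train - auc_test


# Toy data, invented for illustration only.
rows: List[Row] = [
    ("2022Q1", 0, 0.10),
    ("2022Q2", 1, 0.80),
    ("2022Q3", 0, 0.30),
    ("2022Q4", 1, 0.70),
    ("2023Q1", 0, 0.40),
    ("2023Q2", 1, 0.35),
    ("2023Q3", 0, 0.20),
    ("2023Q4", 1, 0.90),
]

gap = generalization_gap(rows, cutoff="2022Q4")
print(f"generalization gap: {gap:.3f}")
```

Under this reading, a model with a slightly lower held-out AUC but a smaller gap behaves more predictably once deployed on future quarters, which is the tension the abstract describes between random forest and LASSO.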