Predicting hospital outcomes for patients with severe acute respiratory infections is critical for risk stratification and resource planning, yet heterogeneous electronic health record (EHR) data, class imbalance, and evolving clinical practice present persistent methodological challenges for machine learning (ML) approaches. We conducted a retrospective cohort study using EHR data harmonized to the OMOP common data model from the National COVID Cohort Collaborative (N3C; May 2020-June 2025), including 263,619 adults hospitalized with COVID-19 across 51 contributing sites. We developed penalized linear regression (elastic net), random forest, XGBoost, and multilayer perceptron (MLP) models to predict hospital length of stay (LOS) and mortality (in-hospital and 60-day), using demographics, comorbidities, prior healthcare utilization, COVID-19 vaccination status, and hospital site as predictors. Missing data were handled via multiple imputation by chained equations (MICE), and class imbalance was addressed using the synthetic minority oversampling technique (SMOTE). Model performance was evaluated using area under the ROC curve (AUROC), Brier score, calibration plots, and decision curve analysis, following the TRIPOD reporting framework. Mortality prediction achieved moderate discrimination across all models (test AUROC = 0.71-0.73 for in-hospital mortality; 0.72-0.73 for 60-day all-cause mortality). Models trained without SMOTE achieved the highest AUROCs but assigned virtually no patients to the mortality class at the default 0.5 threshold. SMOTE improved recall and F1 score at the cost of reduced AUROC and precision. LOS was poorly explained by available structured predictors (best R² = 0.059). Remdesivir-treated patients (n = 103,536; 39.3%) were older, had higher comorbidity burden, and had higher unadjusted mortality than untreated patients. Common structured EHR features offer moderate utility for mortality risk stratification in hospitalized COVID-19 patients but are insufficient for LOS prediction.
The consistent SMOTE-related tradeoff between discrimination and calibration underscores the need to report threshold-dependent metrics alongside AUROC in clinical ML studies, with implications for operational planning during future respiratory disease emergencies.
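To make the reporting point concrete, the sketch below (toy numbers in plain Python; not the study's data, models, or pipeline) shows how a classifier on an imbalanced outcome can post a perfect AUROC while assigning no patients to the positive class at the default 0.5 threshold, which is why threshold-dependent metrics must accompany AUROC.

```python
def auroc(y, p):
    """AUROC via the rank (Mann-Whitney U) formulation:
    the probability a random positive is scored above a random negative."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((pi > ni) + 0.5 * (pi == ni) for pi in pos for ni in neg)
    return wins / (len(pos) * len(neg))

def brier(y, p):
    """Brier score: mean squared error of predicted probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def recall_at(y, p, threshold=0.5):
    """Recall (sensitivity) after dichotomizing at a threshold."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    return tp / sum(y)

# Hypothetical imbalanced cohort: 2 deaths among 10 patients. The model
# ranks both deaths highest (ideal discrimination) but its probabilities
# never exceed 0.5, so the default threshold flags nobody.
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p = [0.45, 0.40, 0.30, 0.25, 0.20, 0.15, 0.12, 0.10, 0.08, 0.05]

print(auroc(y, p))      # 1.0 — perfect ranking of deaths above survivors
print(recall_at(y, p))  # 0.0 — no patient crosses the 0.5 threshold
print(brier(y, p))      # low score despite zero recall at 0.5
```

Oversampling the minority class (as SMOTE does) shifts predicted probabilities upward, recovering recall at 0.5 but typically distorting calibration, which is the tradeoff summarized above.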