Skip to main content

Write a PREreview

Efficient Assessment of the Risk of Elevated Aspartate Aminotransferase Using Machine Learning Methods Based on Routine Biochemical Markers

Posted
Server
Preprints.org
DOI
10.20944/preprints202506.2273.v1

This study proposes an interpretable and high-accuracy ensemble learning framework for predicting aspartate aminotransferase (AST) levels using open-access biomedical datasets. Using a structured pipeline of preprocessing, feature selection, and model ensembling, we evaluated a series of regression algorithms including Random Forest, XGBoost, CatBoost, and three stacking architectures. The best-performing ensemble (Stacking_v2) achieved R² = 0.98 and RMSE = 1.23 on the validation set, surpassing conventional and single-model approaches. Feature importance was assessed using SHAP values, mutual information, and correlation analysis, revealing that gamma-glutamyl transferase, ferritin, and anthropometric markers had the greatest predictive impact. The proposed stacking-based model demonstrates excellent generalization, robust calibration, and high interpretability, and can serve as a benchmark for algorithmic evaluation in medical data modeling. The work highlights the effectiveness of ensemble regression and interpretable AI in real-world clinical prediction tasks using routine biomarkers.

You can write a PREreview of Efficient Assessment of the Risk of Elevated Aspartate Aminotransferase Using Machine Learning Methods Based on Routine Biochemical Markers. A PREreview is a review of a preprint and can vary from a few sentences to a lengthy report, similar to a journal-organized peer-review report.

Before you start

We will ask you to log in with your ORCID iD. If you don’t have an iD, you can create one.

What is an ORCID iD?

An ORCID iD is a unique identifier that distinguishes you from everyone with the same or similar name.

Start now