A Discordance-Aware Multimodal Framework with Multi-Agent Clinical Reasoning
- Published
- Server: Preprints.org
- DOI: 10.20944/preprints202604.0342.v1
Background: Knee osteoarthritis (OA) frequently exhibits discordance between structural damage observed on imaging and patient-reported symptoms such as pain. This mismatch complicates clinical interpretation and patient stratification, and it remains insufficiently modeled in existing decision-support systems.

Methods: We propose a discordance-aware multimodal framework that combines machine-learning prediction models with a tool-grounded multi-agent reasoning system. Using baseline data from the FNIH Osteoarthritis Biomarkers Consortium (600 knees), we trained multimodal models on two progression tasks: (i) joint-space-loss (JSL)-only progression versus non-progression and (ii) pain-only progression versus non-progression. The predictive system integrates three modality-specific experts: a CatBoost tabular model using demographic, radiographic, MRI-derived scalar, and biomarker features; MRI image embeddings extracted with a ResNet18 backbone; and X-ray embeddings derived from the same architecture. Expert predictions are fused with a stacking ensemble. Residual-based models estimate expected pain from structural features, enabling computation of a pain–structure discordance score between observed and expected symptoms. A multi-agent reasoning layer interprets these signals to assign clinically interpretable OA phenotypes and to generate phenotype-specific management recommendations.

Results: Under 5-fold stratified cross-validation with out-of-fold evaluation, the full multimodal stacking model combining tabular variables, MRI and X-ray embeddings, and biochemical biomarkers achieved the best performance: AUC 0.702 for the JSL-only progression task and AUC 0.611 for pain-only progression. Imaging embeddings alone provided limited predictive signal, whereas clinically interpretable radiographic and MRI scalar features contributed stronger discrimination.
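The residual-based discordance score described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the structural features, the gradient-boosting regressor (standing in for the residual model), and the synthetic data are all assumptions; the key idea shown is that expected pain is estimated out-of-fold from structural features and the residual (observed minus expected) serves as the discordance score.

```python
# Hypothetical sketch of a pain-structure discordance score.
# A regression model estimates expected pain from structural features;
# the residual (observed - expected) quantifies discordance.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 600  # cohort size matching the abstract's 600 knees (data synthetic here)
X_struct = rng.normal(size=(n, 5))  # placeholder structural features (e.g. JSW, KL grade)
pain = X_struct @ rng.normal(size=5) + rng.normal(scale=1.0, size=n)  # observed pain score

# Out-of-fold predictions avoid in-sample optimism in the expected-pain estimate
expected_pain = cross_val_predict(
    GradientBoostingRegressor(random_state=0), X_struct, pain, cv=5
)
discordance = pain - expected_pain  # > 0: more pain than structure predicts
```

A knee with a large positive score reports more pain than its structural damage would suggest, which is the kind of signal the reasoning layer uses for phenotype assignment.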
Multimodal fusion improved performance by integrating complementary structural and biochemical information.

Conclusions: Multimodal fusion of tabular clinical variables, imaging-derived features, deep image embeddings, and biochemical biomarkers improves structural-progression prediction in knee OA. By coupling this prediction layer with explicit pain–structure discordance modeling and a tool-grounded multi-agent reasoning framework, the proposed architecture supports interpretable phenotype assignment and structured clinical decision support for osteoarthritis management.
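The stacking fusion and out-of-fold evaluation described in the abstract can be sketched as below. Everything here is illustrative: the base learners (a random forest standing in for the CatBoost tabular expert, a logistic model for the embedding experts), the feature dimensions, and the synthetic labels are assumptions; the sketch only shows the pattern of fusing modality blocks with a stacking meta-learner under stratified 5-fold cross-validation.

```python
# Illustrative stacking fusion of modality-specific experts with
# stratified 5-fold cross-validated AUC (all names/data are placeholders).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(1)
n = 600
X = np.hstack([
    rng.normal(size=(n, 8)),    # tabular clinical/biomarker features
    rng.normal(size=(n, 16)),   # MRI embedding (e.g. from a ResNet18 backbone)
    rng.normal(size=(n, 16)),   # X-ray embedding (same architecture)
])
y = (X[:, 0] + rng.normal(size=n) > 0).astype(int)  # synthetic progression label

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # internal CV gives the meta-learner out-of-fold base predictions
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc = cross_val_score(stack, X, y, cv=cv, scoring="roc_auc").mean()
```

In `StackingClassifier`, the internal `cv=5` ensures the meta-learner is trained on out-of-fold base-model predictions, the same precaution the abstract's out-of-fold evaluation reflects.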