SG-MuRCL: Smoothed Graph-Enhanced Multi-Instance Contrastive Learning for Robust Whole Slide Image Classification
- Posted
- Server: Preprints.org
- DOI: 10.20944/preprints202512.0100.v1
Multiple Instance Learning (MIL) is a standard paradigm for classifying gigapixel whole-slide images (WSIs). However, prominent models such as Attention-Based MIL (ABMIL) treat image patches as independent instances, ignoring their spatial context. More advanced frameworks such as MuRCL use reinforcement learning (RL) for instance selection but do not explicitly enforce spatial coherence, which often results in noisy localizations. Although Graph Neural Networks (GNNs), attention smoothing, and RL are each powerful strategies for addressing these issues individually, integrating them remains a significant challenge. This paper introduces SG-MuRCL, a framework that extends MuRCL in two ways: first, a GNN models spatial relationships between patches, departing from ABMIL's independence assumption; second, an attention-smoothing operator regularizes the MIL aggregator, with the aim of improving robustness and producing more coherent, clinically meaningful heatmaps. Empirical evaluation yielded an important finding: while the baseline MuRCL trained successfully, the integrated SG-MuRCL consistently collapsed into a trivial solution. This outcome shows that the theoretical synergy between GNNs, attention smoothing, and RL does not readily translate into practice. The contribution of this work is therefore not a high-performing model but a concrete demonstration of the scalability and stability challenges that arise when unifying these paradigms.
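To make the idea of attention smoothing concrete, the following is a minimal NumPy sketch, not the SG-MuRCL implementation. It assumes a simplified ABMIL-style scorer (`tanh` projection followed by a softmax over patches) and a hypothetical smoothing operator that blends each patch's attention weight with the mean over its 4-connected grid neighbours. The function names, the mixing weight `lam`, and the grid-based adjacency are illustrative assumptions; the abstract does not specify how the graph or the smoothing operator is built.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def grid_adjacency(coords):
    """Row-normalised adjacency with self-loops for 4-connected grid patches.

    coords: (n, 2) integer grid positions of the patches (assumed layout).
    """
    coords = np.asarray(coords)
    # Pairwise Manhattan distance between patch positions.
    dist = np.abs(coords[:, None, :] - coords[None, :, :]).sum(axis=-1)
    adj = (dist <= 1).astype(float)  # 0 = the patch itself, 1 = direct neighbour
    return adj / adj.sum(axis=1, keepdims=True)


def smoothed_attention_pool(features, coords, V, w, lam=0.5):
    """Attention pooling with neighbourhood smoothing of the weights.

    features: (n, d) patch embeddings; V: (d, h) and w: (h,) are scorer
    parameters (random here, learned in practice); lam in [0, 1] controls
    how much weight is taken from spatial neighbours (hypothetical knob).
    Returns the bag embedding, the raw attention, and the smoothed attention.
    """
    raw = softmax(np.tanh(features @ V) @ w)       # independent-instance attention
    neigh = grid_adjacency(coords) @ raw           # mean attention over each neighbourhood
    smooth = (1.0 - lam) * raw + lam * neigh       # blend raw and neighbourhood weights
    smooth = smooth / smooth.sum()                 # renormalise to a distribution
    bag = smooth @ features                        # weighted sum of patch embeddings
    return bag, raw, smooth


# Toy bag: a 3x3 grid of patches with 8-dimensional embeddings.
rng = np.random.default_rng(0)
coords = [(r, c) for r in range(3) for c in range(3)]
features = rng.normal(size=(9, 8))
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)
bag, raw, smooth = smoothed_attention_pool(features, coords, V, w, lam=0.5)
```

With `lam=0` the sketch reduces to plain attention pooling over independent patches; larger values pull each weight toward its spatial neighbourhood, which is the kind of coherence the smoothed heatmaps are meant to have.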