Additive baselines furnish no evidence for epistasis learning by MULTI-evolve
- Published
- Server: bioRxiv
- DOI: 10.64898/2026.04.23.719915
Recent work from Tran et al. (Science, 2026) introduced MULTI-evolve, a framework for protein engineering that combines single-mutant nomination via a protein language model (PLM) or a deep mutational scan (DMS), experimental single- and double-mutant characterization, and neural networks to engineer hyperactive multimutant proteins. The authors attribute the framework’s performance to “epistasis-aware modeling” and claim that their neural networks “learn the epistatic landscape” and “identify synergistic interactions” from limited double-mutant training data. Additive models, by definition, cannot represent epistasis, making them a natural null baseline for such claims. Here we show that MULTI-evolve’s multimutant predictions are almost perfectly correlated with an additive model’s across all three engineering applications (APEX, dCasRx, and HuABC2), such that the engineering of multimutants reduces to combining beneficial mutations with the largest additive effects, a standard protein engineering strategy for over four decades. We also find that MULTI-evolve’s neural networks do not outperform an additive model on held-out test set predictions, and do not even represent epistasis in their training data. Finally, we revisit a DMS benchmark finding presented as evidence of epistasis learning and show that the same pattern is expected even under a null additive model, due to an elementary statistical phenomenon; when we fit an additive model to the benchmark data, it reproduces the reported pattern. More broadly, our findings underscore the need to benchmark models for machine learning-guided directed evolution against additive null baselines before attributing performance to learned epistasis.
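The additive null baseline referred to above can be made concrete with a minimal sketch. This is not the authors' code; the mutation names and effect values below are hypothetical, chosen only to illustrate how an additive model predicts a multimutant's effect as the sum of its single-mutant effects, with no epistatic terms by construction.

```python
# Hypothetical measured effects of single mutants,
# e.g. log-activity relative to wild type.
single_effects = {"A123T": 0.8, "G45S": 0.5, "L200F": 0.3}

def additive_prediction(mutations):
    """Additive null model: a multimutant's predicted effect is the
    sum of its constituent single-mutant effects -- zero epistasis
    by construction."""
    return sum(single_effects[m] for m in mutations)

# Predicted effect of a triple mutant under the additive null.
pred = additive_prediction(["A123T", "G45S", "L200F"])
print(round(pred, 6))  # 1.6
```

Under this null, ranking candidate multimutants reduces to combining the single mutations with the largest measured effects, which is the decades-old strategy the abstract says MULTI-evolve's predictions effectively recapitulate.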