Large-scale exploration of protein space by automated NMR

Server: bioRxiv
DOI: 10.64898/2026.02.16.706194

Protein structures can now be predicted and designed at scale, yet experimental access to dynamics and conformational heterogeneity remains limited in throughput. This gap prevents a systematic understanding of how protein sequences encode motion and functional flexibility. Here, we establish a scalable experimental pipeline combining protein design, automated production, and nuclear magnetic resonance (NMR) spectroscopy to enable high-throughput characterization of protein structure and dynamics at atomic resolution. A single operator can produce and analyze hundreds of isotopically labeled proteins per week, with per-sample cost largely defined by DNA synthesis. To benchmark this approach, we experimentally characterized 384 de novo designed proteins spanning diverse regions of structure space. High-quality two-dimensional NMR spectra were obtained for 239 samples (62% of designs overall). NMR characterization confirmed that the designed proteins adopt their intended folds, and revealed unexpected local dynamics that are not captured by current computational models. Our approach establishes a foundation for data-driven modelling of sequence–structure–dynamics relationships and unlocks a new regime of statistical structural biology, where insight into protein biophysics is gained from experimental ensemble studies of suitably designed protein clusters.
