How protein translation evolved from a simple beginning to its complex and accurate contemporary state is unknown. Aminoacyl-tRNA synthetases (AARSs) define the genetic code by activating amino acids and loading them onto cognate tRNAs. As such, their evolutionary history can shed light on early translation. Using structure-based alignments of the conserved core of Class I AARSs, we reconstructed their phylogenetic tree and ancestral states. Unexpectedly, AARSs that charge amino acids assumed to have emerged later – such as TrpRS and TyrRS or LysRS and CysRS – appear as the earliest splits in the tree; conversely, AARSs that charge abiotic, early-emerging amino acids, e.g. ValRS, seem to have diverged most recently. Furthermore, the inferred Class I ancestor (excluding TrpRS and TyrRS) lacks the residues that mediate selectivity in contemporary AARSs, and appears to be a generalist that could charge a wide range of amino acids. This ancestor subsequently diverged into two clades: “charged” (which gave rise to ArgRS, GluRS, and GlnRS) and “hydrophobics”, which includes CysRS and LysRS as its outgroups. The ancestors of both clades retain a broadly accepting pocket that could readily diverge into the contemporary, specialized families. Overall, our findings suggest a “generalist-maintaining” model of Class I AARS evolution, in which early statistical translation was kept active by a generalist AARS while a specialized, accurate translation system evolved.
Significance
Aminoacyl-tRNA synthetases (AARSs) define the genetic code by linking amino acids with their cognate tRNAs. While contemporary AARSs leverage exquisite molecular recognition and proofreading to ensure translational fidelity, early translation was likely less stringent and operated on a different pool of amino acids. The co-emergence of translational fidelity and the amino acid alphabet, however, is poorly understood. By inferring the evolutionary history of Class I AARSs, we found seemingly conflicting signals: the oldest AARSs apparently operate on the youngest amino acids. We also observed that the early ancestors had broad amino acid specificities, consistent with a model of statistical translation. Our data suggest that a generalist AARS was actively maintained until specialization was complete, which resolves this apparent age paradox.