Write a PREreview

Identifying key residues in intrinsically disordered regions of proteins using machine learning

Published
Server: bioRxiv
DOI: 10.1101/2022.12.09.519711

Residues that are conserved in alignments of homologous protein sequences are typically structurally or functionally important. For intrinsically disordered proteins (IDPs) or proteins with intrinsically disordered regions (IDRs), however, alignment often fails because they lack a stable three-dimensional structure to constrain their evolution. Although their sequences vary, the physicochemical features of IDRs may be preserved to maintain function. A method that retrieves common IDR features may therefore help identify functionally important residues. We applied unsupervised contrastive learning to train a self-attention neural network model on human IDR orthologs. During training, parameters were optimized so that sequences within an ortholog pair matched each other but not other IDRs. The trained model successfully identifies critical residues previously reported in experimental studies, especially those that form an overall pattern (e.g. multiple aromatic residues or charged blocks) rather than short motifs. This predictive model can therefore be used to identify potentially important residues in other proteins.
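To make the training objective described above more concrete, here is a minimal sketch of one common contrastive loss (InfoNCE) applied to ortholog pairs. The abstract does not state which loss the authors used, so the choice of InfoNCE, the cosine similarity, the temperature value, and the names `cosine` and `info_nce_loss` are all assumptions made for illustration. The embeddings would come from the self-attention encoder, which is not shown here; toy two-dimensional vectors stand in for them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs (illustrative only).

    anchors[i] and positives[i] are assumed to be embeddings of the two
    members of one ortholog pair; positives[j] with j != i plays the role
    of "other IDRs" that the anchor should not match.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true ortholog.
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings (hypothetical values, not taken from the paper).
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.1, 0.9]]      # each anchor is close to its own ortholog
mismatched = [[0.1, 0.9], [0.9, 0.1]]   # each anchor is close to the wrong IDR

loss_matched = info_nce_loss(anchors, matched)
loss_mismatched = info_nce_loss(anchors, mismatched)
print(loss_matched < loss_mismatched)
```

Minimizing a loss of this kind pushes the embeddings of an ortholog pair together and pushes embeddings of unrelated IDRs apart, which is the behavior the abstract describes. How the authors' model turns that into per-residue importance is not covered by this sketch.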

Availability and implementation

The training scripts are available on GitHub (https://github.com/allmwh/IFF). The training datasets have been deposited in an Open Science Framework repository (https://osf.io/jk29b). The trained model can be run from the Jupyter Notebook in the GitHub repository using Binder (mybinder.org). The only required input is the primary sequence.

You can write a PREreview of Identifying key residues in intrinsically disordered regions of proteins using machine learning. A PREreview is a review of a preprint and can range from a few sentences to an extensive report, similar to a peer-review report produced for a journal.