Study identifies human proteins with segments devoid of genetic variation

By Leah Mann

The lab of Charles Sanders, professor of biochemistry and the Aileen M. Lange and Annie Mary Lyle Chair for Cardiovascular Research, published a study in Protein Science identifying all human proteins that contain at least one segment with no missense mutations among the sequences collected in gnomAD (the Genome Aggregation Database), a database of human genetic variation. A missense mutation is a single-nucleotide change that substitutes one amino acid for a different one. The resulting list could help researchers identify proteins containing segments in which mutations have such catastrophic consequences that they are filtered out of the human population.

The Sanders lab completed this work in collaboration with the labs of Lars Plate, assistant professor of chemistry and biological sciences, and David Samuels, associate professor of molecular physiology and biophysics. The co-first authors were Adam Sanders* and Jake Hermanson, a graduate student in the Department of Biological Sciences.

We sat down with Charles Sanders to learn more about this research.

What issue/problem does your research address?

Our research strives to answer the following question: Are there human genes (and their encoded proteins) that contain segments in which sequence variations are not seen in the human population?

What were your findings?

We discovered that 250 proteins contain amino acid segments that never vary in sequence in the human population, based on an analysis of the more than 100,000 human genome/exome sequences present in the gnomAD database (version 2.1). We called these “zero-tolerance proteins” because evolution seems to be intolerant of sequence variations in certain segments of these genes/proteins. These are evidently very important proteins, as mutations in them are completely filtered out of the gene pool. Certain functional classes of proteins, especially RNA-binding proteins and proteins involved in mRNA splicing, were particularly common in the set of 250.

It is important to note that, after we published this paper, we found that segments we had found to be “zero tolerant” in the gnomAD 2.1 collection are nevertheless usually found to be associated with some gene variations in other human gene variant databases. Thus, most of the identified “zero tolerant” segments are not absolutely zero tolerant in the entire human population, but rather are segments where variation is rare.
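
To make the idea of a variation-free segment concrete, here is a minimal sketch that scans one protein for stretches of residues at which no missense variant has been observed. It is an illustration only and not the method used in the paper, which relied on MTR analysis. The function name, the input format (a list of 1-based residue positions carrying a missense variant) and the `min_length` threshold are all hypothetical choices made for this example.

```python
from typing import Iterable, List, Tuple


def variant_free_segments(
    protein_length: int,
    missense_positions: Iterable[int],
    min_length: int = 30,  # arbitrary illustrative threshold, not taken from the paper
) -> List[Tuple[int, int]]:
    """Return (start, end) residue ranges, 1-based and inclusive, that
    contain no observed missense variant and span at least min_length residues."""
    varied = set(missense_positions)
    segments: List[Tuple[int, int]] = []
    run_start = None  # first residue of the current variant-free run, if any

    # Walk one position past the end so that a run reaching the
    # C-terminus is closed by the same logic as any other run.
    for pos in range(1, protein_length + 2):
        is_break = pos > protein_length or pos in varied
        if not is_break:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            if pos - run_start >= min_length:
                segments.append((run_start, pos - 1))
            run_start = None
    return segments


if __name__ == "__main__":
    # Hypothetical 100-residue protein with variants observed at three positions.
    print(variant_free_segments(100, [10, 50, 55]))
```

A segment reported by a scan like this is only variation-free relative to the variant list supplied. As noted above, a stretch with no variants in one database can still show variants in another.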

What do you hope will be achieved with the research results in the short term?

The information provided by intolerance analysis is in many ways very different from what you get from sequence homology patterns. Sequence homology patterns can tell you what is important in a protein because their associated DNA sequences are seen to be shared across species; intolerance tells you what is important because of variations that are not observed in the human population. We therefore hope to use this list of proteins to point to undiscovered but super important human biology.

What are the long-term benefits of this research?

A good number of these proteins may be involved in human reproduction or conception-to-adulthood human development, so understanding how defects in these proteins cause problems with fertility and/or human development may provide clues on how to treat them.

Where is this research taking you next?

We want to choose interesting examples of proteins containing intolerant segments and explore both the related biology and the mechanisms that are responsible for filtering the non-tolerated mutations out of the human gene pool.

Funding

This work was funded by the National Institutes of Health.

Go Deeper

The paper “Compendium of proteins containing segments that exhibit zero-tolerance to amino acid variation in humans” was published in Protein Science in August 2022.

*Adam, who is a musician and Sanders’ son, carried out a type of analysis called MTR (missense tolerance ratio) for each of the 20,000 human protein-encoding genes.