Researchers use powerful AI tool to gain new insights into protein structures

A protein containing amino acids with fixed spiral and ribbon structures, in blue and light blue, as well as thread-like disordered regions in orange (photo by AlphaFold Protein Structure Database, CC BY 4.0)

An international team of researchers has revealed new insights about the three-dimensional structure of certain types of proteins by using the powerful artificial intelligence tool AlphaFold2.

Long molecules comprising strings of amino acids, proteins are folded into three-dimensional structures according to a strict set of rules. These myriad structures enable proteins to perform their functions. Within organisms, from bacteria to humans, they transport molecules, act as catalysts for chemical processes, operate as valves and pumps – and much more.

While AlphaFold2 has predicted the three-dimensional structure of some 200 million proteins, it has until now been unable to determine whether sections within certain proteins, known as intrinsically disordered regions (IDRs), have any structure at all – much less predict the shape of that structure.

Alan Moses (supplied image)

“This has been a long-standing debate amongst biochemists and molecular biologists – whether IDRs have fixed structure or whether they’re just ‘floppy’ parts of proteins,” says Alan Moses, a computational biologist and professor in the department of cell and systems biology in the University of Toronto’s Faculty of Arts & Science.

“We confirmed that, [while] AlphaFold2 still can’t predict the structure of IDRs very well … what it can do is tell us which IDRs are likely to have some structure – something that was previously impossible.” 

Moses is a co-author of a new paper, published in the journal Proceedings of the National Academy of Sciences, that details the research team’s findings and could lead to a better understanding of the role played by these proteins in disease and to the development of new drug treatments.

His co-authors include Reid Alderson, a post-doctoral researcher with the Medizinische Universität Graz (MUG) who formerly did post-doctoral work at U of T; Julie Forman-Kay, a senior scientist and program head of molecular medicine at the Hospital for Sick Children and a professor of biochemistry in U of T’s Temerty Faculty of Medicine; Desika Kolaric, a research assistant at MUG; and Iva Pritišanac, an assistant professor at MUG and former post-doctoral researcher in Moses’s lab.

The team’s findings are significant because AlphaFold2 wasn’t trained to predict structures in IDRs and IDRs were not included in its training data. “It’s like AI being trained to drive a car, and then trying to see if it can also drive a bus,” says Moses. “It can’t drive the bus all that well, but it can recognize that someone should be driving.”

The team is also the first to carry out this analysis systematically for all the proteins in humans and other organisms. “So, for the first time we believe we know how often it is happening,” says Moses. “This is important because biology is full of exceptions. We need to know what’s common and what’s exceptional.”

This unexpected application of AlphaFold2 demonstrates the power of using AI to solve the protein folding problem and will improve researchers’ understanding of IDRs and their role in disease.

“In the IDRs that AlphaFold2 predicts to have some structure, we’ve shown that mutations are far more likely to cause disease than mutations in other structureless IDRs,” says Moses. “This is an important advance in understanding how mutations in IDRs can cause disease, which is generally not well understood. We now believe that many of the mutations are disrupting the structure somehow.

“What’s more, because AlphaFold2 predictions are already available for all proteins, now we can say for the first time how many IDRs across the tree of life have structure. Our paper shows that bacterial IDRs are much more likely to have structure than human and animal IDRs. As far as we know, this is the first time this has been noticed and it may settle the ongoing debate about whether most IDRs have structures or not.”