Deep Learning AI and Protein Folding

rbpeake
Posts: 142
Joined: Sun Jun 15, 2008 4:39 pm
Hardware configuration: Intel® Core™ 2 Duo processor E8500, dual 3.16GHz cores, 6MB L2 Cache, 1333MHz FSB (45nm); 4096MB Corsair™ XMS2 DDR2-800 RAM; 256MB eVGA™ NVIDIA® GeForce™ 8600 GT Video Card
Location: NYC Metro Area

Deep Learning AI and Protein Folding

Post by rbpeake »

Interesting article in Nature regarding how deep learning AI techniques are progressing with regard to protein structure prediction.

https://www.nature.com/articles/d41586- ... 3-44211845
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Deep Learning AI and Protein Folding

Post by bruce »

FoldingAtHome does not attempt to predict the protein structure. The "predicted" shape may be the most probable one, but it won't be the shape associated with a disease.

Our objective is to learn about protein misfolding -- which means we determine any OTHER shape that may happen, no matter how improbable -- and then proceed to determine WHY those alternate shapes happen, how they are associated with disease, and what might be done about them.
justinrporter
Pande Group Member
Posts: 81
Joined: Fri Jan 06, 2017 9:15 pm

Re: Deep Learning AI and Protein Folding

Post by justinrporter »

Bruce said it well, but I'll add a bit of detail: generally, the AI techniques have been most successful in the pure "protein folding problem," which we could re-frame as "predict the positions of atoms in an x-ray crystal structure given the sequence." In other words, given a sequence of amino acids, return a list of x, y, and z positions for all the atoms for the conformation that results when the protein molecules are (usually) drenched in salt, (usually) highly concentrated, (usually) packed into a crystal, (usually) frozen, and (usually) blasted with x-rays.*

Most proteins don't take on just one conformation. In fact, because of the amount of energy that's available at room temperature, I would be tempted to argue actually that _no protein_ has just one conformation. Proteins are subject to what Feynman called "the wiggling and jiggling of atoms," because room temperature is pretty chaotic at the length scales we're thinking about here. I actually just saw a talk a few days ago where the speaker argued that, because of how flexible proteins are, no two proteins have ever adopted the same conformation in the history of the universe. It's thought-provoking stuff!
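To put rough numbers on "the amount of energy that's available at room temperature," here is a minimal Python sketch of Boltzmann populations. The function names and the three example energies are made up for illustration and do not come from any F@H project. It only shows that states sitting a kcal/mol or two above the lowest-energy one still get populated at room temperature.

```python
import math

# Gas constant in kcal/(mol*K); kT at room temperature is roughly 0.6 kcal/mol.
R_KCAL = 0.0019872041


def thermal_energy(temperature_k=298.15):
    """Return kT in kcal/mol at the given temperature."""
    return R_KCAL * temperature_k


def boltzmann_populations(energies_kcal, temperature_k=298.15):
    """Return equilibrium populations for states with the given energies.

    energies_kcal: relative (free) energies in kcal/mol, one per state.
    These are hypothetical inputs; real proteins have vastly more states.
    """
    kt = thermal_energy(temperature_k)
    e_min = min(energies_kcal)
    # Shift by the minimum so the largest weight is exactly 1.
    weights = [math.exp(-(e - e_min) / kt) for e in energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three hypothetical conformations at 0, 1 and 2 kcal/mol.
    for energy, pop in zip([0.0, 1.0, 2.0], boltzmann_populations([0.0, 1.0, 2.0])):
        print(f"{energy:.1f} kcal/mol -> population {pop:.3f}")
```

The higher-energy states come out less populated than the lowest one, but not negligibly so, which is the sense in which a single "folded structure" is an incomplete description.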

And it gets even "worse." One of my favorite proteins, myosin, has a cycle in which it uses a lever arm to pull on a track, detaches from the track, resets the lever arm, and then reattaches to the track. It's important for, amongst many other things, muscle contraction. So with that in mind, it might not surprise you to know that the protein has at least four states that show up in crystal structures, depending on how you make the crystal! So the question "what is the folded conformation of myosin?" is a bit like asking "which single letter is most representative of the alphabet?"... in a meaningful way, it's not even a well-formed question.

So while there really IS a ton of value in being able to predict "the structure" (to the extent there is one) from the sequence--lots of our F@H projects actually start with an experimental x-ray crystal structure and are trying to find out what's "nearby" and how quickly those states can be reached and interconvert with one another--it's not like once we can use AI to predict a folded structure, we'll all just pack things up and go home because protein biophysics is solved.

*There are also other techniques for structure determination, like NMR and CryoEM, and there's quite a lot of diversity of methodology within each of those approaches.
rbpeake

Re: Deep Learning AI and Protein Folding

Post by rbpeake »

Thanks so much for the detailed description! It's fascinating!
rafwiewiora
Scientist
Posts: 167
Joined: Mon Aug 03, 2015 8:23 pm
Location: New York

Re: Deep Learning AI and Protein Folding

Post by rafwiewiora »

Also note that work on using machine learning to sample from multiple conformations is progressing nicely, e.g. from Frank Noe: generative Markov state models (https://arxiv.org/abs/1805.07601) and Boltzmann generators (https://arxiv.org/abs/1812.01729). The future's looking like we'll be able to run a smaller amount of MD on F@h, train a model, and get more conformations from it than we saw in the limited amount of MD. That will help so much with the next big goal we'd like to tackle with F@h: rather than looking at individual interesting proteins, if I can run much less simulation per protein, looking at whole protein families becomes possible. That opens up a lot of potential knowledge about protein evolution and understanding disease in a much broader context.
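For anyone curious what "run some MD, train a model, and get more out of it" can look like in the simplest case, here is a toy sketch of a plain discrete Markov state model in Python with NumPy. To be clear, this is not the generative MSM or Boltzmann generator method from the linked papers. It assumes the MD frames have already been clustered into integer state labels, and all function names are hypothetical.

```python
import numpy as np


def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Count state-to-state transitions at the given lag and row-normalize."""
    counts = np.zeros((n_states, n_states))
    for traj in trajectories:
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1
    # Tiny pseudocount so a state with no observed exits still has a valid row.
    counts += 1e-8
    return counts / counts.sum(axis=1, keepdims=True)


def stationary_distribution(transition_matrix):
    """Left eigenvector with the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(transition_matrix.T)
    idx = np.argmax(evals.real)
    pi = np.abs(evecs[:, idx].real)
    return pi / pi.sum()


def sample_trajectory(transition_matrix, start, n_steps, rng):
    """Generate a synthetic state sequence from the fitted model."""
    states = [start]
    for _ in range(n_steps):
        row = transition_matrix[states[-1]]
        states.append(int(rng.choice(len(row), p=row)))
    return states


if __name__ == "__main__":
    # Three short, made-up "simulations" over three conformational states.
    short_runs = [[0, 0, 1, 1, 0], [1, 2, 2, 1, 1], [2, 2, 1, 0, 0]]
    T = estimate_transition_matrix(short_runs, n_states=3)
    print("stationary populations:", stationary_distribution(T))
    rng = np.random.default_rng(0)
    print("synthetic trajectory:", sample_trajectory(T, start=0, n_steps=20, rng=rng))
```

The point of the design is that many short, independent runs (the kind F@h produces) are pooled into one transition matrix, which can then be asked about long-timescale behavior that no single run covered. The machine learning approaches above aim to go further by generating conformations that were never explicitly sampled.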

You can think of proteins in these levels of 'complexity': sequence --> structure --> dynamics --> function. The article you posted tackles the first arrow; the ones I posted are starting to get at the second one. In the perfect future we'll have a model that takes a protein sequence, tells me multiple shapes of the protein, how they're related to what that protein does, and what happens to those shapes when e.g. a cancer mutation is introduced. It's probably coming faster than we all think. :)
bruce

Re: Deep Learning AI and Protein Folding

Post by bruce »

When you look at the advertising that FAH puts out, it emphasizes the possibility of developing science that leads to curing specific diseases, which is directly related to what rafel is calling "function." FAH can, in fact, evaluate function by brute force -- and that's the way FAH started out -- but it's EXTREMELY efficient to devote a portion of FAH's resources to improving the efficiency of the intermediary steps. If you look at the list of published papers, you'll find a lot are devoted to improvements in the mathematical processes that get to function more efficiently. That certainly does not negate the publication of research papers regarding specific proteins and their interactions/functions, but FAH's future looks very bright in terms of how quickly it can get to future functional results.

AI hardware helps at one point and it's likely to be able to coordinate with FAH's research so that improvements can be made to the overall research capabilities as well as functional results.
rbpeake

Re: Deep Learning AI and Protein Folding

Post by rbpeake »

A related Opinion piece, with an odd conclusion.
https://www.theguardian.com/commentisfr ... eaningless