Project 11715

Moderators: Site Moderators, PandeGroup

Project 11715

Postby boboviz » Thu Sep 06, 2018 8:36 am

This project is generating a test set for what we hope will increase the power of F@h by orders of magnitude


Seems to be an ambitious project.
Could this project be "transversal" to all simulations/projects? Will all simulations benefit from the advancements?
boboviz
 
Posts: 5
Joined: Sun Jun 03, 2018 8:27 pm

Re: Project 11715

Postby rafwiewiora » Fri Sep 07, 2018 12:27 am

Well ambitious maybe in the results, but shouldn't be anything particularly hard to implement, theory is all there - see e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3637129/

We'll use it for most projects I anticipate, but not all. The adaptive schemes need a goal function - you have to choose what you want to explore most - there will be cases when we just want to go completely unbiased still.
rafwiewiora
 
Posts: 77
Joined: Mon Aug 03, 2015 8:23 pm
Location: New York

Re: Project 11715

Postby Nert » Fri Sep 07, 2018 4:43 pm

I hope you will be able to keep us informed about the progress of this project. The prospect of "orders of magnitude" improvements in processing efficiency is exciting.
Nert
 
Posts: 140
Joined: Wed Mar 26, 2014 7:46 pm

Re: Project 11715

Postby rafwiewiora » Fri Sep 07, 2018 6:52 pm

Will do for sure, should get to first trials of this in Oct, Nov maybe we'll have something on BETA.
rafwiewiora
 
Posts: 77
Joined: Mon Aug 03, 2015 8:23 pm
Location: New York

Re: Project 11715

Postby boboviz » Mon Sep 10, 2018 12:31 pm

rafwiewiora wrote:Will do for sure, should get to first trials of this in Oct, Nov maybe we'll have something on BETA.


Great!! Thank you
boboviz
 
Posts: 5
Joined: Sun Jun 03, 2018 8:27 pm

Re: Project 11715

Postby JimF » Tue Sep 11, 2018 4:07 pm

These are all good questions, but I am wondering about the hardware implications. That is, will you do more complex simulations with the present ratio of GPU to CPU work units, or will you shift over to CPU? I would think that you could have a hard time keeping all the GPU's busy. It is a pleasant problem to have, but crunchers will need to adjust accordingly.
JimF
 
Posts: 450
Joined: Thu Jan 21, 2010 2:03 pm

Re: Project 11715

Postby bruce » Wed Sep 12, 2018 4:11 am

FAH has repeatedly stated that the projected quantity of research will continue to grow faster than hardware can grow. Sort of a Moore's law for protein research.

Those who predict the hardware version cannot continue have been repeatedly proven wrong.
bruce
 
Posts: 21547
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 11715

Postby JimF » Wed Sep 12, 2018 3:39 pm

That is the past. We are dealing with orders of magnitude here, not the usual 50% improvement each year. I expect they will face some policy decisions on how to handle it.
Or, to put it another way, can they increase the complexity of the projects to match their new capability? I am sure they are wondering that themselves.
JimF
 
Posts: 450
Joined: Thu Jan 21, 2010 2:03 pm

Re: Project 11715

Postby rafwiewiora » Wed Sep 12, 2018 8:57 pm

I have been wondering about that a bit indeed - no good answers until we see how this works in practice though. The crucial thing here is that we're not only increasing the computational power this way, but (in fact this is the more important motivation for me) we also will drastically lower the amount of data we have to collect to answer the same questions. The computational power already is far beyond our still not automated (we're pushing on that front too) data analysis speed and it's a bottleneck. So my hope is that a single researcher will be able to also address an order of magnitude (say a protein family vs. a single protein) more scientific questions and the computational power will be fully used. Will know more after first experiments!
rafwiewiora
 
Posts: 77
Joined: Mon Aug 03, 2015 8:23 pm
Location: New York

Re: Project 11715

Postby tcaud » Sun Sep 16, 2018 10:12 am

rafwiewiora wrote:I have been wondering about that a bit indeed - no good answers until we see how this works in practice though. The crucial thing here is that we're not only increasing the computational power this way, but (in fact this is the more important motivation for me) we also will drastically lower the amount of data we have to collect to answer the same questions. The computational power already is far beyond our still not automated (we're pushing on that front too) data analysis speed and it's a bottleneck. So my hope is that a single researcher will be able to also address an order of magnitude (say a protein family vs. a single protein) more scientific questions and the computational power will be fully used. Will know more after first experiments!


I don't think I understand. You mean there is a surplus of computation being done relative to the ability to sort/make sense of the data? Similar to how the planet is warming because there is more CO2 being produced than the plantlife can absorb.
tcaud
 
Posts: 14
Joined: Sat Aug 18, 2018 7:31 am

Re: Project 11715

Postby bruce » Sun Sep 16, 2018 8:39 pm

I look at it this way:
1) The limitations of available computation are profound, compared to the quantity of proteins that need more study.
2) High priority projects that can be run on specific hardware configurations SHOULD always be available, but sometimes WUs from lower priority projects are distributed because of temporary server or project downtime. Those lower priority projects do get used to broaden the understanding of projects which have reached some initial minimum number of WUs.
3) From time to time, project suspensions do happen due to a number of potential reasons.
_ (a) The minimum number of WUs has been reached for a project but the scientist needs to reduce/study those data before deciding how to proceed -- often involving coordination and review by others. **
_ (b) An error has been discovered and must be fixed before wasting resources that are needed by other projects.
_ (c) While FAH uses data-center quality server hardware and server management methodology, issues do come up and we try to fix them as quickly as possible.
_ (d) etc.

** Note: Scientists, too, do have a life and must balance personal dedication to each specific project with other demands on their time. I'm sure that figuring out more efficient methods of data reduction would make them very, very happy.

(Personal note: The annual CO2 production associated with my participation in FAH is essentially zero. Solar panels on my home produce enough power that my annual electric bill is zero, give or take some small amount.)
bruce
 
Posts: 21547
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 11715

Postby tcaud » Sun Sep 16, 2018 10:13 pm

I'm a bit surprised. It sounds like the researchers are doing this almost haphazardly, like the pharma scientists who literally just create drug inhibitor chemicals and do trials with them to see if they work and lead to less resistance (this is what a neurologist explained to her class, which I was in). I would expect that the planning would be a little more... meticulous? Are they producing a large series of slightly varied proteins based on hunches/speculations and trying them out in simulated cells?
tcaud
 
Posts: 14
Joined: Sat Aug 18, 2018 7:31 am

Re: Project 11715

Postby JimF » Mon Sep 17, 2018 5:26 pm

tcaud wrote:I'm a bit surprised. It sounds like the researchers are doing this almost haphazardly,

Not at all, insofar as I can see. It is just research. If it were production, it would be done in-house by a pharmaceutical company for proprietary reasons.
JimF
 
Posts: 450
Joined: Thu Jan 21, 2010 2:03 pm

Re: Project 11715

Postby Joe_H » Mon Sep 17, 2018 6:16 pm

The Markov state modeling being used is a statistical process, so there is a certain amount of randomness involved. Various starting states are selected and the trajectory over time is modeled to find the minimum energy states and to give statistics on how frequently those states are reached by all of the different trajectories. Not all possible starting states are used. As I understand it, there are some heuristics for choosing starting states, and some starting states are excluded because they result in physically impossible conformations.

From what I have read about the approach being used here, the goal is to find methods that can be used to look at a short run of a single trajectory and determine its likelihood of having scientifically "interesting" results if continued further along the timeline. By concentrating on those higher likelihood trajectories, hopefully more useful data is created with less total computational time. But that carries with it some risk of missing some minimum energy states that would possibly have been detected on trajectories not selected for continuation.
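As a rough illustration of the idea above, here is a minimal Python sketch. It is a toy, not the scheme Project 11715 uses: the one-dimensional chain of states, the "restart from the least-visited states" goal function, and every name and parameter in it (`short_run`, `pick_restarts`, `sample`, `p_forward`, and so on) are assumptions made up for this example. The unbiased mode simply continues every trajectory, while the adaptive mode spends the same number of steps but restarts each round from the states that have been seen least often.

```python
import random
from collections import Counter


def short_run(start, n_steps, n_states, p_forward, rng):
    """One short trajectory: a random walk on states 0..n_states-1.

    Forward moves are less likely than backward moves (when p_forward < 0.5),
    so states far from 0 are rarely reached.
    """
    state = start
    visited = [state]
    for _ in range(n_steps):
        if rng.random() < p_forward:
            state = min(state + 1, n_states - 1)
        else:
            state = max(state - 1, 0)
        visited.append(state)
    return visited


def pick_restarts(counts, n_picks):
    """Hypothetical goal function: prefer the least-visited states seen so far.

    Ties are broken by state index so the choice is deterministic.
    """
    ranked = sorted(counts, key=lambda s: (counts[s], s))
    return ranked[:n_picks]


def sample(adaptive, n_rounds=20, n_parallel=10, n_steps=20,
           n_states=30, p_forward=0.3, seed=0):
    """Run n_rounds of n_parallel short trajectories and count state visits."""
    rng = random.Random(seed)
    counts = Counter()
    starts = [0] * n_parallel
    for _ in range(n_rounds):
        ends = []
        for start in starts:
            traj = short_run(start, n_steps, n_states, p_forward, rng)
            counts.update(traj)
            ends.append(traj[-1])
        if adaptive:
            picks = pick_restarts(counts, n_parallel)
            # Reuse picks cyclically if fewer distinct states than runs.
            starts = [picks[i % len(picks)] for i in range(n_parallel)]
        else:
            # Unbiased: every trajectory just continues where it stopped.
            starts = ends
    return counts


unbiased = sample(adaptive=False)
adaptive = sample(adaptive=True)
print("distinct states seen, unbiased:", len(unbiased))
print("distinct states seen, adaptive:", len(adaptive))
```

Both modes use the same total amount of simulation, so any difference in how many states are discovered comes only from where the short runs are restarted. The trade-off mentioned above shows up here too: a goal function that chases unexplored states gives up the unbiased statistics you would get from plain continuation.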

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 4178
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: Project 11715

Postby bruce » Mon Sep 17, 2018 7:33 pm

tcaud wrote:I'm a bit surprised. It sounds like the researchers are doing this almost haphazardly,

Not at all. The calculations required to model every possible state of even a single protein and identify how it can mis-fold would take more than a lifetime. The key words in the paper mentioned above are "modeling of most biologically relevant systems and timescales [are] intractable."

The MSM methodology being discussed allows scientists to concentrate the significant events into multiple short (i.e.- practical) studies while avoiding years and years of non-productive modeling. This mathematical study is focusing on enhancing that concentration without missing events that would be encountered in intractably long studies. Mathematical research is enhancing biological research.
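To make "multiple short studies" a little more concrete, here is a minimal Python sketch of the basic MSM bookkeeping: count the transitions observed in many short trajectories, normalize each row into probabilities, and iterate to get the long-run state populations. The two-state data set and the function names (`estimate_transition_matrix`, `stationary_distribution`) are invented for illustration. Real pipelines also have to cluster protein conformations into discrete states and choose a lag time, which this sketch skips entirely.

```python
def estimate_transition_matrix(trajectories, n_states):
    """Count observed state-to-state transitions and row-normalize them."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            counts[a][b] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # State never left in the data: use a self-loop so the row
            # is still a valid probability distribution.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Approximate long-run populations by repeatedly applying the matrix."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Four short, independent trajectories over two made-up states (0 and 1).
short_trajectories = [
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
]

T = estimate_transition_matrix(short_trajectories, 2)
pi = stationary_distribution(T)
print("transition matrix:", T)
print("long-run populations:", pi)
```

None of the four trajectories is long enough to say anything about long-run behavior on its own, but pooled together they give a transition matrix from which long-run populations can be estimated.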

Astronomers have a similar problem if they want to observe supernovas that only happen once in millions and millions of years.

At the molecular level, motion in a fluid is random. Read https://en.wikipedia.org/wiki/Brownian_motion
Within a single protein being studied by FAH (i.e.- at the atomic level), nothing can be studied without considering statistics, and a whole series of events must happen before the disease can develop. Would we happen to be watching when those events happened? Not likely. We would spend a lot of time seeing healthy biological processes before we actually observed bad events happening.
bruce
 
Posts: 21547
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
