Rosetta stops crunching

Message boards : Number crunching : Rosetta stops crunching

fjpod

Joined: 9 Nov 07
Posts: 17
Credit: 2,201,029
RAC: 0
Message 71336 - Posted: 26 Sep 2011, 22:57:33 UTC

Is it just me?? For the past week, one of my computers (dual core) has been spontaneously stopping crunching Rosetta on one core only. The timer keeps going up, but the work isn't getting done. When you look in Task Manager, you can see that 50% of the cores are idle. At first I thought something was going wrong with my computer/CPU, but now it is also happening on one of my other computers (Q6700).

Is there something wrong with the WUs?
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,135,082
RAC: 4,703
Message 71339 - Posted: 27 Sep 2011, 11:49:25 UTC - in response to Message 71336.  

No, it is more likely a setting in the BOINC client. I see you are using the 6.12.?? version on at least one of your PCs. Open the BOINC Manager (down by the clock), go to Tools, then Computing preferences, click on the processor usage tab, and see if there is a number other than zero on the line that says "While processor usage is less than [_] percent (0 means no restriction)". If there is a number, change it to zero; the default is 25. This tells your PC to keep running BOINC at the low priority setting no matter what else is running. With the default of 25 in the box, it means that when your CPU usage hits 25% remaining, or unused, BOINC stops crunching until it returns above that. This setting has been in BOINC for quite a while now, so maybe your PC is just busier doing other things lately. REMEMBER to click OK at the bottom to actually accept and make the change. This is a PC-by-PC change, not a global one.
fjpod

Joined: 9 Nov 07
Posts: 17
Credit: 2,201,029
RAC: 0
Message 71343 - Posted: 27 Sep 2011, 17:31:10 UTC

I had raised this number to 50 or 60 when I went on 6.12, because at 25 the CPUs were stopping frequently. I'll give zero a try...thanks.
fjpod

Joined: 9 Nov 07
Posts: 17
Credit: 2,201,029
RAC: 0
Message 71344 - Posted: 27 Sep 2011, 17:34:40 UTC

OOPS...I was already running on zero restriction, so that can't be it. Usually, if CPUs stop due to this restriction, they all stop, but in my case only one or two are stopping. The only way to get them going again is to shut down BOINC and restart it. When the CPUs are busy, BOINC Manager notifies you that CPUs are suspended, but not in my case. The countdown clock keeps going as if nothing is wrong.

Anybody else seeing this?
svincent

Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 71345 - Posted: 27 Sep 2011, 17:57:32 UTC

This problem has been mentioned before (in the Mini Rosetta 3.14 thread I believe). It seems to happen to me only on W7 and only on tasks with names starting with T followed by a digit number.

A workaround is to quit and restart BOINC.
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,135,082
RAC: 4,703
Message 71349 - Posted: 28 Sep 2011, 12:04:30 UTC - in response to Message 71343.  

50 or 60 would have been going the wrong way; you would want 10 or 15 to have FEWER stoppages. It is kind of reverse thinking: when the PC hits that percentage of unused CPU it stops BOINC, so at 25%, 75% of the CPU can be doing other things, but at 50% only 50% of the CPU can be doing other things before BOINC gets stopped.

But I am sorry that is not the problem!!
dcdc

Joined: 3 Nov 05
Posts: 1831
Credit: 119,526,853
RAC: 9,592
Message 71352 - Posted: 29 Sep 2011, 16:01:29 UTC - in response to Message 71349.  

I don't think that's right, Mikey. BOINC is allowed to run when other CPU usage is less than that value, so a higher value means BOINC can keep running while more other stuff is running. I am assuming it does what it says though; I haven't tested it!
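
For anyone who finds the wording confusing, here is a minimal sketch of that reading of the setting, assuming it behaves the way its label says. The function name `boinc_may_run` and its parameters are made up for illustration; this is not BOINC's actual code.

```python
def boinc_may_run(other_cpu_percent: float, threshold_percent: float) -> bool:
    """Sketch of 'While processor usage is less than X percent (0 means no restriction)'.

    other_cpu_percent: CPU usage by everything except BOINC (hypothetical input).
    threshold_percent: the number typed into the preference box.
    """
    # 0 is the special "no restriction" value: BOINC always runs.
    if threshold_percent == 0:
        return True
    # Otherwise BOINC runs only while other usage stays below the threshold.
    return other_cpu_percent < threshold_percent


# With 40% of the CPU busy with other programs:
print(boinc_may_run(40, 25))  # threshold 25
print(boinc_may_run(40, 60))  # threshold 60
print(boinc_may_run(40, 0))   # threshold 0
```

Under this reading, raising the number makes BOINC stop less often, and zero switches the check off entirely.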
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,135,082
RAC: 4,703
Message 71353 - Posted: 30 Sep 2011, 11:43:40 UTC - in response to Message 71352.  

You could be right, I always put mine at zero and just let it run.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 71359 - Posted: 1 Oct 2011, 15:11:17 UTC

Yes, there is still some quirk with the BOINC Manager where it thinks a task is active, but no CPU time is allocated to it. And yes, a complete exit of BOINC and a restart generally seems to be the best resolution.
Rosetta Moderator: Mod.Sense
fjpod

Joined: 9 Nov 07
Posts: 17
Credit: 2,201,029
RAC: 0
Message 71364 - Posted: 3 Oct 2011, 23:30:53 UTC

Good to know that I am not the only one to notice this. I was beginning to think something was wrong with my hardware. I think the WUs are defective, because once a batch gets processed, the next batch will be OK.

...and the right way to allow more CPU use is to raise the number from 10 to 20 to...80, and finally 00. The 00 really should be 100 (%) usage. The 00 is really a misnomer and counter-intuitive...but hey, who's complaining.

©2024 University of Washington
https://www.bakerlab.org