Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
sgaboinc Send message Joined: 2 Apr 14 Posts: 282 Credit: 208,966 RAC: 0 |
i've set target cpu run time to 4 hours. i once noted a job that ran for almost 8 hours (perhaps longer, i didn't track it), but in the end that job finished and generated a single model/decoy! i've had the other extreme, where in those 4 hours a task generates more than 600 models/decoys. one may consider the time needed to complete moderately large tasks: i've seen 'large models' that generate only 2 decoys/models in a 4 hour run time (and occasionally even larger/more complex models generate only 1 model in the same 4 hours), and the extremes exceed 4 hours and go on running. my own preference is to wait about double my set target run time; if the task hasn't completed by then, i'd sometimes terminate it if i think i may not complete the run after all, as i 'crunch' on and off, mainly during the night. i think that may sometimes be the better option, as i'd hope someone else picks up the job and completes it, so that the result is returned earlier than if it waited so long that the deadline passed. one idea, based on cpu performance: find the time needed to complete at least 1 model in, say, 95% of cases (i.e. find the longest tasks), and treat more than double that duration as a little 'too long' to wait. if i think i'm likely to continue running it again soon, a better way may be to suspend the task and set it to run again, say, the next day. |
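The rule of thumb in the post above (take the time within which about 95% of single models complete, then treat more than double that as "too long") can be sketched roughly as follows. This is only an illustration: the function name, the nearest-rank percentile method, and the sample timings are all assumptions, not anything Rosetta or BOINC provides.

```python
import math

def too_long_threshold(model_hours, percentile=0.95, factor=2.0):
    """Return a wait limit in hours: `factor` times the chosen percentile
    of observed per-model completion times (nearest-rank method)."""
    if not model_hours:
        raise ValueError("need at least one observed per-model time")
    ordered = sorted(model_hours)
    # Nearest rank: smallest value with at least `percentile` of samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return factor * ordered[rank - 1]

# Hypothetical per-model times (hours) observed on one machine.
observed = [0.01, 0.5, 2.0, 2.0, 4.0]
limit = too_long_threshold(observed)
print(f"consider giving up after about {limit} hours")
```

With only a handful of samples the 95th percentile is simply the slowest model seen, so the limit is only as good as the history behind it.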
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
In normal operation, a task should be consuming CPU whenever the BOINC Manager shows it in a "running" state. The "CPU time" (seen in the task properties, not the "elapsed" time) should not exceed your runtime preference (6hrs is the default) by more than 4 hours. At that point, the watchdog should end the task for you if it is still running. If the task is not getting CPU, that is not something the task can control: the BOINC Manager allocates the CPU. Rosetta Moderator: Mod.Sense |
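As a rough illustration of the limit described above, the check below flags a task whose CPU time has gone past the runtime preference plus the 4 hour allowance. It is a sketch for reading your own task properties, not the watchdog's actual code; the function and constant names are made up for this example.

```python
DEFAULT_TARGET_HOURS = 6.0   # default runtime preference, per the post above
WATCHDOG_GRACE_HOURS = 4.0   # allowance beyond the preference, per the post above

def watchdog_overdue(cpu_hours, target_hours=DEFAULT_TARGET_HOURS,
                     grace_hours=WATCHDOG_GRACE_HOURS):
    """True if CPU time (not elapsed time) exceeds target + grace."""
    return cpu_hours > target_hours + grace_hours

# With a 4 hour target, the limit works out to 8 CPU hours.
print(watchdog_overdue(7.9, target_hours=4.0))   # still within the allowance
print(watchdog_overdue(11.0, target_hours=4.0))  # past the allowance
```

Note that the comparison uses CPU time; a task that shows a long elapsed time but little CPU time is not getting the processor, which is a scheduling matter rather than a watchdog one.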
sgaboinc Send message Joined: 2 Apr 14 Posts: 282 Credit: 208,966 RAC: 0 |
i'm guessing that some models may be 'very complex' and thus take a very long time to run (perhaps to find even a single decoy). the dilemma is always whether the task will find a useful answer or whether it has hit a 'bifurcation', e.g. the algorithm goes into a never-ending loop, unable to find the answer. it would be a pity to stop it if, say, it has been running for 11 hours and, for all anyone knows, it might find the answer in the next hour |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 440,072; Ready to send: 56,012; In progress: 674,016. Not too bad, but coming up to the holiday season (if it's not already too late) it would be nice to bump up what's coming through to us all into the new year. Those numbers have been edging down throughout the month. Is that possible? |
sgaboinc Send message Joined: 2 Apr 14 Posts: 282 Credit: 208,966 RAC: 0 |
maybe the researchers/scientists need to start to send proteins that fold in to *merry x'mas & happy new year* :o :D lol |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 283,422; Ready to send: 25,596; In progress: 452,000. Not looking good... |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 403,009; Ready to send: 15,120; In progress: 125,240. While I'm aware fewer people will be running their machines over the holidays, I'm considering increasing my runtimes to 24hrs to eke out the tasks I have. I've never really been a positive thinker in these matters... |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 76,080; Ready to send: 53,352; In progress: 954,274. Plan going into operation tonight. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 177,014; Ready to send: 69,004; In progress: 772,920. Not sure if I'm helping or not. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 51,069; Ready to send: 72,326; In progress: 851,760. Just keeping our heads above water. I'm fully stocked with 24hr jobs atm. I'll be glad when I can cut them back. |
Steve Send message Joined: 22 Nov 15 Posts: 8 Credit: 164,345 RAC: 0 |
Hi, I'm finding that although some tasks complete OK, many more go "waiting to run" and seem to stay that way. I've aborted those that are clearly long past their deadline date, but the others just sit there with varying % done and elapsed times. Is this normal? I'd have expected long-past-deadline tasks to be dropped and cleaned up by BOINC (but maybe that takes longer than a week?). Or is there something weird about my PC? Any advice would be welcome. Thanks in advance. Steve |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Hi, "Waiting to run" is not a flaw in a task. It simply means that the BOINC Manager has decided to run something else first. It sounds like you may have several projects running, and the BOINC Manager is still getting used to the mix and may have downloaded too much work. As to the deadlines you mentioned: yes, the BOINC Manager attempts to run tasks that are at risk of missing their deadlines first. And once the deadline has passed, you may as well "abort" the task. But once things settle in, this should not happen. Does your machine run BOINC on a fairly regular schedule? How many hours per day? Rosetta Moderator: Mod.Sense |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
"I'm finding that although some tasks complete OK, many more go "waiting to run" and seem to stay that way. I've aborted those that are clearly long past their deadline date but the others just sit there with varying % done and elapsed times. Is this normal? I'd have expected long-past-deadline tasks to be dropped and cleaned up by BOINC (but maybe that takes longer than a week?). Or is there something weird about my PC?" Waiting to run only applies when other projects are prioritised ahead of Rosetta, but I notice your only other project is Malaria, which has been out of tasks for some while, so I'm wondering if you have "suspend when computer is in use" checked in Options > Computing Preferences. This should be unchecked. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
A heads up. Total queued jobs: 1,270,075; Ready to send: 110,160; In progress: 1,236,288. Looks like I can shut up now. |
Steve Send message Joined: 22 Nov 15 Posts: 8 Credit: 164,345 RAC: 0 |
Hi, Understood. I should have said "seem to stay stuck on Waiting to Run with only a small percentage of work completed". I have only Rosetta running (I did run MalariaControl for a while but found it swamped BOINC such that no other tasks would start, so I set it to run no new tasks and now only Rosetta is getting work to do). This PC runs very little other work: it's my retired desktop machine, now acting as a baby fileserver and occasional test machine in my home office, hence I decided in November to run some BOINC work on it. It is set to run BOINC tasks 24 hours a day. I left it running over Xmas and New Year and today found about a dozen unfinished Waiting to Run tasks that were past their deadlines, which I've now aborted. What I'm puzzled about is that BOINC is starting new tasks when older ones are still Waiting to Run, but I'm going to try some compute preference changes as suggested in another reply and see if that works better. Thanks for the response. Steve |
Steve Send message Joined: 22 Nov 15 Posts: 8 Credit: 164,345 RAC: 0 |
"I'm finding that although some tasks complete OK, many more go "waiting to run" and seem to stay that way. I've aborted those that are clearly long past their deadline date but the others just sit there with varying % done and elapsed times. Is this normal? I'd have expected long-past-deadline tasks to be dropped and cleaned up by BOINC (but maybe that takes longer than a week?). Or is there something weird about my PC?" Thanks for the suggestion. I've not got "suspend when computer is in use" checked, but I did have "suspend GPU ... when in use" checked, so I've cleared that and also allowed tasks to stay in memory when suspended. I'll see if that helps. I've also removed the dormant Malaria Control project (which I deactivated because it hogged the system), so BOINC only has one project to work on. Will see how that goes. Thanks for your response. Steve |
Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
A few things to consider: Where did you set your preferences? Changes made in the BOINC Manager will override any web-based settings. Double check the wording. In my version of BOINC Manager a box must be checked to keep tasks running while the computer is in use, whereas you must select the “no” radio button to achieve the same thing using web-based prefs. "What I'm puzzled about is that BOINC is starting new tasks when older ones still are Waiting to Run..." This can happen if there isn’t enough memory to continue running a particular task. BOINC will set that one aside and try another. Rosetta tasks are among the most memory-hungry tasks you will encounter in the BOINC world. So how much memory per core do you have and, more importantly, how much is BOINC allowed to use? Could computer (not BOINC) sleep/hibernation settings be coming into play? Best, Snags |
Steve Send message Joined: 22 Nov 15 Posts: 8 Credit: 164,345 RAC: 0 |
"A few things to consider:" Thanks Snags, useful input. I have used local settings and the options window confirms that it's using those (it has a button to use prefs from the web, but I haven't clicked that). The PC is a quad core with 12GB RAM, but it's running several large Java-based services, so memory typically runs around 80-90% used, though with very little swapping. However, as I'm not using the largest of those services most days, I've now stopped it (releasing around 4GB) and will only run it when I need to access it. Rosetta tasks are usually under 200MB each in Task Manager, so that should now mean there's plenty of memory available. Making the previously suggested changes seems to have improved things somewhat (only one overdue task waiting this morning), so I'll see if the latest change does any better. Best, Steve |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,158,554 RAC: 15,699 |
"A few things to consider:" I saw you had 12GB RAM so didn't expect RAM to be an issue, but now that I read this, it is likely to have been a factor. My 8 concurrent tasks typically contribute 1.5GB of the 6.5GB RAM in use, but I have 16GB RAM in total to utilise. |
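The memory discussion in the last few posts can be put into rough numbers. The sketch below estimates how many tasks could run at once given total RAM, the share already used by other software, a per-task footprint, and the core count. It is a simplification for illustration only: the function name and the `boinc_allowed_pct` parameter are hypothetical stand-ins (BOINC's real scheduler and memory preferences are more involved), and the 500 MB footprint is an assumed figure, not one reported in this thread.

```python
def runnable_tasks(total_mb, used_pct, per_task_mb, cores, boinc_allowed_pct=100):
    """Estimate concurrent tasks from free memory, a BOINC memory cap, and cores."""
    if per_task_mb <= 0:
        raise ValueError("per_task_mb must be positive")
    free_mb = total_mb * (100 - used_pct) // 100        # RAM not taken by other software
    boinc_cap_mb = total_mb * boinc_allowed_pct // 100  # share BOINC is allowed to use
    budget_mb = min(free_mb, boinc_cap_mb)
    return min(cores, budget_mb // per_task_mb)

total_mb = 12 * 1024  # the 12GB quad core described above

# About 90% of RAM in use by other services, tasks around 200MB each.
print(runnable_tasks(total_mb, used_pct=90, per_task_mb=200, cores=4))

# Same machine, but assuming a hungrier 500MB task (hypothetical figure).
print(runnable_tasks(total_mb, used_pct=90, per_task_mb=500, cores=4))
```

At 200MB per task all four cores can be kept busy even with 90% of RAM in use, but if a task's footprint grows the count drops quickly, which is one way a task could end up set aside as "Waiting to run".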
BelgianEnthousiast Send message Joined: 25 May 15 Posts: 5 Credit: 1,023,045 RAC: 0 |
Hi All, I've been running Rosetta for a while and am now encountering serious issues with near-endless or endless loops. Normal running time is 6 hours per task, and half of the WUs seem to adhere to that; however, the other half show some weird behaviour: 1. Running forever without any estimated time left, going on for 20+ hours. Examples: nkid_1_3_2016_final3_0716_00058_0043.pdb343_TG_dez_fold_SAVE_ALL_OUT_322141_663_0 and nkid_1_3_2016_final3_0692_00366_0042.pdb342_TG_dez_fold_SAVE_ALL_OUT_322134_678_0. 2. Running forever, but with an estimated time left which keeps creeping up. I don't have examples here; I aborted them after 25+ hours of running. This happens on a laptop. On my desktop it seems to work well, although I have other issues there with the scheduling of Rosetta. Could you please investigate? Many thanks in advance! Kind Regards, B.E. |
©2024 University of Washington
https://www.bakerlab.org