Message boards : Number crunching : jgSP_01 tasks
Greg_BE | Joined: 30 May 06 | Posts: 5691 | Credit: 5,859,226 | RAC: 0
What is the run time on these tasks? I've got some that have been running for 8hrs with more than a day left to complete (as estimated by BOINC), and some that have been running 6 hours with 6-7 hours left to completion. I find it very odd that Rosetta tasks would take up to 2-2.5 days to complete. That is the equivalent of a PrimeGrid Genefer task, which takes up to 4 days running on a GPU, or an Einstein gravitational wave search, which takes a week. CPU usage also differs between these Rosetta tasks: some use 27% and some 32%. So what is the norm for these tasks, mainly in run time?
Grant (SSSF) | Joined: 28 Mar 20 | Posts: 1679 | Credit: 17,797,029 | RAC: 22,502
> So what is the norm for these tasks, mainly in run time?

That depends on your system. On a lightly used system the Run time & the Target CPU time are pretty much the same, the default being 8 hours (look at my systems to see what their CPU times & Run times are - generally within 5 minutes or less of each other).

However, your system is extremely overcommitted. You've changed your Target CPU time from the default 8 hours to 4 hours, yet it takes your system 11 hours to do 4 hours worth of work (on Universe@Home it takes you 19 hours to do 5 hours of work).

Name: pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_2xf5nv7n_1389864_1_0
Run time: 10 hours 55 min 44 sec
CPU time: 3 hours 56 min 47 sec

I suggest you have a look at Task Manager & see just what is using all of your CPU time. As you are using your GPUs on other projects, you need to reserve a CPU core/thread for each Task you are running on the GPU for each project (an app_config.xml for each GPU project is the way to go - that way each GPU Task will have a CPU core/thread to support it. If you run out of GPU work at some stage, those cores/threads will then be able to do CPU work until more GPU work comes along). That will improve your GPU processing times, and it will massively improve the CPU processing times for all of the projects using the CPU, because the CPU won't be trying to process a CPU Task & support a GPU Task at the same time on a single core/thread. 1 core/thread, 1 running process. Put 2 processes on a single core/thread and neither of them will run well at all.

Edit: a quick look at your Projects shows Moo! Wrapper in particular uses 1 CPU core/thread for each GPU Task that is running. If you reserve 1 core/thread for each GPU Task (the same for any other project that needs a full CPU core/thread) then things should improve a huge amount for all CPU projects (as well as improve your GPU processing times). For projects where the CPU time needed to support a GPU Task is much less, reserve say 0.5 (or even 0.25) cores/threads for each running GPU Task and see how things go.

NB: you would also be much better off with no cache (say 0.0 days & 0.01 additional days). If you only connect to the net every few days, or you're only working on 1 Project & it spends a lot of time down, then you need a cache if you want to keep your system busy. But if you are processing more than 1 Project, a cache just isn't necessary - if one Project is out of work, the others will get more processing time. When the down Project comes back, it'll get extra processing time to make up for the down time & to meet your Resource Share settings. And the smaller the cache, the sooner your Resource Share settings can be met.

Grant
Darwin NT
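For illustration, a minimal app_config.xml sketch along the lines described above. This is an assumption-laden example, not any project's actual file: "example_gpu_app" is a placeholder for the real application name (as it appears in that project's client_state.xml entries), and the file goes in that project's folder under the BOINC data directory.

```xml
<!-- Sketch only: reserves one GPU and one full CPU core/thread per GPU task.
     "example_gpu_app" is a placeholder; substitute the project's real app name. -->
<app_config>
    <app>
        <name>example_gpu_app</name>
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage>  <!-- one GPU per task -->
            <cpu_usage>1.0</cpu_usage>  <!-- reserve a full core/thread to support it -->
        </gpu_versions>
    </app>
</app_config>
```

For lighter-weight GPU apps, drop cpu_usage to 0.5 or 0.25 as suggested above. After saving, use Options > Read config files in BOINC Manager (or restart the client) for the change to take effect.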
Greg_BE | Joined: 30 May 06 | Posts: 5691 | Credit: 5,859,226 | RAC: 0
> So what is the norm for these tasks, mainly in run time?
> That depends on your system.

Interesting. I have a big cache because I was trying to get an LHC ATLAS task to load and I forgot to set the rest of the projects to no new tasks. I will drop my extra days of work to 0.25 after I clear out this huge backlog. It's been a long time since I tweaked anything on this project, so I have no idea when I set the processing time to 4 hours. In my BOINC Manager preferences I had an 8 hour window for processing. I thought this overrode the project preferences? I haven't run Moo in a while, and I don't always pay attention to my system. I work long hours and I can do my web surfing by GSM, so I don't always check my system each night to see what's going on in the BOINC world. Also, what you don't see is FAH running. Yeah, my system is working hard, but that's what I bought all this gear for. Work it to the max.
Grant (SSSF) | Joined: 28 Mar 20 | Posts: 1679 | Credit: 17,797,029 | RAC: 22,502
> Interesting. I have a big cache because I was trying to get an LHC ATLAS task to load and I forgot to set the rest of the projects to no new tasks. I will drop my extra days of work to 0.25 after I clear out this huge backlog.

Seriously, 0 + 0.01 will be much better - and the backlog won't clear while BOINC keeps loading up work according to your present cache settings. Set it to 0 + 0.01 now to let the backlog clear out & then see how it goes after that.

> In my BOINC Manager preferences I had an 8 hour window for processing. I thought this overrode the project preferences?

Local preferences override web-based preferences. But there are no local preferences for Project settings, so I'm not sure what you mean by an 8 hour window for processing? The only local settings are for BOINC as a whole - setting it to 8 hours means there are only 8 hours out of 24 in which BOINC can process work. So that probably explains how the backlog occurred - you had a large cache setting, then reduced the time BOINC had to process work by 2/3. Clear that 8 hour limit and let it run for as long as the system is running. It will help clear the present backlog, and reducing the cache will stop it from building up again (unless you re-limit the number of hours BOINC has to do work). The Target CPU time settings are in the Rosetta project options on your Rosetta account page.

> I haven't run Moo in a while, and I don't always pay attention to my system.

Moo is your main BOINC Project; you were returning work as of a few days ago. Given your Resource Share settings I expect it'll be downloading new work once the current batch for other Projects is done. If you want to do less work for Moo, reduce its Resource Share value significantly. That will also reduce its impact on all your other BOINC projects & FAH (even more so if you reserve a CPU core/thread to support the GPU work). I would still advise reserving at least 1 CPU core/thread for every 4 GPU Tasks that run on your other BOINC projects.

> Also, what you don't see is FAH running.

That would explain the excessive Run times for your CPU projects. How many cores/threads are being used by FAH (I'd expect 1 for each GPU Task that is running, plus any CPU Tasks that are running)? If it's 4, then in your BOINC account's Computing Preferences (or on the local system), under Usage limits, set "Use at most xxx % of the CPUs" to 75% (or whatever value matches the number of cores/threads FAH is actually using). That will leave 4 threads for FAH and 12 for BOINC projects. It should improve not only your BOINC output but FAH as well, as they won't be trying to do 2 things on 1 core/thread.

> Yeah, my system is working hard, but that's what I bought all this gear for. Work it to the max.

That's not using something to the max, it's just overcommitting it. Using it to the max is when it's putting out the maximum possible amount of work. With CPU cores/threads trying to do more than one thing at the same time it's certainly working hard, but its actual output is very much reduced (11 hours to do 4 hours of work isn't very good IMHO...).

Grant
Darwin NT
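As a rough sketch of the equivalent local preferences (a global_prefs_override.xml in the BOINC data directory), assuming the 16-thread CPU with FAH using 4 threads discussed above - the values are illustrative, not measured:

```xml
<!-- Sketch only: assumes a 16-thread CPU with FAH using 4 threads. -->
<global_preferences>
    <max_ncpus_pct>75.0</max_ncpus_pct>                         <!-- 75% of 16 threads = 12 for BOINC -->
    <work_buf_min_days>0.0</work_buf_min_days>                  <!-- store at least 0 days of work -->
    <work_buf_additional_days>0.01</work_buf_additional_days>   <!-- plus a tiny additional buffer -->
</global_preferences>
```

These correspond to the "Use at most xxx % of the CPUs" and "Store at least / up to an additional ... days of work" settings in Computing preferences, and as local preferences they override the web-based values as described above.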
Sid Celery | Joined: 11 Feb 08 | Posts: 2122 | Credit: 41,191,672 | RAC: 9,787
> What is the run time on these tasks?

I don't know the answer to this because I've been unable to pay much attention the last few weeks now that I'm back at work, but I did notice some tasks running over 8hrs in spite of regular checkpoints. When I opened the graphics window I saw they had completed 0 models and some hundreds of steps, so I left them to run. They completed properly after around 12hrs total (before the watchdog cut in) and validated OK. Unfortunately, I didn't pay attention to the task names, so I don't know if they were jgSP tasks or not.