Message boards : Number crunching : 8 cores but only one Rosetta file?
Author | Message |
---|---|
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
I am wondering if this is a limitation in Rosetta or if there is another problem. I have an Intel i9-9900KF chip with 8 cores, and I can run quite a few tasks. Right now it is running 8 tasks, 7 from WCG and 1 from Rosetta. My question is, why does Rosetta only get one task? My resource shares add up to 100%: 28.57% for each of three projects and 14.29% for a fourth project that is on pause. Why is WCG getting more resource share? Is that because I downloaded the manager from WCG instead of BOINC? I think it should all be the same. Or is this a Rosetta limitation, where they only want a computer to work on one file at a time? EDIT: So I suspended WCG and Rosetta immediately connected and downloaded more files. However, now Rosetta is taking over and WCG projects are not running. Why is the system not balancing? I would like to run half of each, but the system seems to think that isn't the way it will work. Years ago, I remember I was able to allocate a percentage to each project. The percentages appear to be allocated now, but they are not being followed. Help! |
yoerik Send message Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
I am wondering if this is a limitation in Rosetta or if there is another problem. I have an Intel i9-9900KF chip with 8 cores, and I can run quite a few tasks. Right now it is running 8 tasks, 7 from WCG and 1 from Rosetta. My question is, why does Rosetta only get one task? My resource shares add up to 100%: 28.57% for each of three projects and 14.29% for a fourth project that is on pause. Why is WCG getting more resource share? Is that because I downloaded the manager from WCG instead of BOINC? I think it should all be the same. Or is this a Rosetta limitation, where they only want a computer to work on one file at a time? From my experience, it only sends one or a handful of tasks initially to test your computer, and sends more once it can establish that your computer is a trustworthy host that does valid work. As for why it overtakes: that's because it takes a few days for the manager to balance. Rosetta also has shorter deadlines than most WCG projects, and the client automatically focuses on shorter deadlines. So give it time. Don't suspend tasks; just monitor it to see if it improves on its own, and let it run. You can also go to your WCG settings and set up a device profile, which can limit the amount of work it sends for each project you're signed up with. The other thing you can do is go to BOINC Manager, Computing Preferences, and reduce the backlog of work. |
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
Maybe that was the problem, as I had not run Rosetta on a computer for years and just started getting back into folding. I did notice that the dates for Rosetta projects were about 3-4 days earlier than WCG projects, but that means I will end up doing more Rosetta than WCG, which is not really what I want. But OK. Like I said, originally it was 1 Rosetta and like 8 WCG, so I paused WCG and the result is ALL Rosetta now and 0 WCG. So I had to set no new tasks for Rosetta for now to let it finish, as Rosetta projects are notoriously slow. The time they give you is off by about 2 hours; even with a new powerful machine, it's the same as it was 5 years ago. Disappointing actually, but so be it. Credit for Rosetta projects is also quite low, but that doesn't matter. I will try not to suspend anything. I only set no new tasks for Rosetta for now, but that may influence it also? The funny thing is that the switch between projects feature in BOINC Manager doesn't work. Not one bit. I tested it; it doesn't do anything after the time I set. I actually want more WCG projects than Rosetta, like 60/40. But it is not working at all. I have cancelled a lot of Rosetta projects as it downloaded way too much. I have it set to keep about 2-3 days of extra work for each project just in case; I think that is enough. |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 389 Credit: 12,069,196 RAC: 14,319 |
Maybe that was the problem as I had not run Rosetta on a computer for years. And just started getting back into folding. If you want WCG to do more work than Rosetta then set the resource share for each project appropriately and wait 4 or 5 days for the system to settle down. Also, Rosetta WUs are not slow: the time they take is the time you set in the project preferences (or 8 hours by default). They are time limited rather than work limited. HTH |
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
I've tried to set the resources appropriately but it seems they do not work right. WCG seems no place to set this either, I must be missing it. I've never had so many cores to work with and with a high performance machine, so now I can do like 12-16 projects on an 8 core chip, but it gets way too hot, over 90C, even with just 8 running it gets to 70-90, I have to cut back. But that is a minor issue. Ahhh, so on Rosetta, I can cut back the file size they send me? I didn't know that, I was wondering what the heck that was for, now I know and now I will change it to 2-3 hours per file. Makes more sense. |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 389 Credit: 12,069,196 RAC: 14,319 |
I've tried to set the resources appropriately but it seems they do not work right. WCG seems no place to set this either, I must be missing it. I've never had so many cores to work with and with a high performance machine, so now I can do like 12-16 projects on an 8 core chip, but it gets way too hot, over 90C, even with just 8 running it gets to 70-90, I have to cut back. But that is a minor issue. Log into the WCG website and select :- >settings>device manager>device profile go right down to the bottom of the page and change the project weight to 150 then save (assuming that you’re doing WCG and Rosetta only and that Rosetta is still set to 100 then this will give you your 60/40 split). The main thing then is to wait, ignore one project or the other taking over the machine and it will settle down to your selected split over the next 4 or 5 days. As to the overheating, review your cooling arrangement, make sure all the fans are working and are not fighting each other. Make sure all of the filters are clean and free flowing. Make sure the heat sink fins are clear of dust. Big machines like yours are not designed to be run at quarter throttle so I’m guessing that it’s not working properly. |
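The 60/40 figure above follows from simple normalisation: each project's share of the machine is its weight divided by the sum of all active weights. A minimal Python sketch of that arithmetic, assuming only WCG (150) and Rosetta (100) are active; the function name is made up for illustration and is not part of BOINC:

```python
def share_fractions(weights):
    """Convert resource-share weights into fractions of total compute."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Weights taken from the post above: WCG raised to 150, Rosetta left at 100.
fractions = share_fractions({"WCG": 150, "Rosetta": 100})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%}")  # WCG: 60%, Rosetta: 40%
```

Any paused or detached project should be left out of the dictionary, since only active projects compete for the share.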
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,503,596 RAC: 24,507 |
I've tried to set the resources appropriately but it seems they do not work right. WCG seems no place to set this either, I must be missing it. I've never had so many cores to work with and with a high performance machine, so now I can do like 12-16 projects on an 8 core chip, but it gets way too hot, over 90C, even with just 8 running it gets to 70-90, I have to cut back. But that is a minor issue. No, it's a major issue: there is something wrong with the CPU cooling for your system that you need to sort out. I'm running all cores & threads (6c/12t) in mid 30°C temperatures and the CPU is only around 70°C with the fans around 50%. Ahhh, so on Rosetta, I can cut back the file size they send me? I didn't know that, I was wondering what the heck that was for, now I know and now I will change it to 2-3 hours per file. Makes more sense. Or better still, since you are running more than one project, set a small cache size and then let things settle down over a week or 4. Since you haven't run it for a while, it will take some time for the estimated completion times to settle down, and for your resource share settings to be honoured. Under Other: store at least 0.4 days of work; store up to an additional 0.02 days of work. Grant Darwin NT |
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
I've tried to set the resources appropriately but it seems they do not work right. WCG seems no place to set this either, I must be missing it. I've never had so many cores to work with and with a high performance machine, so now I can do like 12-16 projects on an 8 core chip, but it gets way too hot, over 90C, even with just 8 running it gets to 70-90, I have to cut back. But that is a minor issue. Ok, yeah I figured it out, but right now it seems to be half working right without doing that. But yes, I see the place on WCG now. Thank you. What do you mean big machines like mine are not designed to run at 25% throttle? My machine has 3 intake fans in the front. One rear exhaust fan. One top exhaust fan. And a push/pull heat sink cooler master cpu fan. I think it is already quite good. If I turn on the AC in my office I can get the temps down in the low 80s. But these are max temps not sustained as I do not run my cpu 100%. The most I have done with AC on is 85% of cpu at 85% of the time. When I am not around I lower them both to between 40-50%. And when AC is off, I might go lower. Summer is coming and it gets rather hot. Inside office temp now is around 23C, in summer that will reach 30C without AC at which point I will have to stop using it or only use 10-25% load.
Ok, I have messed with this, but even having set it to 2 hours per file, what is downloaded still shows as 6-8 hours per file. Perhaps it still needs more time to adjust. Also, I have noticed that BOINC is not getting my credit. I am running a GPU project, and BOINC has not added the credits after 48 hours. WCG and Rosetta credits seem to have been added already, so I am not sure what is going on with the GPU project. |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 389 Credit: 12,069,196 RAC: 14,319 |
Instead of saying that your machine is not designed to run at quarter throttle I should have said that your machine is designed to run at 100% without overheating. If it is overheating so badly at far less than 100% usage, it is likely that there is a problem with the cooling system. The setup you describe sounds good, but make sure it is not clogged up with dust, for example. As to the file size, this will not change with the preferred run time. Rosetta sends a work unit; your machine will then process the work unit for approximately the time you have set and then send all of the results it has managed to generate back to base. The estimated time remaining will initially stay at 7-8 hours but will correct itself over the next few days as the system settles down. BOINC not getting your credits could be one of two things: either you have not given that project permission to export your credits to the account manager or, like Climate Prediction, that project might not generate credits on a daily basis. It could, for example, be weekly. |
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
Instead of saying that your machine is not designed to run at quarter throttle I should have said that your machine is designed to run 100% without overheating. If it is overheating so badly at far less than 100% usage it is likely that there is a problem with the cooling system. The setup you describe sounds good but make sure it is not clogged up with dust for example. Hi, and thanks for the clarification. I thought maybe that is what you meant. Actually to run at 100% without getting to 90C for long periods of time I think is impossible. The 9900KF chip seems to run hotter. When nothing is running or just normal things, my system runs at 25-30C which is fantastic. I would rather it not get to around 90C though, I know the Intel chip can sustain a higher temp before it starts to throttle, but I would rather keep this system for 5-10 years as it's just new. Everything is clean. There is a small amount of dust at the bottom, but I do not clean my system every month. No build up of dust in the cpu fans or anywhere else. The case is huge too, which allows for plenty of air flow, no blockages between fan and cpu and exit. Because I am using the long Nvida GPU I needed the bigger case. Everything is quite clean, including psu filter. I will see what happens for the run time issue in the next few days. Rosetta did not download any new cases until I closed Boinc manager and reopened it. Still estimating about 5-6 hours now. We will see what happens over the next few days. The project in question is collatz, everything is being uploaded to the project, I can see all the results, but to Boinc, something is off, and it may be like you said, that it may not be reported right away. I will wait a few days. but if it ends up not being connected, I am not sure how to deal with that as every project was added the same to the manager... |
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
I think I found the problem with the other project not registering with the online stats, but I don't know how to deal with it. In the online BOINC manager I tried to link Collatz with the standalone PC manager, but found the username and password were wrong. So I went back here to try to change my email address, and it seems I already have an account. So how do I link the two together and just use the original one with all my new info? It seems I probably cannot, I don't know. Can BOINC management assist? Or am I SOL? |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2114 Credit: 41,102,039 RAC: 21,505 |
I did notice that the dates for Rosetta projects were about 3-4 days earlier than WCG projects, but that means I will end up doing more Rosetta than WCG, which is not really what I want. But... Ok.. Like I said, originally it was 1 Rosetta and like 8 WCG, so I paused WCG and then the result is ALL Rosetta now and 0 WCG, so I had to set no new tasks for Rosetta right now to let it finish, as Rosetta projects are notoriously slow, the time they give you is off by about 2 hours, even with a new powerful machine, it's the same as it was 5 years ago. Disappointing actually, but.. so be it. Credit for Rosetta projects is also quite low. But matters not. Most of that's not true. If you only just added Rosetta back in, there'll be a debt to Rosetta compared to other projects, so it'll bias towards Rosetta for a while until you return some tasks. That's all. I will try not to suspend anything, I only put no new tasks for Rosetta for now, but that may influence it also? The funny thing is that the switch between projects feature in Boinc manager doesn't work. Not one bit. I tested it, it doesn't do anything after the time I set. If you NNT Rosetta, BOINC won't see the debt being 'repaid', so it will just call for more Rosetta over and over again. That's the opposite of what you want. Resource share changes also aren't instantaneous. Give it a chance: a whole week at least. Keeping 2-3 days of extra work also doesn't help. About 1.6 days should be the maximum to get tasks back in time for them to be useful to Rosetta. And I'm not sure about this one, but I think resource shares would be decided based on CPU time, so once all the debt is filled, the shorter WCG tasks compared to 8hr default Rosetta tasks mean you'll get twice as many WCG tasks as Rosetta, even with a 50/50 resource share. So return Rosetta task runs to the 8hr default. In short, if you made fewer adjustments you'd get what you want by default.
All the adjustments you're making are doing the opposite of what you want. |
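The "debt" idea described above can be illustrated with a toy model. This is only a sketch, not BOINC's actual scheduler: it assumes each core-hour goes to whichever project is furthest below its resource share, and the starting history (100 core-hours of WCG, none of Rosetta) is invented for the example.

```python
def simulate(shares, hours_done, slots):
    """Toy model: give each core-hour to the project furthest below its share.

    Illustrates the 'debt' idea only; this is not BOINC's real algorithm.
    """
    total_share = sum(shares.values())
    done = dict(hours_done)
    order = []
    for _ in range(slots):
        total_done = sum(done.values()) + 1  # include the hour being assigned
        # Deficit = hours the project is owed minus hours it has received.
        deficit = {
            p: shares[p] / total_share * total_done - done[p] for p in shares
        }
        pick = max(deficit, key=deficit.get)
        done[pick] += 1
        order.append(pick)
    return done, order


shares = {"WCG": 60, "Rosetta": 40}
# Hypothetical history: WCG has run for a while, Rosetta was just re-added.
done, order = simulate(shares, {"WCG": 100, "Rosetta": 0}, slots=400)
print(order[:5])  # the newly added project gets all the early hours
print(done)       # totals end up close to the 60/40 split
```

In this model the newly added project takes every slot until its deficit is worked off, and only then does the mix settle to the configured split. That matches the advice to leave things alone for several days before judging the balance.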
Sid Celery Send message Joined: 11 Feb 08 Posts: 2114 Credit: 41,102,039 RAC: 21,505 |
I've tried to set the resources appropriately but it seems they do not work right. WCG seems no place to set this either, I must be missing it. I've never had so many cores to work with and with a high performance machine, so now I can do like 12-16 projects on an 8 core chip, but it gets way too hot, over 90C, even with just 8 running it gets to 70-90, I have to cut back. But that is a minor issue. This is not right again. By changing the run-time down, you download the same files (as I understand it) but you reduce the results within the completed task so the file you return is smaller - where "smaller" is equivalent to "less productive and lower credit and lower debt repaid so more tasks downloaded that you say you don't want" As to the overheating, review your cooling arrangement, make sure all the fans are working and are not fighting each other. Make sure all of the filters are clean and free flowing. Make sure the heat sink fins are clear of dust. Assuming no over clocking and with the high quality case fans you have, the temperatures should be half. Are the fans set low or not set to respond to temperature somehow? I use Corsair and it has software to enable me to ensure fans are running to a better profile [Quiet/Balanced/Extreme] and the defaults were too low for me. Your cooling should definitely perform better. Also, WCG tasks always run much hotter for me. The only project that throttles my CPU. Ahhh, so on Rosetta, I can cut back the file size they send me? I didn't know that, I was wondering what the heck that was for, now I know and now I will change it to 2-3 hours per file. Makes more sense.Or better still since running more than one project, set a small cache size & then let things settle down over a week or 4. Since you haven't run it for a while, it will take some time for the Estimated completion times to settle down, and for your Resource share settings to be honoured. 
The task runtime will take time to adjust, but you're really doing crazy-wrong stuff here. Even if you cut down your task buffer, by reducing your task runtime you're multiplying the tasks you call down by 4. 0.40 + 0.02 days is about 10 hours of work per CPU. With 8 cores that's 40 x 2hr Rosetta tasks. And at a 50% resource share it's 20, plus the 8 tasks you might be running, which makes 28. Return your runtime to 8hrs and you'll pretty much only have your 8 running tasks. And before you complete them you'll grab another 8, but with a 3-day deadline there's plenty of time to grab and run quite a few WCG tasks, which is what you prefer. |
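The buffer arithmetic in the post above can be checked with a few lines of Python. This is a rough sketch that assumes the client fills the buffer purely from the configured runtime; in practice BOINC uses its own runtime estimates, so real counts will differ. The function name is hypothetical.

```python
import math


def tasks_to_fill_cache(cache_days, cores, task_hours, share=1.0):
    """Estimate how many tasks of a given length fill the work buffer."""
    core_hours = cache_days * 24 * cores * share
    return math.floor(core_hours / task_hours)


cache_days = 0.40 + 0.02  # "store at least" plus "additional" days of work
print(tasks_to_fill_cache(cache_days, cores=8, task_hours=2))             # 40
print(tasks_to_fill_cache(cache_days, cores=8, task_hours=2, share=0.5))  # 20
print(tasks_to_fill_cache(cache_days, cores=8, task_hours=8, share=0.5))  # 5
```

Under these assumptions, going from a 2-hour to an 8-hour runtime cuts the buffered Rosetta tasks from about 20 to about 5 at a 50% share, which is the 4x effect described above.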
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I always describe that the analysis of resource shares should be looking across a hundred hours of runtime, not a hundred minutes. As for "WU size" (one way of referring to the runtime preference that can be configured via the profile pages on the project website), all WUs are the same "size". The same number of bytes are downloaded regardless of the runtime preference. Think of a WU like a one thousand piece puzzle that must be assembled. You don't really know how long it will take you to put it together when you start. And if I time you today, and have you do it again tomorrow, there could be a fair degree of variability in how long it takes you. So the WU is sent with whatever number of pieces the puzzle requires, and your machine basically assembles the puzzle many times. As many times as it can complete within the runtime preference. Don't take my analogy too far. The analogy implies you reach the same point of completion each time. These protein models do not work that way, each model and result is different. I am simply trying to give a tangible, visualization that might help explain various observations of running WUs. Rosetta Moderator: Mod.Sense |
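The puzzle analogy can be sketched as a time-limited loop: the same work unit keeps producing models until the runtime preference is reached. The code below is an illustration only, not Rosetta's implementation. The per-model durations are invented, and the rule "start another model only while elapsed time is below the target" is an assumption.

```python
def run_work_unit(model_minutes, runtime_preference_minutes):
    """Count models completed before the runtime target is reached.

    model_minutes: simulated duration of each successive model (made-up data).
    A new model is started only while elapsed time is below the target.
    """
    elapsed = 0.0
    completed = 0
    for minutes in model_minutes:
        if elapsed >= runtime_preference_minutes:
            break
        elapsed += minutes
        completed += 1
    return completed, elapsed


# Same "download" in both cases; only the runtime preference changes.
durations = [35, 50, 40, 45, 30, 55, 40, 45, 35, 50, 40, 45]
print(run_work_unit(durations, 120))  # 2-hour preference -> (3, 125.0)
print(run_work_unit(durations, 480))  # 8-hour default    -> (12, 510.0)
```

The input is identical in both runs, so the download size does not change. What changes is how many results come back, and the last model can overrun the target slightly because it is allowed to finish.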
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,503,596 RAC: 24,507 |
Hi, and thanks for the clarification. I thought maybe that is what you meant. Actually to run at 100% without getting to 90C for long periods of time I think is impossible. The 9900KF chip seems to run hotter. When nothing is running or just normal things, my system runs at 25-30C which is fantastic. I would rather it not get to around 90C though, I know the Intel chip can sustain a higher temp before it starts to throttle, but I would rather keep this system for 5-10 years as it's just new. Everything is clean. Then there is something wrong with the fitting of the heatsink to the CPU (the number of times a CPU has been fitted without the extremely thin protective bit of plastic being removed..., or too much heatsink compound has been applied...). Even a poor heatsink with a half decent fan should be able to keep that CPU to 80°C or below with it working 100% on all cores & threads. What is the ambient temperature there? Even with that at around 40°C a very good air cooler would keep that CPU around 80°C, even a basic water cooler likewise. A good water cooler, 70°C even with high ambient temperatures (35°C+). Yes, the 9900 series run hot; they are clocked to within an inch of their life. But with halfway decent cooling it shouldn't get anywhere near 90°C. Grant Darwin NT |
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
To all, because there are too many to quote now: thank you for your responses. The way the Asus MB deals with temps now is different than before. Frankly I hate the new AI software it provides, as there is no way to manually adjust fan speed. What happens is that as the temperature goes up, the fans automatically pulse for a few seconds and then back off as the temps drop, and so on. What I notice is that if I use the AC in my office and keep it at around 25/26, the temps stay lower, but if I switch off the AC the machine gets really hot. I am not overclocking my chip at all; I feel it is fast enough. But running CPU plus GPU together creates a lot of heat. The fans work fine; I can feel very hot air being pushed out of the system. I have also tried to leave almost a foot of empty space behind my machine to allow for heat dissipation. I just cleaned out my machine; there was a bit of dust in the heat sink. I then sealed off all intake holes with 3M Filtrete to keep the dust and pet hair out of my system. So here is what I noticed. Someone was right: my system is designed to be run at full, not at half. The temps go up if I run at 25%, 50% and even 75%. The problem is that I am often actually using the system, and during times I am using it BOINC stops working. The only way to stop that from happening is by limiting how much work is being done while I am using it, but then the temps go up. I had it running at 95% of CPUs and 95% of the time and the temps did not go over 80C. There might have been a spike or two, but not so bad. By running it less than that, the temps would go up to 80-85C. Part of the problem is the way the fans are managed by ASUS. They only increase speed when the temps go up, a balance between cooling and noise. But this creates other problems such as on/off fan speed and heat spikes. Things get quieter and temps stay down when I run everything more. It is so odd. The fans run at quiet speed and temps stay under 80C at 95%.
Averages are around 75C, with an occasional spike. But again this is with AC at 25C. Mainboard temp runs around 37-40C with AC on. GPU is running around 64C. GPU is a GeForce RTX 2060. I am not sure whether there is a glue problem or not. I selected all the components and had an assembler put it together. They are familiar with how much glue to use, etc. But perhaps there is still a problem, I don't know. Trying to attach photos of CPUID here, but not sure how to get the image here. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,503,596 RAC: 24,507 |
The way the Asus MB deals with temps now is different than before. Frankly I hate the new AI software it provides as there is no way to manually adjust fan speed. What happens is that as temperature goes up, the fans automatically will pulse in for a few seconds and then back off as the temps drop, and so on. I had similar issues with my Gigabyte motherboard. In the BIOS fan settings was an option called Fan interval, with 1, 2 or 3 as options. Selecting 3 sorted it out. Basically it was a delay in changing the fan speed. With a 3 second delay it wouldn't respond to short sharp changes in temperature, but it would respond to changes that lasted for more than 3 seconds. The CPU kept cool, and fan speeds changed as needed without moving up and down continually. I'd expect something similar in your board's BIOS. Trying to attach photos here of Cpuid. but... not sure how to get the image here. You need to have the photos on a hosting site, then link to the images there. Grant Darwin NT |
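The fan-interval behaviour described above amounts to a delay filter: the fan only changes speed when the temperature has stayed on the other side of a threshold for several consecutive readings. The sketch below is purely illustrative and is not any vendor's firmware. It assumes one temperature sample per second, a single 80°C threshold, and two fan levels.

```python
def fan_states(temps, threshold, hold):
    """Return 'high' or 'low' per sample, switching only after `hold`
    consecutive samples on the other side of the threshold."""
    state = "low"
    streak = 0
    states = []
    for temp in temps:
        wanted = "high" if temp >= threshold else "low"
        if wanted != state:
            streak += 1
            if streak >= hold:
                state = wanted
                streak = 0
        else:
            streak = 0  # a brief spike or dip resets the counter
        states.append(state)
    return states


# One reading per second (assumed): a 1-second spike, then a sustained rise.
temps = [60, 82, 61, 60, 83, 84, 85, 86, 62, 61, 60]
print(fan_states(temps, threshold=80, hold=3))
```

With `hold=3`, the single 82°C reading is ignored, while the sustained run above 80°C ramps the fan up on its third sample and the fan drops back only after three cool readings. Setting `hold=1` reproduces the constant revving complained about earlier in the thread.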
Jeff Send message Joined: 19 Jan 06 Posts: 17 Credit: 36,174 RAC: 0 |
So basically my machine is operating normally.... the fan speed does change, you can hear it rev up, but when it revs it's like a wave. It's annoying. I worry all that surge in power continually over the day and months and years is going to take its toll on my hardware. I don't know, am I wrong here? It runs normally, quiet for like 5 seconds, temps go up, revs to high speed for 1 sec, then drops down again and temps fall. So cpu temps do get to 80+ for 1sec then drops to 50-60-70C range. I assume it is operating normally and there is not much else I can do. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Trying to attach photos here of Cpuid. but... not sure how to get the image here. See the sample img tag in the description of BBCodes. Rosetta Moderator: Mod.Sense |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2114 Credit: 41,102,039 RAC: 21,505 |
I had it running at 95% CPUs and 95% time and the temps did not go over 80C, might have been a spike or two, but not so bad. By running it less than that, the temps would go up to 80-85C. Part of the problem is the way the fans are managed by ASUS. They only increase speeds when the temps go up, a balance between cooling and noise. But this creates other problems such as off/on fan speed and heat spikes. Problems arise in task completion if CPU Time is not at 100%. Knock the running CPUs down a bit if you like to compensate, but you seem to have a margin now. I had similar issues with my Gigabyte motherboard. In the BIOS fan settings was an option called Fan interval, with 1, 2 or 3 as options. Agreed, there'll be something in there. Personally I keep mine on extreme all the time, but I'm in a noisy environment so it doesn't bother me. |
©2024 University of Washington
https://www.bakerlab.org