Message boards : Number crunching : Credit granted pending stack
Author | Message |
---|---|
JohnH Send message Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
All my results from units completed today show pending in credit granted. Oldest over 12 hours ago. Is there a problem? |
Scott Send message Joined: 11 Apr 16 Posts: 3 Credit: 12,136,183 RAC: 0 |
Having the same problem. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
Same here. - not validated - problems with download (android request). I returned yesterday to rosie, but there are some problems... :-( |
JohnH Send message Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
A.O.K. 2day |
sinspin Send message Joined: 30 Jan 06 Posts: 29 Credit: 6,574,585 RAC: 0 |
Same here with all results of one machine. The results from today from my other machine have got a crazy validation error: "Too many error results" and "Too many total results". |
JohnH Send message Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
Stacking up pending credit again...anybody know why this happens? All servers show green. |
BarryAZ Send message Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
My guess is that the processors are overloaded with all the android / smartphone..... stuff ..... that the project has put in play. This change does seem to have created havoc. It also might cause a bit of end user shift over to other projects which are not in overwhelmed mode. I suspect it is all of a piece at Rosetta: 1) The explosion of credit pendings. 2) The frequent 24 hour back offs when trying to report work. 3) The lack of available work for the very large 'old style' user workstations. For me the approach has been a change currently for 'no new work' with Rosetta and a shift over to WorldGrid, PrimeGrid, and POEM (projects I have long worked with). I support an array of projects, and when one goes into moribund mode and then dies (as Spinhenge did some years ago), or one makes a choice to redirect out of the shared computing world (as Malaria has done this year), or one chooses to try to tap a different universe of processing power, thus resulting in serious issues for old school users (which is where Rosetta appears to be at the moment), I get to shift to other projects. In reply to JohnH: "Stacking up pending credit again...anybody know why this happens? All servers show green." |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
In reply to BarryAZ: "My guess is that the processors are overloaded with all the android / smartphone....." Those are the CPUs of the DB server: a 9-year-old 4-core CPU, very old and not very powerful.... |
krypton Volunteer moderator Project developer Project scientist Send message Joined: 16 Nov 11 Posts: 108 Credit: 2,164,309 RAC: 0 |
Thanks for the reports! We've restarted the project this morning. |
BarryAZ Send message Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
I suspect periodic restarts help - though it seems that the underlying issue might simply be the serious stress the new focus on Android devices is causing. I note that pendings are building up again. Also, a problem which has existed for some time is that when work units don't report (due to the server being too busy), some setting on the server (and not the client side) *immediately* sends back a signal to have the client wait *24 hours* before reporting again. This second issue *is* project specific. If one is attending to the client workstation, a *manual* update can be pushed and it will almost always succeed, but the idea is not to force manual intervention. Again, I have only encountered this *automatic* 24 hour deferral on this project. Note, I've been an active participant with this project now for nearly 10 years and these past several months have been...... challenging. In reply to krypton: "Thanks for the reports! We've restarted the project this morning." |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Other than a single message that mentions Android when the project is unable to deliver work to your machine, why do you believe that these devices are sucking the life out of the servers? And if you believe that, why do you also say that it's a waste of time to support them as crunchers? Aren't there stats sites with accurate data somewhere that show the breakout of host machines by type? I believe it is just as simple as this: 2-3,000 new hosts per day for two months is one heck of a lot of mouths to feed, and there are some growing pains here as adjustments are made. These are PC hosts coming primarily from Charity Engine, which graciously attaches and serves R@h tasks when they are short of tasks themselves. If you look up the highest host numbers and see that the host name starts with "ce", these are coming from Charity Engine. Rosetta Moderator: Mod.Sense |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2117 Credit: 41,157,280 RAC: 15,934 |
In reply to Mod.Sense: "Other than a single message that mentions Android when the project is unable to deliver work to your machine, why do you believe that these devices are sucking the life out of the servers? And if you believe that, why do you also say that it's a waste of time to support them as crunchers? Aren't there stats sites with accurate data somewhere that show the breakout of host machines by type?" Agreed. It only mentions android tasks because that's all there is - and recognises they're no good for a PC. The 2 major issues imo are: 1) The ability to make sufficient work for download. 2.5m queued but only 50k available for download, which seems a pretty constant figure, so I assume it's the stock of available android tasks. PC tasks are just constantly wiped out the moment they're created. I bet there's a hefty backlog in everyone's buffer. 2) The immediate 24hr back-off when no tasks are available. Remotely checking my main PC at home, I see I've got nothing and no uploads for 20+ hours, but my WCGrid tasks have now filled the entire buffer. If an extra million tasks were magically made available, they'd all get wiped out too. Extra resources need to be permanently thrown at make-work to meet demand. Rosetta's a victim of its own success right now. Best not to throw it away - and not to piss people off in the process |
BarryAZ Send message Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
OK -- point taken as to the source of the servers being under stress. We can agree that they are under stress though, and we can also agree that the automatic 24 hour back off (which again, I believe to be a server specific setting) is problematic in that in order to report work, manual intervention seems to be needed. Perhaps we can also agree that the last few months have been..... challenging for the project, largely because of the stress on the server. For me, at this point, I'm doing my part to reduce the stress on the server as I've set most of my workstations for 'no new work'. When life settles down, I will of course revert to my defaults there. In reply to Mod.Sense: "Other than a single message that mentions Android when the project is unable to deliver work to your machine, why do you believe that these devices are sucking the life out of the servers? And if you believe that, why do you also say that it's a waste of time to support them as crunchers? Aren't there stats sites with accurate data somewhere that show the breakout of host machines by type?" |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
In reply to BarryAZ: "OK -- point taken as to the source of the servers being under stress." We know the servers are VERY old, so there are two solutions: 1) Change the servers (new HW, updated OS, etc.), but the admins seem NOT to be interested. 2) Weekly maintenance, like the "Seti@Home Tuesday". |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Tasks, as they are created, are not for a specific platform. I misled myself at one point and implied otherwise in some posts recently. But someone reminded me that the task itself doesn't know what platform it will land on. BOINC just gets the proper application downloaded to the host and points it to the task files. I'm not sure why the message indicates android. I agree that the 24hr backoff is something that should be examined. It would make more sense to me that the backoff match the period of time it takes to typically get more work available. I also agree that new servers would be great, but we all know that it is not always a choice you are allowed to make. They've been talking about new or additional servers, but I'm sure there are many many hurdles to actually acquiring some. I hope we can all also agree that having many posts around the boards talking about such things with wild speculations about causes and hurling insults at the project team is not very constructive. I wouldn't expect anything to change while CASP is active. I know they are doing their best to keep the work flowing as smoothly as possible. I also know I can attach to backup projects, change my resource shares, increase my runtimes, and not stress about things I cannot control. Rosetta Moderator: Mod.Sense |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,524,889 RAC: 7,500 |
In reply to Mod.Sense: "I hope we can all also agree that having many posts around the boards talking about such things with wild speculations about causes and hurling insults at the project team is not very constructive." I agree with you. But you also have to understand that we give you computational power for FREE, and we would like the admins to be more present on the forum, with news, etc. A lot of questions go unanswered: for example, the "server situation". In another thread, one of the admins said that they were thinking of changing the hardware, but that post was from 2 years ago!!! And after that, nothing more! P.S. For less than $8k you can have, for example, a Dell server with 32 cores and 64 GB of RAM. I think that's not a BIG price. |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
For all the talking heads on here complaining about lack of server upgrades, consider putting your money where your mouth is. I donate a couple hundred dollars per year to this project on top of the computing resources I contribute. Perhaps if more people donated a little money (it doesn't have to be hundreds of dollars, if everyone on these boards donated $10 it would add up quickly) perhaps it would afford the project the resources to do things like upgrade the servers, etc. Donation link is on the front page near the top-left, something to consider. I'll also mention that when you donate you are given a box to fill in what you want the money to be spent on. If the program suddenly got flooded with a hundred donations all with the message "for use for server upgrades to help with the 'project backoff' issues during peak load times" perhaps it would be more effective than being a talking head on a forum. **38 cores crunching for R@H on behalf of cancercomputer.org - a non-profit supporting High Performance Computing in Cancer Research |
Dr. Merkwürdigliebe Send message Joined: 5 Dec 10 Posts: 81 Credit: 2,657,273 RAC: 0 |
I never paid attention to that part of the site before...
That's exactly what I just did... |
BarryAZ Send message Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
Thanks for that post -- I just did as you suggested and made a donation. I first started processing for this project nearly 10 years ago. My participation level varies as I tend to reset priorities for the various projects on a quarterly basis. About 10 months ago, I moved Rosetta up in my CPU processing priorities -- I like the research and I also like the long running stability of the project. The only downsides back then were the large downloads involved for processing work (tended to add to a squeeze in some of my SSD based systems), and what I perceived, right or wrong, as a bit of standoffishness from the admin/moderator types. These days, the standoffishness no longer is an issue, but as I and others have noted there are some server issues out there. It seems those server issues are amplified by the major increase in participants the project has had. In the past year the user count has gone from 700,000 to over a million. So when I suggest that "I would do my part" to help by backing off my level of processing for Rosetta, that might have sounded a bit petulant. That wasn't my intent, rather it was a recognition that the project is dealing with the mixed blessing of a flood of new users. For me, the shift in priority to other projects is likely to be temporary, but quite easy to do. There are projects like World Grid and POEM which are doing good research work which I already have on my project processing list. In any event, I take the 'put your money where your keyboard is' comment as being spot on and have acted on that. Thanks for that suggestion. In reply to Timo: "For all the talking heads on here complaining about lack of server upgrades, consider putting your money where your mouth is." |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
@ BarryAZ and Dr. Merkwürdigliebe - Thanks for being so awesome. When I wrote the post about donating I was half prepared to be heavily flamed by the 'but we are already donating our computing power!' crowd. If the project is in need of funds, be it for upgrading the server or perhaps hiring some help for exploring CPU optimizations or GPU ports or whatever, I think looking into running official crowd-funding campaigns may be a good way to go. 'Fundraising Thermometers' do wonders for helping projects like this hit a goal :) Let's keep the donations flowing. As my original comment said, you can put your thoughts about the need for server hardware upgrades in the 'what do you want this donation to be spent on' comment box that you get to fill out when donating, and I reckon it will get a lot more official visibility from the project admins than posting aimlessly on this forum. :) **38 cores crunching for R@H on behalf of cancercomputer.org - a non-profit supporting High Performance Computing in Cancer Research |
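
Several posts above object to the fixed 24-hour deferral and suggest that the back-off should instead track how long it typically takes for new work to appear. As a rough illustration of that idea only, here is a minimal Python sketch of a capped exponential back-off. The function name, the 60-second base, and the 4-hour cap are all hypothetical values chosen for the example; this is not BOINC's or Rosetta@home's actual scheduler logic.

```python
import random

# The fixed deferral reported in the thread, in seconds (for comparison only).
FIXED_BACKOFF_S = 24 * 60 * 60


def capped_exponential_backoff(failures, base_s=60, cap_s=4 * 60 * 60,
                               jitter=0.0, rng=random):
    """Return how many seconds to wait after `failures` consecutive failed requests.

    Hypothetical policy: the delay doubles with each failure, starting at
    `base_s`, and never exceeds `cap_s`. Optional `jitter` (a fraction such
    as 0.1) randomizes the delay so many hosts do not retry at the same time.
    """
    if failures < 1:
        return 0.0  # nothing has failed yet, so no waiting is needed
    delay = min(cap_s, base_s * 2 ** (failures - 1))
    if jitter:
        delay *= 1 + rng.uniform(-jitter, jitter)
    return min(delay, cap_s)


if __name__ == "__main__":
    for n in range(1, 11):
        wait = capped_exponential_backoff(n)
        print(f"failure {n:2d}: wait {wait / 60:6.1f} min "
              f"(fixed policy: {FIXED_BACKOFF_S / 60:.0f} min)")
```

With these assumed values a host that fails once retries after a minute rather than a day, while a host that keeps failing settles at the cap, so the server is not hammered during a long outage.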
©2024 University of Washington
https://www.bakerlab.org