Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
rlpm Send message Joined: 23 Mar 20 Posts: 13 Credit: 84 RAC: 0 |
Signal 11 is SEGV (segmentation fault). This is typically due to a programming bug. Per stderr, looks like a few double frees as well, perhaps related. Anyone know how to report this to the boffins that write the software? Moderators? |
GLadi Send message Joined: 21 Jan 07 Posts: 3 Credit: 303,172 RAC: 0 |
-3 tasks in progress about 50% progress each It happened to me a few days ago. For some WUs, progress jumped from, say, 50% to 100% immediately after resuming (even when BOINC Manager was switching between tasks from other projects). I haven't noticed it since. BTW, I cannot see some of my tasks from 24-28 Mar 2020 in my account. |
koetjesreep Send message Joined: 24 Mar 20 Posts: 5 Credit: 495,994 RAC: 0 |
Signal 11 is SEGV (segmentation fault). This is typically due to a programming bug. Per stderr, looks like a few double frees as well, perhaps related. Anyone know how to report this to the boffins that write the software? Moderators? https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=92847#92847 says this thread is for problem reports, hence I posted it here :-) |
rlpm Send message Joined: 23 Mar 20 Posts: 13 Credit: 84 RAC: 0 |
Yep, good call. I'm also wondering if anyone on this thread has access to the code, or knows anyone who does and can hunt down this bug. |
Keith Myers Send message Joined: 29 Mar 20 Posts: 95 Credit: 303,014 RAC: 278 |
Yep, good call. I'm also wondering if anyone on this thread has access to the code or know anyone who does and can hunt down this bug. This thread seems to have the best explanation of the errors: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13658 It seems they are not parsing the CPU features correctly and are attempting to run instructions that the CPU does not support. |
vowelmarauder Send message Joined: 22 Mar 20 Posts: 2 Credit: 2,114,237 RAC: 0 |
I just noticed that my tasks are taking almost twice as long as the ETA says. The time is either standing still with 1-2 seconds either way or counting *up*... I don't think I've tinkered with any settings and boinc is using all its cores fully. Is this normal? What's going on? https://i.imgur.com/3uwyfAU.jpg |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
We'll have to see one of 'em report back in to see for sure, but it sounds like you may have changed the Preference for the workunit runtime from the 8 hour default up to 12 or 24 hours. The watchdog will keep an eye on them for you if they run too long. I suggest letting them run to completion. Rosetta Moderator: Mod.Sense |
aad Send message Joined: 5 Jan 06 Posts: 9 Credit: 194,194,532 RAC: 279 |
So what's up with the credits for the task? I read about it somewhere here, I think, but can't find it. I see the same with my machines. It's only with the COVID WUs. Maybe it's a virus ;-)) I'm still running them though..... |
robertmiles Send message Joined: 16 Jun 08 Posts: 1230 Credit: 14,175,352 RAC: 824 |
I just noticed that my tasks are taking almost twice as long as the ETA says. The time is either standing still with 1-2 seconds either way or counting *up*... I don't think I've tinkered with any settings and boinc is using all its cores fully. Is this normal? What's going on? You are a rather new user here. I've noticed that for each new version of any of the applications, about the first ten tasks on a computer using that version are likely to show a large mismatch between the time the task is expected to run and the time it actually runs. If all of your computers were connected since the last version change of each application, all of the versions in use are either new to your computers or recently have been. If the actual time is much larger than the initial expected time, it is normal for the expected time to completion to go up instead of down. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1639 Credit: 16,802,283 RAC: 10,275 |
We'll have to see one of 'em report back in to see for sure, but it sounds like you may have changed the Preference for the workunit runtime from the 8 hour default up to 12 or 24 hours. The watchdog will keep an eye on them for you if they run too long. I suggest letting them run to completion. I've got the same thing occurring with my present Rosetta Mini v3.78 Tasks. I checked my preferences, and "Target CPU run time" is still "not selected." The current group of Tasks have been going for 12hr 20min with 3hr 45min estimated time to completion. Has the project's default "Target CPU run time" been changed with the newly released applications? (The very few Rosetta v4.12 windows_x86_64 & Rosetta v4.12 windows_intelx86 Tasks I managed to pick up and process ran for the desired 8hrs. This seems to be affecting just the Rosetta Mini 3.78 applications; previous work I did with these applications ran to Target time OK.) Grant Darwin NT |
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
I've run a few 4.12 tasks overnight (around 80). First impressions are that it runs each decoy approx 4 times as long. My preference is set for an hour, and my threadripper is quite quick, but it's taking between 3-4 hours to run one decoy if it starts with rb; under 4.07 I would hit 1 decoy in under an hour 99% of the time. The design task has run at the same speed but only given 2 points credit per decoy. 4.12 is not looking productive from my end as an end user. It would take my slower processors in laptops etc. nearly 8 hours or more for 1 decoy, going by results so far. I've just been sent a further 60 tasks with the rb prefix, so I'll see how they run. These are a different batch. |
pritpalb Send message Joined: 21 Mar 20 Posts: 2 Credit: 767,576 RAC: 0 |
https://imgur.com/a/gr9UJlr I have been getting continuous "Scheduler request failed:HTTP gateway timeout" errors for the last week. This is on my Windows 10 machine, while funnily enough my home iMac is happily crunching and reporting tasks. I have tried clicking 'update' under the projects tab and 'reset', and I have even removed the project and BOINC and reinstalled, but the error persists. I am also seeing more 'computation error' entries under the tasks tab. I know the scheduler is getting hammered, but that doesn't explain why my other machine is working well through all this. Normal web browsing, resetting the project and downloading tasks all work fine on the Windows 10 machine; it only errors when reporting or requesting new tasks. Any ideas? |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1639 Credit: 16,802,283 RAC: 10,275 |
I've just been sent a further 60 tasks with rb prefix so ill see how they run. these are a different batch. No new work here for over 12hrs now. My Rosetta Mini Tasks are on target for running twice as long as the Target time (16hrs instead of the default 8hrs). Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1639 Credit: 16,802,283 RAC: 10,275 |
https://imgur.com/a/gr9UJlr Are you running any 3rd party AV/ Anti-malware software? It wouldn't be the first time such a programme has taken exception to BOINC and the programmes that make use of it. Grant Darwin NT |
pritpalb Send message Joined: 21 Mar 20 Posts: 2 Credit: 767,576 RAC: 0 |
Are you running any 3rd party AV/ Anti-malware software? It wouldn't be the first time such a programme has taken exception to BOINC and the programmes that make use of it. Good thought. But wouldn't it be unusual to allow some communication on port 80 yet block reporting and requesting of new tasks? I am running Webroot, but it didn't cause a problem 2 weeks ago when I first started working on Rosetta@home. This timeout only started 1 week ago. I managed about 20,000 credit before the PC stopped reporting. The AV software is centrally managed by the sysadmin, so I can't place an exclusion on BOINC. ¯_(ツ)_/¯ |
Stephen "Heretic" Send message Joined: 2 Apr 20 Posts: 21 Credit: 11,028 RAC: 0 |
Hello, I have just joined this project but it seems there is no work to do at the moment. Is this a common state of affairs or have I struck a bad moment to join? Stephen ? |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2086 Credit: 40,627,544 RAC: 4,020 |
I've run a few 4.12 tasks overnight (around 80), first impressions are that it runs each decoy approx 4* as long, my preference is set for an hour, and my threadripper is quite quick, but it's taking between 3-4 hours to run one decoy if it starts with rb, under 4.07 I would hit 1 decoy in under an hour 99% of the time. The design task has run the same speed but only given 2 points credit per decoy. You've just run 80 tasks overnight and received 60 more. You have 2500 tasks available to run on your threadripper, which has 64 cores and 132Gb RAM, running 1hr tasks. What is it about 4.12 that isn't looking productive from your pov? Because after several days with very few tasks available at all, I'd kill for any of the problems you're currently having. Quite astonishing. |
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
I've run a few 4.12 tasks overnight (around 80), first impressions are that it runs each decoy approx 4* as long, my preference is set for an hour, and my threadripper is quite quick, but it's taking between 3-4 hours to run one decoy if it starts with rb, under 4.07 I would hit 1 decoy in under an hour 99% of the time. The design task has run the same speed but only given 2 points credit per decoy. The tasks-in-progress figure is incorrect. I reset the project twice this week due to multiple downloads failing, so those tasks aren't really there, as discussed in a different thread. I'm saying it doesn't look productive because the decoys are taking approximately 4 to 6 times longer to process. If you watch the graphics, it gets to a certain number of steps and then almost stops, taking 30-60 minutes for each additional step. Last night, before I went to bed, half of them stopped at step 24600, then took 30 mins to do step 24601, etc. So that's what I mean: it appears to be taking 4-6 times longer to process the same work. The latest batch, which are rb 04 01 20235 19963 ab t000 robetta cstwt..., are currently at 2 hours 49 minutes, 56% through the first decoy. Looks like 5hrs to run. 4.07 was running very similar tasks in under an hour. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1639 Credit: 16,802,283 RAC: 10,275 |
Hello, I have just joined this project but it seems there is no work to do at the moment. Is this a common state of affairs or have I struck a bad moment to join? Work being done has increased by 500% over the last 2 and a bit weeks, so there's not much work available as demand is far exceeding supply. More work is meant to be coming, but apparently it takes quite a while to prepare it for release, so it will take a while before work production comes close to matching the present demand. Grant Darwin NT |
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
I can give a little further info also: my CPU is currently 99% utilised. BOINC is running 60 cores, 2 are running GPUs for folding, and 2 are spare for overhead. Normally, when BOINC is running with all cores loaded, the clock speed is approx 3.2GHz, and it will pull as many watts as I let it (doubling the power with an overclock only gets me to 3.55GHz). At the moment it's pulling 15% less power, and the clock speed is up at 4.2GHz for all cores. If each core was being run hard it would be impossible for it to run at this speed. This is the speed it normally runs at with, say, 3 or 4 cores loaded. IMO 4.12 is not making use of the CPU properly; it's taking 4-6 times longer to complete a decoy, which ties in with the fact that my CPU is running at a very high clock speed, which indicates the cores are doing very little work. |
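
For anyone wanting to sanity-check a runtime guess like the "2 hours 49, 56% on first decoy, looks like 5hrs" figure quoted in the thread, here is a minimal sketch of a plain linear extrapolation. It assumes progress is roughly linear in time, which the posts above suggest is not always true for these tasks. The function name `estimate_total_minutes` is made up for illustration, and this is not how BOINC or Rosetta calculates its own estimates.

```python
def estimate_total_minutes(elapsed_minutes: float, fraction_done: float) -> float:
    """Linearly extrapolate total runtime from elapsed time and progress.

    Hypothetical helper for illustration only; it assumes progress is
    linear in time and is not BOINC's or Rosetta's own estimator.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_minutes / fraction_done


# Figures from the thread: 2 hours 49 minutes elapsed at 56% of the first decoy.
elapsed = 2 * 60 + 49
total = estimate_total_minutes(elapsed, 0.56)
remaining = total - elapsed
print(f"projected total: {total / 60:.1f} h, remaining: {remaining:.0f} min")
```

With those figures the projection comes out at roughly five hours in total, which agrees with the rough estimate given in the post. If a task stalls near the end, as described for the step counter, a linear guess like this will undershoot.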
©2024 University of Washington
https://www.bakerlab.org