Message boards : Number crunching : minirosetta 2.05
Author | Message |
---|---|
ofry Joined: 21 Jan 10 Posts: 4 Credit: 164,475 RAC: 0 |
Hello :) I have an "anti-meatball" with energy of e.g. 2K, 16K, etc. Sample screenshot: This bug might occur in tasks such as "boinc_filtered_loopbuild_threading_". These tasks usually take e.g. 5 hours, even though "Target CPU run time" is not selected in my preferences (default 3 hours). P.S. Sorry for my English, I speak Russian. |
Admin Joined: 13 Apr 07 Posts: 42 Credit: 260,782 RAC: 0 |
Just thought I'd add to the post above mine. I can also confirm energy levels for t311 (same WU) have been sky high, with some values like 76053423 and RMSD running around 700. Thought it was strange, so I'd let you guys know. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
To help clarify, ofry has an anti-meatball. I don't see any problem in their screenshot though. Admin, ofry's screenshot is of protein t374, so if you are doing t311, then it is a different protein... although perhaps using the same methods to study it. Admin, how long would you see such high numbers? I'd think they'd settle down pretty quickly. I don't believe these are Sarel's new ones, so you can see why he's working on the approach that makes that initial 60-100 second survey of a given model and then moves on to something more promising much of the time. These proteins are very large, so when they are out of position and perhaps are nowhere near the natural conformation, the numbers can get pretty high... but 76m!? Rosetta Moderator: Mod.Sense |
Mad_Max Joined: 31 Dec 09 Posts: 209 Credit: 25,846,908 RAC: 11,988 |
Hello :) I haven't seen ones like this before (whereas a protein crumpling into a ball is something I see fairly often). In Russian I would call this an "explosion at a pasta factory" :) It is not necessarily a problem, though. It may just be one of the early stages of the simulation, since the simulation starts from a protein stretched out into one long "rope". Also, unlike in folding@home, the intermediate stages of the simulation are not exact (following how it happens in nature) but approximate and quite chaotic. So the intermediate shapes can be very bizarre and far from the original. This comes from the different goals of the projects: in Folding, the scientists want to know HOW a protein folds from a chain into its natural shape/structure. In Rosetta, the goal is to determine only the final spatial structure of the protein (or the interaction of 2 proteins) from its known "amino acid formula", but to do so orders of magnitude (tens to hundreds of times) faster than Folding, with its "brute-force" simulation (at the level of individual atoms, with a step on the order of 1 picosecond). The "ball" (meatball), however, is a problem: it looks like there is some error there, where the simulation overshoots the natural shape and simply starts crumpling the protein into a ball, moving further and further from the original (rather than closer to it). |
ofry Joined: 21 Jan 10 Posts: 4 Credit: 164,475 RAC: 0 |
Mad_Max, thanks for the translation. Now in Russian :) The same problem occurs on t303 as well (but only on boinc_filtered_loopbuild_threading_ task types; other types don't have it). I just didn't post similar screenshots. "Translation": [quote] Admin, ofry's screenshot is of protein t374, so if you are doing t311, then it is a different protein... although perhaps using the same methods to study it. [/quote] This problem occurs in many proteins, e.g. t303 too. But with other methods (not boinc_filtered_loopbuild_threading_) I don't see this problem. Maybe this method is just bugged. |
ofry Joined: 21 Jan 10 Posts: 4 Credit: 164,475 RAC: 0 |
https://boinc.bakerlab.org/rosetta/rah_results.php?BatchID=16901&SubBatchName=t364__boinc_filtered_loopbuild_threading_cst_relax_tex_&UserID=367430 # t364__boinc_filtered_loopbuild_threading_cst_relax_tex_ 5.126 3.418 5627.5 -384.673 17 14390 2010-01-25 My "best score" = 5627.5! |
Admin Joined: 13 Apr 07 Posts: 42 Credit: 260,782 RAC: 0 |
Mod, I checked it last night, and it went through fine this morning, but the values were definitely very high, either 7.6 million or 760k. The protein wasn't even in the window, so I know it was a high value. It doesn't seem to be occurring with any other such WU right now, but I'll keep an eye out. |
Mike Tyka Joined: 20 Oct 05 Posts: 96 Credit: 2,190 RAC: 0 |
If it's the high energy of 4K you're worried about - that's not unusual when runs are submitted with constraints - looks all good to me .. Mike http://beautifulproteins.blogspot.com/ http://www.miketyka.com/ |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
Each of these cl1 WUs gave an error for both crunchers: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=285818853 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=285786792 |
svincent Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
A couple of tasks that failed early on Mac OS X 10.6 with the same error SIGPIPE: write on a pipe with no reader 0 0x006e2839 SIGPIPE: write on a pipe with no reader cl1.1cc8.1cc8.IGNORE_THE_REST.c.0.20.pdb.pdb.JOB_17236_2_0 cl1.1s12.1s12.IGNORE_THE_REST.c.3.0.pdb.pdb.JOB_17313_1_0 |
Evan Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
Add me to the list with these two that failed within 17 seconds cl1.1enh.1enh.IGNORE_THE_REST.c.2.32.pdb.pdb.JOB_17243_1 cl1.1enh.1enh.IGNORE_THE_REST.c.2.21.pdb.pdb.JOB_17243_1_0 |
Admin Joined: 13 Apr 07 Posts: 42 Credit: 260,782 RAC: 0 |
Compute error occurred - Exit status -1073741819 (0xc0000005). Debug info is far too advanced for me to get any info from, so a team member will need to look at it. Occurred with cl1.2cmx.2cmx.IGNORE_THE_REST.c.0.25.pdb.pdb.JOB_17322_1_1. Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x006D2D46 read attempt to address 0x00000000 Link: https://boinc.bakerlab.org/rosetta/result.php?resultid=313485467. Wingman also received compute error with same WU. |
ofry Joined: 21 Jan 10 Posts: 4 Credit: 164,475 RAC: 0 |
Compute errors in these WUs: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=285897731 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=286008240 ("Access Violation" type; errors after 15-42 sec. of CPU time) |
Admin Joined: 13 Apr 07 Posts: 42 Credit: 260,782 RAC: 0 |
Unhandled Exception Error - cl1.1ail.1ail.IGNORE_THE_REST.c.4.22.pdb.pdb.JOB_17227_9_1 https://boinc.bakerlab.org/rosetta/result.php?resultid=313773510 - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x006D2D46 read attempt to address 0x00000000 Debug Info in link.. failed at 21 seconds |
Mike_Solo Joined: 16 Nov 09 Posts: 2 Credit: 67,261 RAC: 0 |
Looks like Mike Solo has 3 machines: Yes, sorry, I missed the OS info. I introduced some Linux machines (Debian) instead of MS Win. All looks stable under Linux. |
l_mckeon Joined: 5 Jun 07 Posts: 44 Credit: 180,717 RAC: 0 |
"This app update includes a fix for checkpointing. Please report issues and bugs here!" I had a task yesterday that restarted from model one when the computer was switched off then on again. The task had been saving checkpoints. The upload was >200kB and the task ended a few minutes after restarting. I don't know the task number but I'm fairly sure it was an lr5 task. |
Admin Joined: 13 Apr 07 Posts: 42 Credit: 260,782 RAC: 0 |
Unhandled Exception Error - cl1.1ail.1ail.IGNORE_THE_REST.c.6.3.pdb.pdb.JOB_17227_4_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=313910991 WU froze at 61% complete. I found it and had to abort it after 10 hours of run time. - Unhandled Exception Record - Reason: Breakpoint Encountered (0x80000003) at address 0x75E31AF3 Debug info in link as usual |
SFCC Joined: 3 Sep 09 Posts: 10 Credit: 227,659 RAC: 0 |
I am running Rosetta@home (along with some other projects) on four machines, three running XP and one running Win7. Two of the machines have AMD 64 single-core processors and two have AMD 64 dual-core processors. The Rosetta app is 'rosetta mini 2.05' and the BOINC version is 6.10.18. About once a week or so I notice that the Rosetta WU running on at least one of the machines has been running longer than usual and using little or no CPU time. I abort it, and the next WU runs fine. I have not seen this occur with any of my other projects (Climate Prediction, Malaria Control or World Community Grid). |
svincent Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
[quote] Try suspending/resuming, or even entirely turn-off/restart BOINC, before aborting the WU. I have found this works often, not always, for me. [/quote] Even if this works, it shouldn't be necessary to babysit BOINC/Rosetta in this way. This hanging certainly seems to be a widespread issue but one that only affects Windows in its various incarnations. The fact that it's irreproducible means a fix may be some time in coming but I hope the project team find it soon. |
Mad_Max Joined: 31 Dec 09 Posts: 209 Credit: 25,846,908 RAC: 11,988 |
Some tasks (for example, type abinitio_relax_homfrag_natfrag ....) take a lot of steps, up to 200000 - 400000 for 1 model. Is this normal? And on another type of job (critStubs_profiled_1dnA_...), one task got abnormally low credit (1.97 cr and 9 decoys for 7k CPU seconds): https://boinc.bakerlab.org/rosetta/result.php?resultid=314040179 Other WUs of the same type were credited normally (about ~20 cr and ~80 decoys for the same CPU time). Is this a bug with partial loss of the calculation results? Or is crediting just uneven for this type of task? (As in the *gbnnotyr* task type, which combines "small" and "huge" models in the same type of task.) |
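
One quick way to look at the credit question in the post above is to compare credit per decoy rather than credit per task. The sketch below is only illustrative arithmetic on the figures quoted in that post. The "typical" values are the poster's rough estimates, the variable and function names are made up for this example, and nothing here reflects how the project actually grants credit.

```python
# Figures quoted in the post; the "typical" task values are rough estimates
# ("about ~20 cr and ~80 decoys"), so treat the result as approximate.
low_task = {"credit": 1.97, "decoys": 9}
typical_task = {"credit": 20.0, "decoys": 80}


def credit_per_decoy(task):
    """Return granted credit divided by the number of decoys produced."""
    return task["credit"] / task["decoys"]


low_rate = credit_per_decoy(low_task)
typical_rate = credit_per_decoy(typical_task)
ratio = low_rate / typical_rate

print(f"low-credit task: {low_rate:.3f} credits per decoy")
print(f"typical task:    {typical_rate:.3f} credits per decoy")
print(f"ratio:           {ratio:.2f}")
```

On these numbers the two tasks come out at a similar credit per decoy, so the low total seems to track the low decoy count (9 instead of about 80) rather than a lower rate per decoy. That does not explain why so few decoys were produced in the same CPU time, which is the part only the project team could answer.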
©2024 University of Washington
https://www.bakerlab.org