minirosetta 2.05

Message boards : Number crunching : minirosetta 2.05

ofry
Joined: 21 Jan 10
Posts: 4
Credit: 164,475
RAC: 0
Message 65103 - Posted: 25 Jan 2010, 16:17:19 UTC

Hello :)

I have an "anti-meatball" with energy values such as 2K, 16K, etc.
Sample screenshot:


This bug may occur in tasks such as "boinc_filtered_loopbuild_threading_".

And these tasks usually take about 5 hours, even though "Target CPU run time" in my preferences is not set (default is 3 hours).

P.S. Sorry for my English, I speak Russian.

Admin
Joined: 13 Apr 07
Posts: 42
Credit: 260,782
RAC: 0
Message 65104 - Posted: 25 Jan 2010, 16:29:02 UTC

Just thought I'd add to the post above mine. I can also confirm energy levels for t311 (same WU) have been sky high, with some values like 76053423 and RMSD running around 700. Thought it was strange, so I'd let you guys know.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 65106 - Posted: 25 Jan 2010, 17:08:09 UTC

To help clarify, ofry has an anti-meatball. I don't see any problem in their screenshot though.

Admin, ofry's screenshot is of protein t374, so if you are doing t311, then it is a different protein... although perhaps using the same methods to study it.

Admin, how long would you see such high numbers? I'd think they'd settle down pretty quickly.

I don't believe these are Sarel's new ones, so you can see why he's working on the approach that makes that initial 60-100 second survey of a given model and then moves on to something more promising much of the time.

These proteins are very large, so when they are out of position and perhaps are nowhere near the natural conformation, the numbers can get pretty high... but 76m!?
Rosetta Moderator: Mod.Sense

Mad_Max
Joined: 31 Dec 09
Posts: 209
Credit: 25,846,908
RAC: 11,988
Message 65108 - Posted: 25 Jan 2010, 17:47:16 UTC - in response to Message 65103.  

[quote]
Hello :)

I have an "anti-meatball" with energy values such as 2K, 16K, etc.
Sample screenshot:
This bug may occur in tasks such as "boinc_filtered_loopbuild_threading_".

And these tasks usually take about 5 hours, even though "Target CPU run time" in my preferences is not set (default is 3 hours).
P.S. Sorry for my English, I speak Russian.
[/quote]


I haven't seen ones like this before (whereas the protein crumpling into a ball is something I see fairly often).
In Russian I'd call this an "explosion at a pasta factory" :)
Anyway, it's not certain that this is a problem. It may simply be one of the early stages of the simulation, since the simulation starts out with the protein stretched into one long "string". And unlike Folding@home, the intermediate stages of the simulation don't proceed exactly (following how it happens in nature) but approximately and quite chaotically. So the intermediate shapes can be quite bizarre and far from the original.
This comes down to the different goals of the two projects: in Folding@home the scientists want to know HOW a protein folds from a chain into its natural shape/structure. In Rosetta the goal is only to determine the final spatial structure of a protein (or the interaction of two proteins) from its known "amino acid formula", but to do so orders of magnitude (tens or hundreds of times) faster than Folding@home with its brute-force simulation (at the level of individual atoms, with a step on the order of 1 picosecond).

The "ball" (meatball), on the other hand, is a problem, since there seems to be some kind of bug there: the simulation overshoots the natural shape and just starts crumpling the protein into a ball, moving further and further from the original (rather than getting closer to it).

ofry
Joined: 21 Jan 10
Posts: 4
Credit: 164,475
RAC: 0
Message 65109 - Posted: 25 Jan 2010, 17:58:21 UTC

Mad_Max, thanks for the translation.

Now in Russian :)

The same problem shows up on t303 too (but only with the boinc_filtered_loopbuild_threading_ task types; the others don't have it). I just didn't post similar screenshots.

"Translation":

[quote]
Admin, ofry's screenshot is of protein t374, so if you are doing t311, then it is a different protein... although perhaps using the same methods to study it.
[/quote]

This problem occurs in many proteins, e.g. t303 too. But with other methods (not boinc_filtered_loopbuild_threading_) I don't see this problem. Maybe this method is bugged too.

ofry
Joined: 21 Jan 10
Posts: 4
Credit: 164,475
RAC: 0
Message 65110 - Posted: 25 Jan 2010, 18:09:50 UTC

https://boinc.bakerlab.org/rosetta/rah_results.php?BatchID=16901&SubBatchName=t364__boinc_filtered_loopbuild_threading_cst_relax_tex_&UserID=367430

# t364__boinc_filtered_loopbuild_threading_cst_relax_tex_
5.126 3.418 5627.5 -384.673 17 14390 2010-01-25

My "best score" = 5627.5!

Admin
Joined: 13 Apr 07
Posts: 42
Credit: 260,782
RAC: 0
Message 65111 - Posted: 25 Jan 2010, 18:14:17 UTC

Mod,

I checked it last night, and it went through fine this morning, but the values were definitely very high, either 7.6 million or 760k. The protein wasn't even in the window, so I know it was a high value. It doesn't seem to be occurring with any other such WU right now, but I'll keep an eye out.

Mike Tyka
Joined: 20 Oct 05
Posts: 96
Credit: 2,190
RAC: 0
Message 65113 - Posted: 26 Jan 2010, 0:36:08 UTC - in response to Message 65111.  

If it's the high energy of 4K you're worried about - that's not unusual when runs are submitted with constraints - looks all good to me ..

Mike

http://beautifulproteins.blogspot.com/
http://www.miketyka.com/

AMD_is_logical
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 65118 - Posted: 26 Jan 2010, 16:40:16 UTC

svincent
Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 65121 - Posted: 26 Jan 2010, 17:44:37 UTC

A couple of tasks that failed early on Mac OS X 10.6 with the same error

SIGPIPE: write on a pipe with no reader
0 0x006e2839 SIGPIPE: write on a pipe with no reader


cl1.1cc8.1cc8.IGNORE_THE_REST.c.0.20.pdb.pdb.JOB_17236_2_0
cl1.1s12.1s12.IGNORE_THE_REST.c.3.0.pdb.pdb.JOB_17313_1_0

Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 65122 - Posted: 26 Jan 2010, 23:17:21 UTC

Admin
Joined: 13 Apr 07
Posts: 42
Credit: 260,782
RAC: 0
Message 65124 - Posted: 27 Jan 2010, 0:05:13 UTC

Compute error occurred - Exit status -1073741819 (0xc0000005). Debug info is far too advanced for me to get any info from, so a team member will need to look at it. Occurred with cl1.2cmx.2cmx.IGNORE_THE_REST.c.0.25.pdb.pdb.JOB_17322_1_1.

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x006D2D46 read attempt to address 0x00000000

Link: https://boinc.bakerlab.org/rosetta/result.php?resultid=313485467.

Wingman also received compute error with same WU.
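
A side note on the two numbers in the exit status above: they are the same 32-bit value written two ways. The sketch below is only an illustration (the helper name `to_unsigned32` is made up here, not part of BOINC or Rosetta); it reinterprets the signed decimal exit status as an unsigned 32-bit number, which gives the hexadecimal code shown in the exception record.

```python
def to_unsigned32(status: int) -> int:
    """Reinterpret a signed 32-bit exit status as an unsigned 32-bit value."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which maps a negative
    # two's-complement value onto its unsigned equivalent.
    return status & 0xFFFFFFFF


exit_status = -1073741819  # decimal value quoted in the post above
print(hex(to_unsigned32(exit_status)))  # prints 0xc0000005
```

So "-1073741819" and "0xc0000005" in the report refer to the same access violation code, not to two separate errors.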

ofry
Joined: 21 Jan 10
Posts: 4
Credit: 164,475
RAC: 0
Message 65131 - Posted: 27 Jan 2010, 20:03:11 UTC
Last modified: 27 Jan 2010, 20:05:28 UTC

Compute errors in these WUs:

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=285897731
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=286008240

("Access Violation" type)

(errors in 15-42 sec. CPU time)

Admin
Joined: 13 Apr 07
Posts: 42
Credit: 260,782
RAC: 0
Message 65132 - Posted: 27 Jan 2010, 20:58:43 UTC

Unhandled Exception Error - cl1.1ail.1ail.IGNORE_THE_REST.c.4.22.pdb.pdb.JOB_17227_9_1

https://boinc.bakerlab.org/rosetta/result.php?resultid=313773510

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x006D2D46 read attempt to address 0x00000000

Debug Info in link.. failed at 21 seconds

Mike_Solo
Joined: 16 Nov 09
Posts: 2
Credit: 67,261
RAC: 0
Message 65136 - Posted: 28 Jan 2010, 10:01:56 UTC - in response to Message 65015.  

[quote]
Looks like Mike Solo has 3 machines:
One WinXP using BOINC version 6.10.18
One WinXP using BOINC version 6.10.18
One WinServer 2003 using BOINC version 6.10.18
[/quote]

Yes, sorry, I missed the OS info.
I introduced some Linux machines (Debian) instead of MS Windows.
Everything looks stable under Linux.

l_mckeon
Joined: 5 Jun 07
Posts: 44
Credit: 180,717
RAC: 0
Message 65147 - Posted: 30 Jan 2010, 0:01:05 UTC

"This app update includes a fix for checkpointing.

Please report issues and bugs here!"

I had a task yesterday that restarted from model one when the computer was switched off then on again.

The task had been saving checkpoints.

The upload was >200kB and the task ended a few minutes after restarting. I don't know the task number but I'm fairly sure it was an lr5 task.

Admin
Joined: 13 Apr 07
Posts: 42
Credit: 260,782
RAC: 0
Message 65148 - Posted: 30 Jan 2010, 3:03:01 UTC

Unhandled Exception Error - cl1.1ail.1ail.IGNORE_THE_REST.c.6.3.pdb.pdb.JOB_17227_4_0
https://boinc.bakerlab.org/rosetta/result.php?resultid=313910991

WU froze at 61% complete. I found it and had to abort it after 10 hours of run time.

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x75E31AF3

Debug info in link as usual

SFCC
Joined: 3 Sep 09
Posts: 10
Credit: 227,659
RAC: 0
Message 65149 - Posted: 30 Jan 2010, 4:27:05 UTC

I am running Rosetta@home (along with some other projects) on four machines, three running XP and one running Win7. Two of the machines have AMD 64 single-core processors and two have AMD 64 dual-core processors. The Rosetta app is 'rosetta mini 2.05' and the BOINC version is 6.10.18. About once a week or so I will notice that the Rosetta WU running on at least one of the machines has been running longer than usual and using little or no CPU time. I abort it, and the next WU runs fine. I have not seen this occur with any of my other projects (climate prediction, malaria control or World Community Grid).

svincent
Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 65155 - Posted: 30 Jan 2010, 21:38:29 UTC - in response to Message 65152.  


[quote]
Try suspending/resuming, or even entirely turn-off/restart BOINC, before aborting the WU. I have found this works often, not always, for me.
[/quote]

Even if this works, it shouldn't be necessary to babysit BOINC/Rosetta in this way. This hanging certainly seems to be a widespread issue but one that only affects Windows in its various incarnations. The fact that it's irreproducible means a fix may be some time in coming but I hope the project team find it soon.

Mad_Max
Joined: 31 Dec 09
Posts: 209
Credit: 25,846,908
RAC: 11,988
Message 65157 - Posted: 31 Jan 2010, 1:14:32 UTC
Last modified: 31 Jan 2010, 1:17:39 UTC

Some tasks (for example, of type abinitio_relax_homfrag_natfrag ....) take a lot of steps, up to 200000 - 400000 for 1 model. Is this normal?

And for another type of job (critStubs_profiled_1dnA_...), one task got abnormally small credit (1.97 cr and 9 decoys for 7k CPU seconds): https://boinc.bakerlab.org/rosetta/result.php?resultid=314040179
Meanwhile, other WUs of the same type were credited normally (about ~20 cr and ~80 decoys for the same CPU time).
Is this a bug with partial loss of the calculation results? Or is crediting just uneven for this type of task? (As with the *gbnnotyr* task type, where "small" and "huge" models are combined in the same type of task.)
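
A quick arithmetic check on the figures in this post may help frame the question. The sketch below is just an illustration: the function name is made up, and 20 cr / 80 decoys are the approximate "normal" values quoted above, not exact data. It compares credit per decoy for the odd result and for a typical one.

```python
def credit_per_decoy(credit: float, decoys: int) -> float:
    """Credit granted per decoy (model) returned by a task."""
    return credit / decoys


low = credit_per_decoy(1.97, 9)       # the abnormally small result
typical = credit_per_decoy(20.0, 80)  # approximate figures for the other WUs
print(f"{low:.3f} vs {typical:.3f} credits per decoy")
# prints: 0.219 vs 0.250 credits per decoy
```

On these numbers the credit per decoy is roughly the same in both cases, so the low credit tracks the low decoy count. That does not tell us which explanation is right; it only narrows the question to why 9 decoys were returned instead of about 80 for the same CPU time.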


©2024 University of Washington
https://www.bakerlab.org