Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1652 Credit: 17,225,462 RAC: 22,533 |
And the boinc-process host is down again. Grant Darwin NT |
tgbauer Send message Joined: 5 Jan 06 Posts: 7 Credit: 99,387,903 RAC: 75,781 |
I have a work unit that doesn't seem to be getting as far as the others, and it has an unusually long model (the graphics show a dot with a line that seems to go on into infinity). Other tasks are running as expected.
This is stderr.txt command: rosetta_4.20_x86_64-apple-darwin -run:protocol jd2_scripting @flags_rb_09_09_632102_625918__t000__0_C1_robetta -silent_gz -mute all -out:file:silent default.out -in:file:boinc_wu_zip input_rb_09_09_632102_625918__t000__0_C1_robetta.zip -frag_weight_aligned 0.5 -max_registry_shift 4 -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3499362 Using database: database_357d5d93529_n_methyl/minirosetta_database error: zipfile probably corrupt (segmentation violation) error: zipfile probably corrupt (illegal instruction) BOINC:: CPU time: 64841.5s, 36000s + 28800s[2024-10-21 22:25: 9:] :: BOINC Output exists: default.out.gz Size: WARNING! cannot get file size for default.out.gz: could not open file. -1 InternalDecoyCount: 0 (GZ) ----- 0 ----- Stream information inconsistent. Writing W_0000001 error: zipfile probably corrupt (segmentation violation) |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2097 Credit: 40,822,968 RAC: 13,021 |
"Have a work unit that doesn't seem to be getting as far as others, and has an unusually long model (the graphics shows a dot with a line that seems to go on into infinity)" It's probably already errored out by now, but with all those errors and running over 2.5 days without starting, you should abort it if it's still going. It hasn't started, let alone stood any chance of finishing. Let your core have something more productive to run. |
tgbauer Send message Joined: 5 Jan 06 Posts: 7 Credit: 99,387,903 RAC: 75,781 |
Fortunately this seems to be a one-off, and other tasks are processing as expected. Restarting the BOINC client caused it to realize it needed to error out this task. Maybe at some point the BOINC client will recognize similar errors (for any project), so that a restart or manual abort isn't needed. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1652 Credit: 17,225,462 RAC: 22,533 |
"And the boinc-process host is down again." Still dead, so still no work being Validated. Grant Darwin NT |
tgbauer Send message Joined: 5 Jan 06 Posts: 7 Credit: 99,387,903 RAC: 75,781 |
Looks like Application "Rosetta Beta 6.06" tasks are using 2.5GB of RAM each! That becomes a bit inefficient when you have 128 cores in a computer and 128GB RAM (only 46/128 cores used). The ones before that and "Rosetta 4.20" consume less than 0.5GB (and all 128 cores are used). The recent beta 6.06 tasks are now using less than 1GB (600MB compressed). Thank you for fixing the RAM size! Now I'm able to use all cores again. |
Bill Swisher Send message Joined: 10 Jun 13 Posts: 32 Credit: 31,619,624 RAC: 35,124 |
It appears that they (whoever they are) have resolved the massive memory gobbling. Do you think I would be wise to remove the limitation on the beta runs? I currently have it limited to only 6 per computer. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2097 Credit: 40,822,968 RAC: 13,021 |
I think so. It's possible it ran short of RAM as some tasks are demanding high amounts recently, but better to think of it as a one-off and just move on. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2097 Credit: 40,822,968 RAC: 13,021 |
"And the boinc-process host is down again. Still dead, so still no work being Validated." It came back about 8 hours ago. Everything is nearly cleared down now. Some tasks became available, but they have all been gobbled up again. All very hand-to-mouth. |
Matthew Tireman Send message Joined: 24 Mar 20 Posts: 6 Credit: 387,215 RAC: 11,148 |
:/ |
Matthew Tireman Send message Joined: 24 Mar 20 Posts: 6 Credit: 387,215 RAC: 11,148 |
One of my systems (Phenom II X6 1065T) fails all Rosetta Beta 6 tasks yet is fine with Rosetta 4 tasks. It fails the tasks almost immediately. I've reinstalled BOINC, enabled virtualization, and reinstalled VirtualBox twice. If this isn't solvable, is it possible to disable Rosetta 6 beta tasks specifically on this machine? |
robertmiles Send message Joined: 16 Jun 08 Posts: 1231 Credit: 14,219,712 RAC: 3,297 |
"One of my systems (phenom ii x6 1065t) fails all Rosetta BETA 6 tasks yet is fine with Rosetta 4 tasks." I tried to look up which of your systems that is in order to see if I could help. The information I found by clicking on your author name did not include the system type (phenom ii), only items like the CPU and GPU types, so I couldn't help. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2097 Credit: 40,822,968 RAC: 13,021 |
"One of my systems (phenom ii x6 1065t) fails all Rosetta BETA 6 tasks yet is fine with Rosetta 4 tasks." It looks to be this one. I can't help either. Some tasks crashed with their wingman too, but others completed fully and successfully. The only thing I might ask about is if that PC is overclocked or old and maybe overheating. Might it need a clean-out of dust from fans and vents in order to run cooler? Can't do any harm. But I'm guessing - I have no idea what's wrong. And there's no way to disable Rosetta Beta tasks only. If Matthew doesn't mind the wasted bandwidth, let them crash out in a few seconds and someone else will have a go at them while he moves on with other tasks that do run successfully. |
Bill Swisher Send message Joined: 10 Jun 13 Posts: 32 Credit: 31,619,624 RAC: 35,124 |
Ahh...but there is! At least under Linux. Thanks to the beta jobs asking for 2+GB of memory, I took the hint(s) and restricted them. But they've fixed that problem, so it turned into a "learning experience" and I'm limiting the number of Einstein@Home jobs now. Details available via private message if anyone is interested in how. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1652 Credit: 17,225,462 RAC: 22,533 |
The only thing to try that comes to mind is to reset the Project. If one of the data files needed for Beta Tasks has become corrupted, that can cause the problem you're experiencing. Resetting the project will release all downloaded work, and clear out all existing application & database files & re-download them from the project from scratch. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1652 Credit: 17,225,462 RAC: 22,533 |
boinc-process host has died yet again... Grant Darwin NT |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1985 Credit: 9,362,147 RAC: 8,863 |
"boinc-process host has died yet again..." I missed it a little |
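A note on the stderr.txt in tgbauer's post: it reports "CPU time: 64841.5s, 36000s + 28800s", and those two numbers match the `-boinc::cpu_run_timeout` and `-cpu_run_time` flags on the command line. A minimal sketch of that arithmetic follows, assuming (this is inferred from the log line, not confirmed anywhere in the thread) that the watchdog limit is simply the sum of the two values. The `parse_flags` helper is hypothetical and only understands the simple `-flag value` layout seen in the post.

```python
# Excerpt of the command line quoted in the post (flags copied verbatim).
command = (
    "-nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 "
    "-checkpoint_interval 120 -boinc::watchdog -boinc::cpu_run_timeout 36000"
)


def parse_flags(command_line: str) -> dict:
    """Collect '-flag value' pairs; a flag with no value is stored as True."""
    tokens = command_line.split()
    flags = {}
    for i, token in enumerate(tokens):
        if not token.startswith("-"):
            continue  # this token is a value, handled with its flag
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        has_value = following is not None and not following.startswith("-")
        flags[token.lstrip("-")] = following if has_value else True
    return flags


flags = parse_flags(command)

# Assumption: watchdog limit = target run time + extra timeout.
limit = int(flags["cpu_run_time"]) + int(flags["boinc::cpu_run_timeout"])

reported_cpu_time = 64841.5  # from "BOINC:: CPU time: 64841.5s"
overrun = reported_cpu_time - limit

print(f"watchdog limit: {limit}s, overrun: {overrun}s")
```

Under that assumption the task was stopped roughly 40 seconds past an 18-hour limit, which fits the watchdog ending it rather than the task finishing on its own.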
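On tgbauer's RAM figures (2.5GB per Beta 6.06 task, 128 cores, 128GB RAM, only 46 cores busy): a rough sketch of how a per-task memory footprint can cap the number of running tasks. The 90% usable-memory fraction is an assumption about the client's memory preference, not something stated in the thread, and the function name is made up for illustration.

```python
import math


def max_concurrent_tasks(cores: int, ram_gb: float, gb_per_task: float,
                         ram_fraction: float = 0.9) -> int:
    """Tasks that fit, limited by both core count and usable RAM.

    ram_fraction is an assumed share of RAM the client is allowed to use.
    """
    usable_gb = ram_gb * ram_fraction
    fit_in_ram = math.floor(usable_gb / gb_per_task)
    return min(cores, fit_in_ram)


beta_tasks = max_concurrent_tasks(cores=128, ram_gb=128, gb_per_task=2.5)
light_tasks = max_concurrent_tasks(cores=128, ram_gb=128, gb_per_task=0.5)

print(beta_tasks, light_tasks)
```

With these inputs the 2.5GB case lands on 46, the same number reported in the post, though that match may be a coincidence of the assumed fraction. At 0.5GB per task the core count becomes the limit instead.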
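On limiting how many tasks of one application run at once, as Bill Swisher describes: he doesn't say how he did it, so the following is not his method. One documented BOINC mechanism is an `app_config.xml` file in the project's folder inside the BOINC data directory, with a `max_concurrent` entry per application. A minimal sketch that generates such a file is below. The app name `rosetta_beta` is a placeholder assumption (check the real short name in your client's `client_state.xml`), and the limit of 6 just mirrors the figure mentioned in the thread.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting one app to max_concurrent tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# "rosetta_beta" is a hypothetical app name used only for illustration.
config_text = build_app_config("rosetta_beta", 6)
print(config_text)
```

The resulting text would be saved as `app_config.xml` in the project's directory, after which the client needs to re-read its config files (or be restarted) for the limit to apply.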
©2024 University of Washington
https://www.bakerlab.org