exited with zero status but no 'finished' file

svincent

Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 74182 - Posted: 5 Nov 2012, 18:31:33 UTC

Like others, I'm seeing messages in the event log (Mac OS X 10.6.8/BOINC 7.0.31) reporting this error:

exited with zero status but no 'finished' file

It happens on a machine that is on 24/7, so I don't think it's a hibernate/sleep issue.

Sample output


Sat Nov 3 23:17:29 2012 | rosetta@home | Scheduler request completed: got 0 new tasks
Sat Nov 3 23:19:04 2012 | rosetta@home | Finished download of input_hyb_al_02_bench_3slkB_yfsong.zip
Sat Nov 3 23:19:47 2012 | | Suspending network activity - user request
Sun Nov 4 02:53:03 2012 | rosetta@home | Computation for task Ploop4_2_abinitio_design_y465_009_60334_1680_0 finished
Sun Nov 4 02:53:20 2012 | rosetta@home | Starting task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_05_62798_11_0 using minirosetta version 341 in slot 1
Sun Nov 4 02:55:19 2012 | rosetta@home | Task hyb_al_08_bench_3slkB_SAVE_ALL_OUT_IGNORE_THE_REST_60945_2133_0 exited with zero status but no 'finished' file
Sun Nov 4 02:55:19 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sun Nov 4 02:55:19 2012 | rosetta@home | Restarting task hyb_al_08_bench_3slkB_SAVE_ALL_OUT_IGNORE_THE_REST_60945_2133_0 using minirosetta version 341 in slot 0
Sun Nov 4 08:36:36 2012 | rosetta@home | Computation for task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_05_62798_11_0 finished
Sun Nov 4 08:36:48 2012 | rosetta@home | Starting task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_10_03_62798_11_0 using minirosetta version 341 in slot 1
Sun Nov 4 08:40:53 2012 | rosetta@home | Task hyb_al_08_bench_3slkB_SAVE_ALL_OUT_IGNORE_THE_REST_60945_2133_0 exited with zero status but no 'finished' file
Sun Nov 4 08:40:53 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sun Nov 4 08:40:53 2012 | rosetta@home | Restarting task hyb_al_08_bench_3slkB_SAVE_ALL_OUT_IGNORE_THE_REST_60945_2133_0 using minirosetta version 341 in slot 0
Sun Nov 4 08:42:18 2012 | rosetta@home | work fetch suspended by user
Sun Nov 4 08:42:56 2012 | rosetta@home | task hyb_al_08_bench_3slkB_SAVE_ALL_OUT_IGNORE_THE_REST_60945_2133_0 aborted by user
Sun Nov 4 08:42:57 2012 | rosetta@home | Starting task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_07_62798_7_0 using minirosetta version 341 in slot 2
Sun Nov 4 08:43:39 2012 | rosetta@home | Computation for task hyb_al_08_bench_3slkB_SAVE_ALL_OUT_IGNORE_THE_REST_60945_2133_0 finished
Sun Nov 4 08:44:15 2012 | rosetta@home | Task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_07_62798_7_0 exited with zero status but no 'finished' file
Sun Nov 4 08:44:15 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sun Nov 4 08:44:15 2012 | rosetta@home | Restarting task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_07_62798_7_0 using minirosetta version 341 in slot 2
Sun Nov 4 08:44:17 2012 | rosetta@home | Task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_10_03_62798_11_0 exited with zero status but no 'finished' file
Sun Nov 4 08:44:17 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sun Nov 4 08:44:17 2012 | rosetta@home | Restarting task rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_10_03_62798_11_0 using minirosetta version 341 in slot 1
Sun Nov 4 08:44:20 2012 | | Resuming network activity
Sun Nov 4 08:44:20 2012 | rosetta@home | Started upload of Ploop4_2_abinitio_design_y465_009_60334_1680_0_0
Sun Nov 4 08:44:20 2012 | rosetta@home | Started upload of rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_05_62798_11_0_0
Sun Nov 4 08:44:25 2012 | rosetta@home | Finished upload of Ploop4_2_abinitio_design_y465_009_60334_1680_0_0
Sun Nov 4 08:44:27 2012 | rosetta@home | Finished upload of rb_11_03_30323_64727_h001__sp1_IGNORE_THE_REST_08_05_62798_11_0_0
Sun Nov 4 08:44:31 2012 | rosetta@home | Sending scheduler request: To report completed tasks.
Sun Nov 4 08:44:31 2012 | rosetta@home | Reporting 3 completed tasks
Sun Nov 4 08:44:31 2012 | rosetta@home | Not requesting tasks: scheduler RPC backoff
Sun Nov 4 08:44:35 2012 | rosetta@home | Scheduler request completed

ID: 74182 · Rating: 0
Snags

Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 74185 - Posted: 6 Nov 2012, 11:12:51 UTC

Most recently discussed here


Best,
Snags
ID: 74185 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,135,082
RAC: 4,703
Message 74186 - Posted: 6 Nov 2012, 13:01:21 UTC - in response to Message 74185.  

> Most recently discussed here
>
> Best,
> Snags


Svincent was TOLD to start a new thread by Mod Sense.
ID: 74186 · Rating: 0
Snags

Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 74197 - Posted: 7 Nov 2012, 22:25:12 UTC - in response to Message 74186.  

> > Most recently discussed here
> >
> > Best,
> > Snags
>
> Svincent was TOLD to start a new thread by Mod Sense.



?

I assumed Mod.Sense suggested a new thread because svincent originally posted in the "Current issues with 7+ BOINC client" thread. I made a link to a different, slightly older thread (with the exact same title as this thread, "exited with zero status but no 'finished' file") simply because I didn't have time to summarize it.

I will repost the link to the BOINC FAQ Service page, which describes this long-standing (since BOINC 5+) error message and its possible causes and solutions.

If svincent or googloo (from the previous thread) still think it's related to the 7+ BOINC client then it would be most helpful if they post back detailing how they eliminated the other triggers.

Best,
Snags
ID: 74197 · Rating: 0
svincent

Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 74199 - Posted: 7 Nov 2012, 23:53:33 UTC

Sorry: missed that thread.
ID: 74199 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 74203 - Posted: 8 Nov 2012, 4:58:55 UTC

Interesting. As you say, you'd think a machine that's always active wouldn't have such problems. Can you think of any other activity on the machine that might cause all of the tasks to encounter the error at the same time like that? Or is that NOT all of the active tasks?
Rosetta Moderator: Mod.Sense
ID: 74203 · Rating: 0
gazzawazza

Joined: 4 May 07
Posts: 28
Credit: 297,648
RAC: 0
Message 74243 - Posted: 10 Nov 2012, 17:20:40 UTC

hi all

I've just realised that a task I'd been crunching for a while (I'd got probably 2-3 hours elapsed on it and I think about 30% completed) had just randomly restarted itself.

However, I'm not sure which thread to post to on this matter.

My platform is Vista 32-bit.

Here's an extract from my event log from today. I've filtered just the Rosetta entries:

10/11/2012 11:59:15 | | No config file found - using defaults
10/11/2012 11:59:16 | | Starting BOINC client version 7.0.28 for windows_intelx86
10/11/2012 11:59:16 | | log flags: file_xfer, sched_ops, task
10/11/2012 11:59:16 | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
10/11/2012 11:59:16 | | Data directory: C:\ProgramData\BOINC
10/11/2012 11:59:16 | | Running under account Gary
10/11/2012 11:59:16 | | Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz [Family 6 Model 15 Stepping 11]
10/11/2012 11:59:16 | | Processor: 4.00 MB cache
10/11/2012 11:59:16 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 nx lm vmx tm2 pbe
10/11/2012 11:59:16 | | OS: Microsoft Windows Vista: Home Premium x86 Edition, Service Pack 2, (06.00.6002.00)
10/11/2012 11:59:16 | | Memory: 3.12 GB physical, 7.70 GB virtual
10/11/2012 11:59:16 | | Disk: 298.09 GB total, 7.31 GB free
10/11/2012 11:59:16 | | Local time is UTC +0 hours
10/11/2012 11:59:16 | | NVIDIA GPU 0: GeForce GTX 570 (driver version 306.97, CUDA version 5.0, compute capability 2.0, 1280MB, 1169MB available, 1405 GFLOPS peak)
10/11/2012 11:59:16 | | OpenCL: NVIDIA GPU 0: GeForce GTX 570 (driver version 306.97, device version OpenCL 1.1 CUDA, 1280MB, 1169MB available)
10/11/2012 11:59:16 | rosetta@home | URL https://boinc.bakerlab.org/rosetta/; Computer ID 1112807; resource share 100
10/11/2012 11:59:16 | | Reading preferences override file
10/11/2012 11:59:16 | | Preferences:
10/11/2012 11:59:16 | | max memory usage when active: 2398.19MB
10/11/2012 11:59:16 | | max memory usage when idle: 2398.19MB
10/11/2012 11:59:16 | | max disk usage: 5.00GB
10/11/2012 11:59:16 | | don't compute while active
10/11/2012 11:59:16 | | don't use GPU while active
10/11/2012 11:59:16 | | suspend work if non-BOINC CPU load exceeds 50 %
10/11/2012 11:59:16 | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
10/11/2012 11:59:16 | | Not using a proxy
10/11/2012 12:00:12 | rosetta@home | Restarting task hyb_ac_bench_3rdeD_20_SAVE_ALL_OUT_IGNORE_THE_REST_54744_117_0 using minirosetta version 341 in slot 2
10/11/2012 12:00:12 | rosetta@home | Sending scheduler request: To fetch work.
10/11/2012 12:00:12 | rosetta@home | Requesting new tasks for NVIDIA
10/11/2012 12:01:15 | rosetta@home | Scheduler request completed: got 0 new tasks
10/11/2012 12:06:49 | | Project communication failed: attempting access to reference site
10/11/2012 12:06:51 | | Internet access OK - project servers may be temporarily down.
10/11/2012 12:06:51 | rosetta@home | Sending scheduler request: To fetch work.
10/11/2012 12:06:51 | rosetta@home | Requesting new tasks for NVIDIA
10/11/2012 12:06:53 | rosetta@home | Scheduler request completed: got 0 new tasks
10/11/2012 12:25:54 | rosetta@home | Sending scheduler request: To fetch work.
10/11/2012 12:25:54 | rosetta@home | Requesting new tasks for NVIDIA
10/11/2012 12:25:56 | rosetta@home | Scheduler request completed: got 0 new tasks
10/11/2012 12:46:51 | rosetta@home | Sending scheduler request: To fetch work.
10/11/2012 12:46:51 | rosetta@home | Requesting new tasks for NVIDIA
10/11/2012 12:46:52 | rosetta@home | Scheduler request completed: got 0 new tasks
10/11/2012 13:56:02 | rosetta@home | Sending scheduler request: To fetch work.
10/11/2012 13:56:02 | rosetta@home | Requesting new tasks for NVIDIA
10/11/2012 13:56:04 | rosetta@home | Scheduler request completed: got 0 new tasks
10/11/2012 15:14:50 | | Project communication failed: attempting access to reference site
10/11/2012 15:14:52 | | Internet access OK - project servers may be temporarily down.
10/11/2012 16:44:41 | rosetta@home | Task hyb_ac_bench_3rdeD_20_SAVE_ALL_OUT_IGNORE_THE_REST_54744_117_0 exited with zero status but no 'finished' file
10/11/2012 16:44:41 | rosetta@home | If this happens repeatedly you may need to reset the project.
10/11/2012 16:44:41 | rosetta@home | Restarting task hyb_ac_bench_3rdeD_20_SAVE_ALL_OUT_IGNORE_THE_REST_54744_117_0 using minirosetta version 341 in slot 2


Please note, though, that in the unfiltered log the entry immediately prior to the "... exited with zero status" statement (at 16:44:41) was at 15:40:02, i.e. 55 minutes earlier. So there's no recorded event to give us a clue as to why this task "exited with zero status but no 'finished' file".
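
For anyone wanting to check their own log for the same kind of silence before a restart, here is a minimal Python sketch. It assumes the log has been saved as text with day-first timestamps (`dd/mm/yyyy hh:mm:ss`) as in the extract above; the function name `quiet_time_before_exits` is made up for illustration and is not part of BOINC. The sample uses the last two lines before the restart in the filtered extract, so the gap it reports is for the filtered log, not the unfiltered one.

```python
from datetime import datetime

# Assumption: day-first timestamps, as in the Windows log extract above.
STAMP_FORMAT = "%d/%m/%Y %H:%M:%S"
EXIT_TEXT = "exited with zero status but no 'finished' file"

def quiet_time_before_exits(log_lines):
    """Return (exit_time, gap) pairs, where gap is the time since the previous log line."""
    previous = None
    results = []
    for line in log_lines:
        # The timestamp is everything before the first '|' separator.
        stamp_text = line.split("|", 1)[0].strip()
        try:
            stamp = datetime.strptime(stamp_text, STAMP_FORMAT)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if EXIT_TEXT in line and previous is not None:
            results.append((stamp, stamp - previous))
        previous = stamp
    return results

sample = [
    "10/11/2012 15:14:52 | | Internet access OK - project servers may be temporarily down.",
    "10/11/2012 16:44:41 | rosetta@home | Task hyb_ac_bench_3rdeD_20_SAVE_ALL_OUT_IGNORE_THE_REST_54744_117_0 "
    "exited with zero status but no 'finished' file",
]

gaps = quiet_time_before_exits(sample)
for exit_time, gap in gaps:
    print(exit_time, "preceded by", gap, "of silence")
```

Run against a full, unfiltered log, a long gap before each exit would suggest that nothing else the client logs is triggering the restart.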

I also suspect, in hindsight, that this isn't the first time the task has restarted (see the last entry in the log), since I'd swear the task has been 'crunched' for some time, and whenever I've glanced at its level of completion it has typically been around 30%.

Hope that makes sense.

The log states that I might need to reset the project, if this keeps happening. Do you think this is appropriate action or would you like me to do any diagnostic work for you?



Regards,

Gary
ID: 74243 · Rating: 0
svincent

Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 74247 - Posted: 11 Nov 2012, 2:45:26 UTC - in response to Message 74203.  

> Interesting. As you say, you'd think a machine that's always active wouldn't have such problems. Can you think of any other activity on the machine that might cause all of the tasks to encounter the error at the same time like that? Or is that NOT all of the active tasks?


These tasks are run on an oldish Mac Mini that I no longer use for other work. This computer runs 24/7 on R@h (6-hour time preferences) only and is only connected to the Internet once a week. So it's unlikely to be related to other tasks.

However, it does seem to be related to this error message:

No heartbeat from core client for 30 sec - exiting

Here, from the event log, are the messages for a task that completed successfully but restarted 3 times.

Fri Nov 9 23:13:14 2012 | rosetta@home | Starting task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 using minirosetta version 341 in slot 1
Sat Nov 10 00:16:46 2012 | rosetta@home | Task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 exited with zero status but no 'finished' file
Sat Nov 10 00:16:46 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sat Nov 10 00:16:46 2012 | rosetta@home | Restarting task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 using minirosetta version 341 in slot 1
Sat Nov 10 00:25:07 2012 | rosetta@home | Task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 exited with zero status but no 'finished' file
Sat Nov 10 00:25:07 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sat Nov 10 00:25:07 2012 | rosetta@home | Restarting task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 using minirosetta version 341 in slot 1
Sat Nov 10 00:30:49 2012 | rosetta@home | Task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 exited with zero status but no 'finished' file
Sat Nov 10 00:30:49 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.
Sat Nov 10 00:30:49 2012 | rosetta@home | Restarting task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 using minirosetta version 341 in slot 1
Sat Nov 10 05:28:49 2012 | rosetta@home | Computation for task Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1 finished
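
Counting restarts like these by hand gets tedious on a longer log. Below is a minimal Python sketch for tallying them per task, assuming the event log has been saved as text in the `timestamp | project | message` layout shown above. The function name `count_restarts` is hypothetical and not part of BOINC.

```python
import re
from collections import Counter

# Matches the message the client logs when a task exits without a 'finished' file.
EXIT_RE = re.compile(r"Task (\S+) exited with zero status but no 'finished' file")

def count_restarts(log_lines):
    """Return a Counter mapping task name -> number of premature exits."""
    counts = Counter()
    for line in log_lines:
        # Event log lines have the form: timestamp | project | message
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        match = EXIT_RE.search(parts[2])
        if match:
            counts[match.group(1)] += 1
    return counts

TASK = "Rossmann3x3_abinitio_SAVE_ALL_OUT_design_k062_007_62403_1239_1"
sample = [
    f"Fri Nov 9 23:13:14 2012 | rosetta@home | Starting task {TASK} using minirosetta version 341 in slot 1",
    f"Sat Nov 10 00:16:46 2012 | rosetta@home | Task {TASK} exited with zero status but no 'finished' file",
    "Sat Nov 10 00:16:46 2012 | rosetta@home | If this happens repeatedly you may need to reset the project.",
    f"Sat Nov 10 00:25:07 2012 | rosetta@home | Task {TASK} exited with zero status but no 'finished' file",
    f"Sat Nov 10 00:30:49 2012 | rosetta@home | Task {TASK} exited with zero status but no 'finished' file",
    f"Sat Nov 10 05:28:49 2012 | rosetta@home | Computation for task {TASK} finished",
]

restarts = count_restarts(sample)
for task, n in restarts.most_common():
    print(n, task)
```

The pattern matches only the message text, so it should work with either of the timestamp styles that appear in this thread.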

Digging through the Rosetta results, it turned out that this was task 541753482.

In the stderr out section, at times corresponding to the messages recorded in the event log above, are 3 messages:

No heartbeat from core client for 30 sec - exiting

In two out of the three cases this message is followed by another error message:

FILE_LOCK::unlock(): close failed.: Bad file descriptor

Hope this helps.


ID: 74247 · Rating: 0
svincent

Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 74548 - Posted: 24 Nov 2012, 17:15:29 UTC

This task 544147042 behaved in a similar way to that described in the above post: i.e. continual restarts with the error message "exited with zero status but no 'finished' file" in the Event log coinciding with "No heartbeat from core client for 30 sec - exiting" in the Task details log.

A couple of extra observations though:

1) Quitting and restarting BOINC had no effect.
2) Suspending all other tasks did do the trick: the task successfully ran to completion.

It's one of those monster hyb_* tasks that required 11 hours to complete a single decoy, and although it returned a status of valid, the error messages at the end of the log do suggest a lingering problem.

WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
-----
0
-----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 36398.7 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
called boinc_finish

</stderr_txt>
]]>
Validate state Valid

(Mac OS X 10.6.8/BOINC 7.0.31)
ID: 74548 · Rating: 0
John Wood

Joined: 11 Aug 08
Posts: 1
Credit: 3,222,968
RAC: 0
Message 74698 - Posted: 9 Dec 2012, 17:15:33 UTC

I get this "exited with zero status..." message all the time:

09/12/2012 13:44:39 | rosetta@home | Task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 exited with zero status but no 'finished' file
09/12/2012 13:44:39 | rosetta@home | If this happens repeatedly you may need to reset the project.
09/12/2012 14:17:44 | rosetta@home | Task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 exited with zero status but no 'finished' file
09/12/2012 14:17:44 | rosetta@home | If this happens repeatedly you may need to reset the project.
09/12/2012 14:17:44 | rosetta@home | Restarting task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 using minirosetta version 345 in slot 1
09/12/2012 14:31:11 | rosetta@home | Task 5srsmn_3399m2_abinitio_SAVE_ALL_OUT_66557_2003_0 exited with zero status but no 'finished' file
09/12/2012 14:31:11 | rosetta@home | If this happens repeatedly you may need to reset the project.
09/12/2012 14:31:11 | rosetta@home | Restarting task 5srsmn_3399m2_abinitio_SAVE_ALL_OUT_66557_2003_0 using minirosetta version 345 in slot 0
09/12/2012 14:40:36 | rosetta@home | Task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 exited with zero status but no 'finished' file
09/12/2012 14:40:36 | rosetta@home | If this happens repeatedly you may need to reset the project.
09/12/2012 14:40:36 | rosetta@home | Restarting task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 using minirosetta version 345 in slot 1
09/12/2012 15:16:41 | rosetta@home | Task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 exited with zero status but no 'finished' file
09/12/2012 15:16:41 | rosetta@home | If this happens repeatedly you may need to reset the project.
09/12/2012 15:16:41 | rosetta@home | Restarting task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 using minirosetta version 345 in slot 1
09/12/2012 16:14:42 | rosetta@home | Task 3helix_2_22_abinitio_SAVE_ALL_OUT_67402_2169_0 exited with zero status but no 'finished' file
09/12/2012 16:14:42 | rosetta@home | If this happens repeatedly you may need to reset the project.

There seems to be a problem, but the results posted back seem OK, as I get credit.
My machine is not very busy, so why does it need to keep 'restarting' so often?
I previously reset the project several times until I realised it made no difference and just wasted useful work that was mostly completed.
Can anyone explain what is happening, please?

ID: 74698 · Rating: 0




©2024 University of Washington
https://www.bakerlab.org