Message boards : Number crunching : Client error with Rosetta Mini 3.19
Author | Message |
---|---|
Rayburner Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0 |
Hello, I am having problems with a new host I just attached to Rosetta. All WUs I report show the outcome "client error", for example task ID 472808156 (https://boinc.bakerlab.org/rosetta/result.php?resultid=472808156). However, judging from the stderr output (see below), I don't have a clue what the problem is. The specific host is new, not overclocked, and crunching successfully for several different projects (Einstein, SETI, WCG, LHC, Primegrid). Does anybody have a clue why this happens? I have stopped Rosetta on this host for now. Best Regards, Rayburner <core_client_version>6.12.34</core_client_version> <![CDATA[ <stderr_txt> [2011-12-25 20:14:32:] :: BOINC:: Initializing ... ok. [2011-12-25 20:14:32:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev46494.zip Setting database description ... Setting up checkpointing ... Setting up graphics native ... Setting up folding (abrelax) ... Beginning folding (abrelax) ... BOINC:: Worker startup. Starting watchdog... Watchdog active.
Starting work on structure: _00001 # cpu_run_time_pref: 14400 Starting work on structure: _00002 Starting work on structure: _00003 Starting work on structure: _00004 Starting work on structure: _00005 Starting work on structure: _00006 Starting work on structure: _00007 Starting work on structure: _00008 Starting work on structure: _00009 Starting work on structure: _00010 Starting work on structure: _00011 Starting work on structure: _00012 Starting work on structure: _00013 Starting work on structure: _00014 Starting work on structure: _00015 Starting work on structure: _00016 ====================================================== DONE :: 1 starting structures 13863.3 cpu seconds This process generated 16 decoys from 16 attempts ====================================================== BOINC :: WS_max 4.09907e+008 BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> ]]> |
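For readers who want to check a stderr dump like the one above without reading it line by line, here is a minimal Python sketch. It assumes only the text patterns visible in this post (`cpu_run_time_pref`, `cpu seconds`, the decoy summary, and `called boinc_finish`); the function name `summarize_stderr` is hypothetical and is not part of BOINC or Rosetta.

```python
import re

def summarize_stderr(stderr_text: str) -> dict:
    """Extract the run-time preference, CPU time, and decoy counts from a stderr dump."""
    pref = re.search(r"cpu_run_time_pref:\s*(\d+)", stderr_text)
    cpu = re.search(r"([\d.]+)\s+cpu seconds", stderr_text)
    decoys = re.search(r"generated\s+(\d+)\s+decoys from\s+(\d+)\s+attempts", stderr_text)
    return {
        "run_time_pref_s": int(pref.group(1)) if pref else None,
        "cpu_seconds": float(cpu.group(1)) if cpu else None,
        "decoys": int(decoys.group(1)) if decoys else None,
        "attempts": int(decoys.group(2)) if decoys else None,
        # The log in this thread ends with this marker after a normal shutdown.
        "finished_cleanly": "called boinc_finish" in stderr_text,
    }

# Excerpt copied from the stderr output quoted in the post.
sample = """# cpu_run_time_pref: 14400
DONE :: 1 starting structures 13863.3 cpu seconds
This process generated 16 decoys from 16 attempts
BOINC :: Watchdog shutting down...
called boinc_finish"""

summary = summarize_stderr(sample)
print(summary)
print("CPU seconds per decoy:", summary["cpu_seconds"] / summary["decoys"])
```

On this excerpt the summary shows a run that produced every decoy it attempted and reached `boinc_finish`, which matches the observation that nothing in the client-side log looks like an error.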
JKitterman Joined: 21 Oct 05 Posts: 11 Credit: 814,463 RAC: 0 |
It looks like they are failing validation, if my guess is correct. I didn't see any other issues and your output looks comparable to my successfully completed workunits. It may be a validation error or you can try resetting your project on this computer. |
JKitterman Joined: 21 Oct 05 Posts: 11 Credit: 814,463 RAC: 0 |
I took a second look at your results. I find it odd that your results are spending more CPU time, about 13,000 seconds compared to a little over 10,000 seconds for mine. You also claim a lot more credit, about 125 where I claim about 75. I would expect your computer to be faster than mine. Is your BOINC setup completely stock, or are you using custom parameters or something? |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The line in the output here: # cpu_run_time_pref: 14400 just indicates that the Rosetta preferences have been set to a preferred runtime of 14400 seconds (4 hrs. rather than the default of 3). So this is why the tasks are running a bit longer than 3 hours, and also why they tend to claim more credit, for 4 hours of work rather than 3. Several of the tasks have the following shown after the number-of-decoys summary: BOINC :: WS_max 3.97197e+008 I'm not positive what this tells us. I don't see any other errors in the messages. Some additional observations: the host has never been granted credit. The host has 8 CPUs and 8GB of memory. The host is running Win7 with BOINC 6.12.34. There have been several reports of problems with displaying the graphic, so I would suggest not using BOINC as your screensaver and not displaying the graphic, just to see if this makes any difference. Rosetta Moderator: Mod.Sense |
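The runtime arithmetic in the post above can be checked with a short Python sketch. The 14400-second preference and the 3-hour default come from the post; the CPU times and claimed credits are the rough figures quoted earlier in the thread (about 13,000 s and 125 credits versus about 10,000 s and 75 credits). The helper names are hypothetical, and claimed credit per CPU hour is only an illustrative comparison, not how the project grants credit.

```python
DEFAULT_RUNTIME_HOURS = 3  # default preferred runtime, per the post above

def seconds_to_hours(seconds: float) -> float:
    """Convert a duration in seconds to hours."""
    return seconds / 3600

def claimed_credit_per_cpu_hour(claimed_credit: float, cpu_seconds: float) -> float:
    """Claimed credit divided by CPU time in hours (illustrative metric only)."""
    return claimed_credit / seconds_to_hours(cpu_seconds)

pref_hours = seconds_to_hours(14400)
print("Preferred runtime in hours:", pref_hours)
print("Ratio to default runtime:", pref_hours / DEFAULT_RUNTIME_HOURS)

# Approximate figures quoted in the thread, not exact task data.
rate_new_host = claimed_credit_per_cpu_hour(125, 13000)
rate_other_host = claimed_credit_per_cpu_hour(75, 10000)
print("Claimed credit per CPU hour, new host:", round(rate_new_host, 1))
print("Claimed credit per CPU hour, other host:", round(rate_other_host, 1))
```

With these rough inputs the longer runtime accounts for part of the higher claim, and the remaining difference is consistent with the new host simply being faster per CPU hour.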
Rayburner Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0 |
I have detached and reattached to the project, set the run time to 1 hour, and the screensaver is not active. The result is still the same: the outcome is client error. In the meantime, my second host is generating credits. Regards, Rayburner |
Rayburner Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0 |
It looks like the missing credits were granted by the project (automatically??), but all newly returned results from this host are still marked with the outcome client error. As credits were granted, I assume my returned results are valuable to the project. So how are we going to proceed? I think I have checked everything on my side. Is it a problem on the server side? Does it make sense to delete this host on the server and let it create a new ID by contacting the server again? Regards, Rayburner |
JKitterman Joined: 21 Oct 05 Posts: 11 Credit: 814,463 RAC: 0 |
It looks like you are now running the beta BOINC client: <core_client_version>7.0.3</core_client_version>. Looking at some previous results from this bad host, none of them show an application version at the end of the Task Details online. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Detaching has reloaded everything on your end, so I wouldn't do anything more there. It sorta sounds like the problem is on the server side, in the validator. Returned results are ALWAYS valuable to the project, and that is part of why they wrote a script to grant credit in cases where the BOINC defaults would not. If nothing else, the result is revealing a problem in the validator. But from what I could see in the output, it looked like your machine was crunching successful models as well, so those will definitely be useful. Rosetta Moderator: Mod.Sense |
Rayburner Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0 |
"It looks like you are now running the Beta Boinc client" Right. I installed the beta BOINC client hoping that it would help. However, the problem was the same with 6.12.34. As Mod.Sense has written, the validator seems to have a problem analyzing my returned results. I guess that is why no application version is displayed. Regards, Rayburner |
Rayburner Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0 |
Now this host has reached a daily quota of 8 because it always returns "bad" results. It looks like I have to take this host out of Rosetta. It doesn't make sense to keep it attached as long as this problem persists. Regards, Rayburner |