These 7 files will not upload.

D.A. Pinniger
Joined: 26 Jan 11
Posts: 2
Credit: 1,027,601
RAC: 0
Message 71974 - Posted: 7 Jan 2012, 16:40:17 UTC

These 7 files are due 1-17-2012 and will not upload. Any help to resolve this issue would be greatly appreciated. Thank you.

rosetta@home ab_11_29__optpps_T5781_optpps_03_09_35686_149833_0_0 0.000 45.11 K 00:00:34 - 65:51:13 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
rosetta@home ab_11_29__optpps_T6211_optpps_03_09_35686_150999_0_0 0.000 40.13 K 00:00:34 - 70:45:04 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
rosetta@home ab_11_29__optpps_T5781_optpps_03_09_35686_149852_0_0 0.000 44.99 K 00:00:29 - 66:32:40 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
rosetta@home ab_11_29__optpps_T6211_optpps_03_09_35686_150993_0_0 0.000 40.20 K 00:00:27 - 67:37:49 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
rosetta@home ab_11_29__optpps_T6211_optpps_03_09_35686_151002_0_0 0.000 45.72 K 00:00:21 - 69:54:28 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
rosetta@home ab_11_29__optpps_T5781_optpps_03_09_35686_149834_0_0 0.000 45.47 K 00:00:23 - 64:57:58 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
rosetta@home ab_11_29__optpps_T6211_optpps_03_09_35686_150998_0_0 0.000 45.72 K 00:00:20 - 69:21:54 0.00 Kbps Upload pending (Project backoff: 00:04:36) Dave
ID: 71974 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1831
Credit: 119,523,428
RAC: 9,566
Message 71975 - Posted: 7 Jan 2012, 16:43:04 UTC

The servers will be swamped with uploads - it'll just take some time.
ID: 71975 · Rating: 0
Jesse Viviano
Joined: 14 Jan 10
Posts: 42
Credit: 2,700,472
RAC: 0
Message 71979 - Posted: 7 Jan 2012, 20:02:14 UTC
Last modified: 7 Jan 2012, 20:04:11 UTC

I am having a similar problem with one file that will not upload, but when I check the log, it states that my machine cannot resolve the DNS address of the upload server. Is there a DNS misconfiguration? I was able to upload two other files and report them successfully, so I am guessing that there are multiple upload servers, one of which has a bad configuration at the DNS server.
ID: 71979 · Rating: 0
Dave Mickey
Joined: 29 Dec 07
Posts: 33
Credit: 4,136,957
RAC: 0
Message 71980 - Posted: 7 Jan 2012, 21:58:52 UTC

I think I've reached the same conclusion as the OP and Jesse: there is a set of 13 results waiting to upload, and they always fail with "can't resolve hostname". Subsequent WUs process, upload, and report, but these 13 are stuck in the past, unable to move on.

Does a Rosetta@home WU upload file contain something that could cause it to try to go to an obsolete host name after the reconfig was done at UW? For example, are they coded with a host name that no longer exists, or that is not in DNS servers anywhere?

Does anyone know if there is a CC debug switch we could turn on to see exactly what host name the failed units are attempting to use?

Dave
ID: 71980 · Rating: 0
Holmis
Joined: 15 Nov 07
Posts: 6
Credit: 975,490
RAC: 0
Message 71982 - Posted: 7 Jan 2012, 23:00:30 UTC - in response to Message 71980.  

> I think I've reached the same conclusion as the OP and Jesse: there is a set of 13 results waiting to upload, and they always fail with "can't resolve hostname". Subsequent WUs process, upload, and report, but these 13 are stuck in the past, unable to move on.
>
> Does a Rosetta@home WU upload file contain something that could cause it to try to go to an obsolete host name after the reconfig was done at UW? For example, are they coded with a host name that no longer exists, or that is not in DNS servers anywhere?
>
> Does anyone know if there is a CC debug switch we could turn on to see exactly what host name the failed units are attempting to use?
>
> Dave


Hi

I've also got one file that fails to upload and it's trying this URL:
http://srv6.bakerlab.org/rosetta_cgi/file_upload_handler

My cc_config.xml looks like this:
<cc_config>
  <log_flags>
      <cpu_sched>1</cpu_sched>
      <cpu_sched_debug>0</cpu_sched_debug>
      <dcf_debug>1</dcf_debug>
      <sched_op_debug>1</sched_op_debug>
      <file_xfer_debug>1</file_xfer_debug>
  </log_flags>
  <options>
      <zero_debts>0</zero_debts>
  </options>
</cc_config>

I think that "file_xfer_debug" is the one you want.
This is what I see in the eventlog (or message tab):
07/01/2012 23:47:50 | rosetta@home | [fxd] starting upload, upload_offset -1
07/01/2012 23:47:50 | rosetta@home | Started upload of ab_11_29__optpps_T5781_optpps_03_09_35686_134357_0_0
07/01/2012 23:47:50 | rosetta@home | [file_xfer] URL: http://srv6.bakerlab.org/rosetta_cgi/file_upload_handler
07/01/2012 23:47:51 |  | Project communication failed: attempting access to reference site
07/01/2012 23:47:51 | rosetta@home | [file_xfer] http op done; retval -113 (can't resolve hostname)
07/01/2012 23:47:51 | rosetta@home | [file_xfer] file transfer status -113 (can't resolve hostname)
07/01/2012 23:47:51 | rosetta@home | Temporarily failed upload of ab_11_29__optpps_T5781_optpps_03_09_35686_134357_0_0: can't resolve hostname
07/01/2012 23:47:51 | rosetta@home | Backing off 11 hr 38 min 16 sec on upload of ab_11_29__optpps_T5781_optpps_03_09_35686_134357_0_0
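
For anyone with many stuck uploads, the log lines above can be matched up automatically. This is only a rough Python sketch: it assumes the event log has been saved as plain text in the format shown above, and the function name `failed_upload_hosts` is made up for illustration rather than being part of BOINC.

```python
import re
from urllib.parse import urlparse

# Matches both "[file_xfer] URL: ..." and "[file_xfer_debug] URL: ..." log lines.
URL_RE = re.compile(r"\[file_xfer(?:_debug)?\] URL: (\S+)")
# Matches "Temporarily failed upload of <file>: <reason>".
FAIL_RE = re.compile(r"Temporarily failed upload of (\S+): (.+)$")


def failed_upload_hosts(log_lines):
    """Return (file name, host, reason) for each failed upload in the log.

    The host is taken from the most recent URL line seen before the failure.
    """
    last_host = None
    failures = []
    for line in log_lines:
        url_match = URL_RE.search(line)
        if url_match:
            last_host = urlparse(url_match.group(1)).hostname
            continue
        fail_match = FAIL_RE.search(line)
        if fail_match:
            failures.append(
                (fail_match.group(1), last_host, fail_match.group(2).strip())
            )
    return failures


# Two lines copied from the log excerpt above.
SAMPLE = [
    "07/01/2012 23:47:50 | rosetta@home | [file_xfer] URL: "
    "http://srv6.bakerlab.org/rosetta_cgi/file_upload_handler",
    "07/01/2012 23:47:51 | rosetta@home | Temporarily failed upload of "
    "ab_11_29__optpps_T5781_optpps_03_09_35686_134357_0_0: can't resolve hostname",
]

if __name__ == "__main__":
    for name, host, reason in failed_upload_hosts(SAMPLE):
        print(f"{name} -> {host}: {reason}")
```

Pairing each failure with the last URL line works here because the client logs the URL just before the result of the transfer; with several transfers running at once, the pairing may be less reliable.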

When a task is assigned, the server tells the client where to upload the result, and that information is stored in the file client_state.xml. Different tasks appear to have different upload addresses; at least, that's what I see in my state file. If you open your client_state.xml, take care not to save any changes, as doing so may very well cause you to lose all your work, or worse...
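
A safer way to peek at the state file is to read it without ever opening it in an editor. The Python sketch below is only an illustration: it assumes the upload URLs appear literally in client_state.xml and end in `file_upload_handler` (as the URLs in the log do), it makes no assumption about the XML tag names, and the function names are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Any http(s) URL that ends in the upload handler path seen in the event log.
UPLOAD_URL_RE = re.compile(r"https?://[^\s<>\"]+/file_upload_handler")


def upload_hosts(state_text):
    """Count how many upload URLs point at each host name."""
    urls = UPLOAD_URL_RE.findall(state_text)
    return Counter(urlparse(url).hostname for url in urls)


def upload_hosts_from_file(path):
    """Read the state file in read-only mode and count upload hosts."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return upload_hosts(handle.read())


# Example input; the <url> tag is a placeholder, not the real tag name.
EXAMPLE = """
<url>http://srv6.bakerlab.org/rosetta_cgi/file_upload_handler</url>
<url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</url>
<url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</url>
"""

if __name__ == "__main__":
    for host, count in upload_hosts(EXAMPLE).items():
        print(host, count)
```

Because the file is opened with mode "r" only, nothing is written back, which avoids the risk of damaging the client state.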

/Johan
ID: 71982 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 71984 - Posted: 8 Jan 2012, 0:24:54 UTC

I have several that don't upload on different machines. One gives this message:
1/8/2012 1:20:56 AM rosetta@home Temporarily failed upload of ab_11_29__optpps_T5781_optpps_03_09_35686_181530_0_0: can't resolve hostname

I guess it's something the admins need to solve.
Greetings,
TJ.
ID: 71984 · Rating: 0
Dave Mickey
Joined: 29 Dec 07
Posts: 33
Credit: 4,136,957
RAC: 0
Message 71985 - Posted: 8 Jan 2012, 0:25:21 UTC

Very interesting. When I turn on that debug flag, I see it using the same URL as Holmis. And when I pinged it, it returned:

C:\Users\dwmickey>ping srv6.bakerlab.org

Pinging srv6.bakerlab.org [67.215.65.132] with 32 bytes of data:
Reply from 67.215.65.132: bytes=32 time=36ms TTL=53
Reply from 67.215.65.132: bytes=32 time=34ms TTL=53
Reply from 67.215.65.132: bytes=32 time=34ms TTL=53
Reply from 67.215.65.132: bytes=32 time=33ms TTL=53

Ping statistics for 67.215.65.132:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 33ms, Maximum = 36ms, Average = 34ms


BUT THEN, I flushed the DNS cache via ipconfig, and the name no longer resolved. Now I see that the IP that was returned above is really:

C:\Users\dwmickey>tracert 67.215.65.132

Tracing route to hit-nxdomain.opendns.com [67.215.65.132]
over a maximum of 30 hops:


It looks like srv6 is no more in DNS land, and if you look at the Rosetta@home server status page, it looks like everything should be going through srv4.

And indeed, looking at the file transfer debug output for uploads that work, they are asking to go to:

/////////////////////////////////////////////////////
07-Jan-2012 16:50:59 [rosetta@home] Started upload of _11_29__optpps_T6161_optpps_03_09_35686_140788_0_0
07-Jan-2012 16:50:59 [rosetta@home] [file_xfer_debug] URL: http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler
//////////////////////////////////////////////////////


and that works.......

So what's to become of these units, apparently cast aside by the server reconfig......??? They want to phone home to srv6, but alas, there is none!
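
The manual ping and tracert steps above can be approximated in a few lines of Python. This is a hedged sketch, not an official diagnostic: `check_upload_host` is a hypothetical helper, and treating a reverse name containing "nxdomain" as a placeholder answer is an assumption based only on the `hit-nxdomain.opendns.com` name seen in the tracert output.

```python
import socket


def check_upload_host(hostname,
                      resolve=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Classify a host name as 'unresolved', 'placeholder', or 'resolved'.

    'placeholder' means the name resolved, but the address reverse-maps to a
    name containing 'nxdomain', as some DNS providers return for missing names.
    The resolver functions are parameters so they can be swapped for testing.
    """
    try:
        ip_address = resolve(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return ("unresolved", None)
    try:
        reverse_name = reverse(ip_address)[0]
    except OSError:  # no reverse record is not an error for our purposes
        reverse_name = ""
    if "nxdomain" in reverse_name.lower():
        return ("placeholder", ip_address)
    return ("resolved", ip_address)


# Usage (performs real DNS lookups, so results depend on your resolver):
#   print(check_upload_host("srv6.bakerlab.org"))
#   print(check_upload_host("srv4.bakerlab.org"))
```

A cached answer can still mask the real state, which is why flushing the DNS cache changed the result in the post above.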

Dave


ID: 71985 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 71987 - Posted: 8 Jan 2012, 0:35:52 UTC - in response to Message 71985.  

> and that works.......
>
> So what's to become of these units, apparently cast aside by the server reconfig......??? They want to phone home to srv6, but alas, there is none!
>
> Dave


Hi Dave, thanks for your explanation.
Is there a way we can solve the problem ourselves?
Greetings,
TJ.
ID: 71987 · Rating: 0
Dave Mickey
Joined: 29 Dec 07
Posts: 33
Credit: 4,136,957
RAC: 0
Message 71990 - Posted: 8 Jan 2012, 0:59:51 UTC

Apparently, yes, there is. Put this line in your "hosts" file (it lives somewhere under the Windows directory and is named just "hosts", with no extension):

128.95.160.145 srv6.bakerlab.org

on a line by itself. Then requests to srv6 will go to srv4. All mine are gone now......


Dave
ID: 71990 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 71992 - Posted: 8 Jan 2012, 1:20:17 UTC - in response to Message 71990.  

> Apparently, yes, there is. Put this line in your "hosts" file (it lives somewhere under the Windows directory and is named just "hosts", with no extension):
>
> 128.95.160.145 srv6.bakerlab.org
>
> on a line by itself. Then requests to srv6 will go to srv4. All mine are gone now......
>
> Dave


Hi Dave, thanks, but I am not good with software.
Where can I find this "hosts" file?
I did a search on the hard disk with no usable result.
Could you help a little more?
Thanks, TJ
ID: 71992 · Rating: 0
Dave Mickey
Joined: 29 Dec 07
Posts: 33
Credit: 4,136,957
RAC: 0
Message 71993 - Posted: 8 Jan 2012, 1:30:23 UTC

Where it is might be variable based on your Windows rev, but

I have windows 7, and mine is found in the directory named:

c:\windows\system32\drivers\etc

the filename is

hosts

with no extension like .txt or .bin or anything. It is plain text, so you can open it with notepad, or any simple text editor.

In a search or find, look for name hosts.

put the line I quoted before, by itself on the last line of the file, not disturbing any other lines. If you want, you can put a comment line above your new line for later explanation, like:

# this entry is to fix a problem with rosetta
128.95.160.145 srv6.bakerlab.org

There should be one or more spaces or tabs between the .145 and srv6; finish the line with a RETURN.

Save the file.

Now, go restart the upload of the stuck file(s). If your problem is the same as mine, they will work now.

Dave

If you're nervous about fooling with the file, copy it before editing, and then you can easily put it back the way you found it.
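
For those who would rather script the edit than do it by hand, the steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the script must be run with administrator rights to write to the hosts file, and the IP address and host name are the ones quoted in this thread.

```python
import shutil


def add_hosts_entry(hosts_text, ip, hostname, comment=None):
    """Return hosts_text with 'ip hostname' appended on its own line.

    If hostname is already mapped on a non-comment line, the text is
    returned unchanged so the entry is never added twice.
    """
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(fields) >= 2 and hostname.lower() in (f.lower() for f in fields[1:]):
            return hosts_text
    new_text = hosts_text
    if new_text and not new_text.endswith("\n"):
        new_text += "\n"  # make sure the new entry starts on its own line
    if comment:
        new_text += f"# {comment}\n"
    new_text += f"{ip} {hostname}\n"
    return new_text


def patch_hosts_file(path, ip, hostname, comment=None):
    """Back up the hosts file to path + '.bak', then append the entry."""
    shutil.copy2(path, path + ".bak")
    with open(path, "r", encoding="utf-8") as handle:
        original = handle.read()
    updated = add_hosts_entry(original, ip, hostname, comment)
    if updated != original:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(updated)


# Usage on Windows 7 (path as given above; run as administrator):
#   patch_hosts_file(r"c:\windows\system32\drivers\etc\hosts",
#                    "128.95.160.145", "srv6.bakerlab.org",
#                    "this entry is to fix a problem with rosetta")
```

The backup copy mirrors the advice above: if anything goes wrong, the `.bak` file can be copied back over the original.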

ID: 71993 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 71994 - Posted: 8 Jan 2012, 1:42:04 UTC - in response to Message 71993.  

> Where it is might be variable based on your Windows rev, but
>
> I have windows 7, and mine is found in the directory named:
>
> c:\windows\system32\drivers\etc
>
> the filename is
>
> hosts
>
> with no extension like .txt or .bin or anything. It is plain text, so you can open it with notepad, or any simple text editor.
>
> In a search or find, look for name hosts.
>
> put the line I quoted before, by itself on the last line of the file, not disturbing any other lines. If you want, you can put a comment line above your new line for later explanation, like:
>
> # this entry is to fix a problem with rosetta
> 128.95.160.145 srv6.bakerlab.org
>
> There should be one or more spaces or tabs between the .145 and srv6; finish the line with a RETURN.
>
> Save the file.
>
> Now, go restart the upload of the stuck file(s). If your problem is the same as mine, they will work now.
>
> Dave
>
> If you're nervous about fooling with the file, copy it before editing, and then you can easily put it back the way you found it.



Thank you, Dave.
The instructions are clear. Great work! I have Win7 as well, and now my WUs are gone.
Greetings,
TJ.
ID: 71994 · Rating: 0
edikl
Joined: 16 Jun 10
Posts: 10
Credit: 186,187
RAC: 0
Message 71995 - Posted: 8 Jan 2012, 1:48:24 UTC

Hi!
I can confirm that your advice works perfectly under Windows Vista as well.
Thanks a lot :)
ID: 71995 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 71996 - Posted: 8 Jan 2012, 1:53:23 UTC

I would suggest that it is likely that this one upload server simply has not completed its upgrade yet, and that it will come back online soon. Therefore, no action is required. When the server comes back online, BOINC will finally have a retry that works.

To do other things risks corrupting files, which potentially affects your whole boat of tasks.

Aborting the transfer would throw away the work you've done and the credit you've earned for that work.

...having said that, uploads redirected to an alternate upload server, as suggested below, should be processed normally, if you are comfortable achieving the redirection via the hosts file, etc.
Rosetta Moderator: Mod.Sense
ID: 71996 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 71998 - Posted: 8 Jan 2012, 2:03:29 UTC - in response to Message 71996.  

> I would suggest that it is likely that this one upload server simply has not completed its upgrade yet, and that it will come back online soon. Therefore, no action is required. When the server comes back online, BOINC will finally have a retry that works.
>
> To do other things risks corrupting files, which potentially affects your whole boat of tasks.
>
> Aborting the transfer would throw away the work you've done and the credit you've earned for that work.
>
> ...having said that, uploads redirected to an alternate upload server, as suggested below, should be processed normally, if you are comfortable achieving the redirection via the hosts file, etc.


Thanks Mod.Sense,

It would have helped a lot if you had mentioned this earlier, and even better if it were on the main page.
As you can imagine a lot of crunchers have this issue.
Greetings,
TJ.
ID: 71998 · Rating: 0
Dave Mickey
Joined: 29 Dec 07
Posts: 33
Credit: 4,136,957
RAC: 0
Message 72006 - Posted: 8 Jan 2012, 13:33:01 UTC


> I think that you are asking for validate errors with this method because it is quite possible that the server that a work unit is assigned to is the only one that can validate it.
.....
> To do other things risks corrupting files, which potentially affects your whole boat of tasks.
>
> Aborting the transfer would throw away the work you've done and the credit you've earned for that work.
>
> ...having said that, uploads redirected to an alternate upload server, as suggested below, should be processed normally, if you are comfortable achieving the redirection via the hosts file, etc.




I looked for and found my 13 holdouts; all were reported at just about 1:00 UTC, and all are "Over, Success, Done" and were granted credit. So maybe I dodged a bullet, but I would guess that a robust parallel system like BOINC would not have a fragile path such as work having to go back to one and only one IP address. I can't claim any expertise, though, just luck, I guess. And regarding Mod.Sense's comment, there is no "etc."; it was just a modification to the hosts file. Period.

typical result record:


473457597 431993511 28 Dec 2011 17:41:51 UTC 8 Jan 2012 0:57:37 UTC Over Success Done 27,876.58 198.47 154.33


YMMV, I guess, but no sign of trouble here.

Dave
ID: 72006 · Rating: 0
ukjohnd
Joined: 22 Jul 06
Posts: 1
Credit: 696,728
RAC: 0
Message 72014 - Posted: 8 Jan 2012, 18:18:36 UTC

Perfect, that fixed the DNS issues for me.
ID: 72014 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 72023 - Posted: 9 Jan 2012, 0:10:01 UTC

I had this same problem with my Ubuntu Linux PC.
I have been hitting the `update` and `retry now` buttons plenty this last day, with no effect.
The fix did not need any hosts config fun, though.
In the end, just restarting the computer did it (which is something I usually only do for a kernel update).
Then I hit the `retry now` button for each task.
Everything uploaded, and new work is downloading.
Sorted :¬)
ID: 72023 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2117
Credit: 41,158,554
RAC: 15,699
Message 72025 - Posted: 9 Jan 2012, 2:47:44 UTC

After 2 days of failed upload attempts on WUs due 11th & 12th I used this solution & it worked. I removed the line from HOSTS straight after as no other WUs had this problem.

Thanks for details of the workaround - much appreciated.
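
Removing the temporary line again once the stuck uploads have cleared can be sketched the same way. This is a hypothetical helper working on the text of the hosts file, shown only as an illustration; it drops every non-comment line that maps the given host name and leaves comment lines alone.

```python
def remove_hosts_entry(hosts_text, hostname):
    """Return hosts_text without any line that maps hostname to an address."""
    kept = []
    for line in hosts_text.splitlines(keepends=True):
        fields = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(fields) >= 2 and hostname.lower() in (f.lower() for f in fields[1:]):
            continue  # skip the override line
        kept.append(line)
    return "".join(kept)


EXAMPLE_HOSTS = (
    "127.0.0.1 localhost\n"
    "128.95.160.145 srv6.bakerlab.org\n"
)

if __name__ == "__main__":
    print(remove_hosts_entry(EXAMPLE_HOSTS, "srv6.bakerlab.org"), end="")
```

Taking the override out afterwards means that if srv6 returns to DNS with a different address, the client will use the real record rather than the stale mapping.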
ID: 72025 · Rating: 0
Ironworker16
Joined: 31 May 06
Posts: 3
Credit: 9,758,247
RAC: 0
Message 72026 - Posted: 9 Jan 2012, 3:40:57 UTC - in response to Message 71993.  

> Where it is might be variable based on your Windows rev, but
>
> I have windows 7, and mine is found in the directory named:
>
> c:\windows\system32\drivers\etc
>
> the filename is
>
> hosts
>
> with no extension like .txt or .bin or anything. It is plain text, so you can open it with notepad, or any simple text editor.
>
> In a search or find, look for name hosts.
>
> put the line I quoted before, by itself on the last line of the file, not disturbing any other lines. If you want, you can put a comment line above your new line for later explanation, like:
>
> # this entry is to fix a problem with rosetta
> 128.95.160.145 srv6.bakerlab.org
>
> There should be one or more spaces or tabs between the .145 and srv6; finish the line with a RETURN.
>
> Save the file.


Thanks, that was quick and easy. I did have to open Notepad as administrator to edit the file.
ID: 72026 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org