Possible bug in the "Average Processing Rate" calculation.

Message boards : Number crunching : Possible bug in the "Average Processing Rate" calculation.

Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95703 - Posted: 1 May 2020, 23:42:18 UTC
Last modified: 2 May 2020, 0:33:54 UTC

As we know, a longer target run-time in Rosetta@home means more work is done per task.
However, it appears that the average processing rate (APR) DOES NOT take that into consideration. The longer you set your target run-time, the lower your APR gets.
This is an issue because, at really long run-times, it can seriously skew the credit you get.
My main rig is by far the most powerful device I have, with a comical amount of cooling, and its APR is a measly 0.70-0.84 GFLOPS (measured floating-point speed: 5.09-5.27 GFLOPS). That's with a target run-time of 36 hours. Back when the target run-time was 24 hours, the APR was over 1 GFLOPS.
My laptop suffers from the same issue. It makes a handy example because I changed the target run-time between app releases, so the APRs of different versions reflect different settings.

https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=2263735

Rosetta 4.08 x86_64-apple-darwin
Number of tasks completed 3
Max tasks per day 503
Number of tasks today 0
Consecutive valid tasks 3
Average processing rate 3.94 GFLOPS
Average turnaround time 2.07 days

Rosetta 4.09 x86_64-apple-darwin
Number of tasks completed 111
Max tasks per day 505
Number of tasks today 0
Consecutive valid tasks 5
Average processing rate 3.72 GFLOPS
Average turnaround time 2.06 days

Rosetta 4.12 x86_64-apple-darwin
Number of tasks completed 2
Max tasks per day 502
Number of tasks today 0
Consecutive valid tasks 2
Average processing rate 0.95 GFLOPS
Average turnaround time 1.59 days

Rosetta 4.15 x86_64-apple-darwin
Number of tasks completed 9
Max tasks per day 501
Number of tasks today 0
Consecutive valid tasks 1
Average processing rate 0.65 GFLOPS
Average turnaround time 2.12 days

Rosetta 4.16 x86_64-apple-darwin
Number of tasks completed 26
Max tasks per day 516
Number of tasks today 3
Consecutive valid tasks 16
Average processing rate 0.65 GFLOPS
Average turnaround time 1.97 days

Before version 4.12, the target run-time was set to the default of 6 hours (it's now 8 hours). When 4.12 came out, I set the target run-time to 24 hours to decrease the load on the server. In the latest versions, I increased the target run-time to 36 hours. I do not believe that such drastic changes in the APR could be caused by a difference in software versions alone.

My HTC 2Q55100 (HTC U12+), which has the target run-time set to the default, has an APR of 2.28 GFLOPS.

Conversely, it is possible to achieve impossibly high APRs by setting your target runtime very short.

An example is this computer:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3684192
Measured floating point speed 4010.38 million ops/sec
Measured integer speed 17336.69 million ops/sec (judging by its exclusion on the statistics page, I believe this is useless for Rosetta)
Average processing rate 10.33 GFLOPS
That's with a typical task runtime of 2 hours

If you're wondering why your APR is lower in 4.15 than in previous versions even though you've kept your target run-time at the default, that's why: when 4.15 came out, the default run-time was extended from 6 hours to 8 hours.

To reproduce this issue, set your target run-time to something unreasonably long or short, and watch your APR drop or shoot up.
ID: 95703 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95760 - Posted: 2 May 2020, 4:09:40 UTC - in response to Message 95703.  

Now that 4.20 is officially out for Macs, I've adjusted my target run-time to the default value. If my theory on the issue is correct, the APR for Rosetta 4.20 x86_64-apple-darwin should be about 2.9 GFLOPS, assuming the new version does not run much faster or slower.
https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=2263735
ID: 95760 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95790 - Posted: 2 May 2020, 11:40:51 UTC - in response to Message 95760.  

Before and after shortening the target run-time from 36 hours to 8 hours.

https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=2263735

I made sure that the only task that was set for 8 hours was a 4.20 task. Since I set my target run-time back to 36 hours after that task was uploaded, I expect my APR to drop soon.

Rosetta 4.16 x86_64-apple-darwin
Number of tasks completed 29
Max tasks per day 519
Number of tasks today 0
Consecutive valid tasks 19
Average processing rate 0.65 GFLOPS
Average turnaround time 1.96 days

Rosetta 4.20 x86_64-apple-darwin
Number of tasks completed 1
Max tasks per day 501
Number of tasks today 0
Consecutive valid tasks 1
Average processing rate 3.02 GFLOPS
Average turnaround time 0.41 days
ID: 95790 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95803 - Posted: 2 May 2020, 14:08:08 UTC - in response to Message 95790.  

Wait a minute...
the estimated compute size for Rosetta tasks is 80,000 GFLOPs

80,000 GFLOPs / 26,399 s (the run-time of that one 4.20 task my Mac submitted) = 3.0179 GFLOPS

Please don't tell me that's how the APR is actually calculated...
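
A quick way to check this hunch is sketched below. It assumes the APR is nothing more than a fixed 80,000 GFLOP estimate divided by the task's run-time, which is only the guess made in this post, not confirmed server behaviour. The name `implied_apr` is made up for illustration and is not part of BOINC.

```python
# Hypothetical check: APR = fixed work estimate / run-time.
FPOPS_EST_GFLOP = 80_000  # estimated compute size quoted in this thread


def implied_apr(runtime_seconds, work_gflop=FPOPS_EST_GFLOP):
    """Return the APR (GFLOPS) implied by a fixed work estimate."""
    return work_gflop / runtime_seconds


# Target run-times mentioned in this thread, in hours.
for hours in (2, 6, 8, 24, 36):
    print(f"{hours:>2} h target -> {implied_apr(hours * 3600):.2f} GFLOPS")

# The single 4.20 task that ran for 26,399 seconds.
print(f"26399 s task -> {implied_apr(26399):.2f} GFLOPS")
```

Under that assumption, a 36-hour target works out to roughly 0.62 GFLOPS and a 24-hour target to roughly 0.93 GFLOPS, in the same ballpark as the 0.65 and 0.95 GFLOPS shown on the host page. Dividing by 26,399 seconds gives about 3.03 GFLOPS, a shade off the 3.0179 worked out by hand, so the run-time used there may have been slightly different.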
ID: 95803 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95836 - Posted: 2 May 2020, 18:24:50 UTC - in response to Message 95790.  

Beware that runtime preference is applied to all of the WUs in your cache when you update to the project.
Rosetta Moderator: Mod.Sense
ID: 95836 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95842 - Posted: 2 May 2020, 20:14:36 UTC - in response to Message 95836.  
Last modified: 2 May 2020, 20:15:53 UTC

"Beware that runtime preference is applied to all of the WUs in your cache when you update to the project."

I am aware of that. That's why I made sure this one 4.20 task was the only workload in my cache when I updated the target run-time to 8 hours, and set it back after this task had started.
ID: 95842 · Rating: 0
Admin
Project administrator
Joined: 1 Jul 05
Posts: 4805
Credit: 0
RAC: 0
Message 95843 - Posted: 2 May 2020, 20:15:33 UTC
Last modified: 2 May 2020, 20:20:38 UTC

I'm testing an update to the scheduler on Ralph@h that uses the cpu run time preference when it decides to send jobs. Hopefully this update will provide a more accurate job cache. But yes, keep in mind, if you change the cpu run time preference, all jobs will use the updated value once the client gets the information from the server.

If I'm correctly understanding how the scheduler decides how many jobs to send, and the update works as expected, I'll update the R@h scheduler. I don't think this update will change the remaining time estimates displayed in the BOINC client, so these may still be inaccurate.
ID: 95843 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95844 - Posted: 2 May 2020, 20:17:42 UTC - in response to Message 95843.  
Last modified: 2 May 2020, 20:21:07 UTC

"I'm testing an update to the scheduler on Ralph@h that uses the cpu run time preference when it decides to send jobs. Hopefully this update will provide a more accurate job cache. But yes, keep in mind, if you change the cpu run time preference, all jobs will use the updated value once the client gets the information from the server."

Hmm, I've never had the watchdog kick in and end tasks when I've updated the target run-time. In my experience, a new setting only gets applied to tasks that haven't started yet. At least on my main rig. Weird...
ID: 95844 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95845 - Posted: 2 May 2020, 20:23:51 UTC - in response to Message 95843.  
Last modified: 2 May 2020, 20:25:17 UTC


"If I'm correctly understanding how the scheduler decides how many jobs to send, and the update works as expected, I'll update the R@h scheduler. I don't think this update will change the remaining time estimates displayed in the BOINC client, so these may still be inaccurate."


Hmm, I'll stop tasks from Rosetta and focus on Ralph to test it out. Do you think it's safe to set a larger cache to test that?
ID: 95845 · Rating: 0
Admin
Project administrator
Joined: 1 Jul 05
Posts: 4805
Credit: 0
RAC: 0
Message 95846 - Posted: 2 May 2020, 20:25:07 UTC - in response to Message 95844.  

If you shorten the run time, a currently running job that has already exceeded the runtime will end when the current model being calculated is finished. If you lengthen the run time, the currently running job will run up to that limit within a resolution of the average time per model.

But only if the BOINC client communicates with the server so it can update the preferences stored on the client.
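
The behaviour described above can be sketched as a loop that only looks at the target between models. This is a rough illustration, not Rosetta's actual code: the function `job_end_time`, the fixed model length, and the assumption that the check happens after each finished model (so a job may overshoot the target by up to one model) are all hypothetical.

```python
def job_end_time(model_seconds, target_changes):
    """Return the CPU time (seconds) at which a simulated job stops.

    model_seconds  -- CPU seconds per model (assumed constant here)
    target_changes -- list of (cpu_time_when_client_got_it, target_seconds),
                      sorted by time, with the first entry at time 0
    """
    elapsed = 0
    while True:
        # Use the most recent target the client has received so far.
        target = [t for seen_at, t in target_changes if seen_at <= elapsed][-1]
        if elapsed >= target:
            return elapsed
        elapsed += model_seconds  # a model always runs to completion


HOUR = 3600
# Shortened from 24 h to 12 h after 13.25 h of CPU time, 30-minute models.
shortened = job_end_time(1800, [(0, 24 * HOUR), (13 * HOUR + 900, 12 * HOUR)])
# Lengthened from 8 h to 36 h after 5 h of CPU time, 50-minute models.
lengthened = job_end_time(3000, [(0, 8 * HOUR), (5 * HOUR, 36 * HOUR)])
print(shortened / HOUR, lengthened / HOUR)
```

In the first case the simulated job stops at 13.5 hours, when the model in progress finishes. In the second it keeps going until the first model boundary at or past 36 hours.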
ID: 95846 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95848 - Posted: 2 May 2020, 20:28:07 UTC - in response to Message 95846.  

"If you shorten the run time, a currently running job that has already exceeded the runtime will end when the current model being calculated is finished. If you lengthen the run time, the currently running job will run up to that limit within a resolution of the average time per model.

"But only if the BOINC client communicates with the server so it can update the preferences stored on the client."


Thanks, it didn't seem to work as intended on my end, unfortunately...

https://boinc.bakerlab.org/rosetta/result.php?resultid=1166905787 While this task was running, I set the target run-time to 36 hours and then updated the project. It still ended prematurely, though, interestingly, it ran longer than the previous target run-time of 8 hours.
ID: 95848 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95850 - Posted: 2 May 2020, 20:33:40 UTC - in response to Message 95844.  

Not sure why you mention the watchdog. But, yes, the watchdog may get in your way when the runtime preference is changed. For example, if you are currently running with an 18-hour runtime preference and then change it to a 4-hour runtime preference, any tasks you currently have that have run for more than 14 hours of CPU time will suddenly appear to be running beyond their preferred runtime by more than 10 hours, and the watchdog will end them.
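
The arithmetic in that example can be written out as a small check. This is only a sketch: the 10-hour margin is taken from the example above, and `watchdog_would_end` is a made-up name, not a function in Rosetta or BOINC.

```python
# Assumption: the watchdog ends a task once its CPU time exceeds the
# preferred runtime by more than this margin (inferred from the example).
WATCHDOG_GRACE_HOURS = 10


def watchdog_would_end(cpu_hours, preferred_hours,
                       grace_hours=WATCHDOG_GRACE_HOURS):
    """Return True if a task has overrun its preference by more than the margin."""
    return cpu_hours > preferred_hours + grace_hours


# A task that has used 15 CPU hours is fine under an 18-hour preference...
print(watchdog_would_end(15, preferred_hours=18))
# ...but is past the 4 + 10 = 14 hour cutoff once the preference drops to 4 hours.
print(watchdog_would_end(15, preferred_hours=4))
```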
Rosetta Moderator: Mod.Sense
ID: 95850 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95852 - Posted: 2 May 2020, 20:39:30 UTC - in response to Message 95850.  
Last modified: 2 May 2020, 20:41:51 UTC

"Not sure why you mention the watchdog. But, yes, the watchdog may get in your way when the runtime preference is changed. For example, if you are currently running with an 18-hour runtime preference and then change it to a 4-hour runtime preference, any tasks you currently have that have run for more than 14 hours of CPU time will suddenly appear to be running beyond their preferred runtime by more than 10 hours, and the watchdog will end them."


Hmm, that didn't seem to happen, although my understanding is that it should (back in the days when the default run-time was 6 hours, I vaguely remember that being the case). When 4.12 and 4.15 came out and I had issues with too many tasks, I shortened the target run-time from 24 hours to 12 hours. I made sure to update the project, but the tasks that had already started when I changed the run-time still happily ran for the full 24 hours. This time it's the opposite: I set the target run-time to 36 hours after that 8-hour task got reported, so that the last task would run for 36 hours. It still ran for only 12 hours. I DID make sure to update the project after I changed the settings, since the web page tells you to do so in red whenever you change a setting.
ID: 95852 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95853 - Posted: 2 May 2020, 20:42:09 UTC - in response to Message 95852.  

I do try to avoid changing the runtime preference, but I've seen similar. Have not done it enough times to feel I understand the pattern. Seems that sometimes, after you update to the project to get the new runtime preference, you also have to end and restart BOINC in order for it to really take effect on the current tasks. Is that what you are seeing as well?
Rosetta Moderator: Mod.Sense
ID: 95853 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95854 - Posted: 2 May 2020, 20:44:25 UTC - in response to Message 95853.  
Last modified: 2 May 2020, 20:47:04 UTC

"I do try to avoid changing the runtime preference, but I've seen similar. Have not done it enough times to feel I understand the pattern. Seems that sometimes, after you update to the project to get the new runtime preference, you also have to end and restart BOINC in order for it to really take effect on the current tasks. Is that what you are seeing as well?"


Ah, I did not think to restart BOINC after messing with the run-time settings, especially since I realized that restarting BOINC seems to cause certain HalfRoid COVID-19 tasks to end in a computation error. Tasks that start after the run-time settings get updated do reliably stick to the target run-time, though.
ID: 95854 · Rating: 0
Admin
Project administrator
Joined: 1 Jul 05
Posts: 4805
Credit: 0
RAC: 0
Message 95860 - Posted: 2 May 2020, 21:09:19 UTC

Are you setting the preference for the right host venue/location? Currently running jobs should update the cpu run time setting so I'm not sure what is happening with your host(s). It does not look like they are prematurely ending due to some other condition.
ID: 95860 · Rating: 0
Bryn Mawr
Joined: 26 Dec 18
Posts: 393
Credit: 12,110,248
RAC: 4,952
Message 95864 - Posted: 2 May 2020, 21:25:06 UTC - in response to Message 95843.  

"I'm testing an update to the scheduler on Ralph@h that uses the cpu run time preference when it decides to send jobs. Hopefully this update will provide a more accurate job cache. But yes, keep in mind, if you change the cpu run time preference, all jobs will use the updated value once the client gets the information from the server.

"If I'm correctly understanding how the scheduler decides how many jobs to send, and the update works as expected, I'll update the R@h scheduler. I don't think this update will change the remaining time estimates displayed in the BOINC client, so these may still be inaccurate."


I have my dual-core laptop set to use a single core, with Ralph as its only project, a buffer of 0.2 + 0.2 and the default run time preference. It has just downloaded 10 WUs claiming an estimated run time of 2:30.
ID: 95864 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95865 - Posted: 2 May 2020, 21:27:51 UTC - in response to Message 95860.  
Last modified: 2 May 2020, 21:32:20 UTC

"Are you setting the preference for the right host venue/location? Currently running jobs should update the cpu run time setting so I'm not sure what is happening with your host(s). It does not look like they are prematurely ending due to some other condition."


I double-checked, and the problem with my Mac seems to have been my mistake, sorry. I either forgot to click save, or the internet crapped out while it was saving. The issue did happen on two occasions (when 4.12 and 4.15 came out, respectively) on my main rig. My main rig was set to the "home" venue, and when I edited the values there, they didn't update for tasks that had already started, no matter how many times I pressed update. The new value only applied to tasks that started after the change.
For example, when 4.12 came out, my target run-time was set to 24 hours. When BOINC downloaded a few too many tasks, I set the target run-time to 12 hours. The tasks that had already started when I changed the setting (even those that had begun only one or two hours earlier) all defiantly ran for 24 hours, while the ones that started after the change ran for around 12 hours.
ID: 95865 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95866 - Posted: 2 May 2020, 21:29:21 UTC - in response to Message 95860.  

In my case, yes, I've got the right venue. Proven when you do eventually see the runtime reflected in tasks that begin after the project update. But it seems if the task was already in memory, and not kicked out if/when suspended, then it still runs with old preference. But if you end and start BOINC, forcing the task out of and back in to memory to start, it then modifies estimated time to completion and correctly adopts the new runtime for the task.
Rosetta Moderator: Mod.Sense
ID: 95866 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 95868 - Posted: 2 May 2020, 21:36:30 UTC - in response to Message 95866.  
Last modified: 2 May 2020, 21:37:46 UTC

"In my case, yes, I've got the right venue. Proven when you do eventually see the runtime reflected in tasks that begin after the project update. But it seems if the task was already in memory, and not kicked out if/when suspended, then it still runs with old preference. But if you end and start BOINC, forcing the task out of and back in to memory to start, it then modifies estimated time to completion and correctly adopts the new runtime for the task."


That may be the case on my main rig. Updating the target time setting didn't seem to affect tasks that had already started. Simply suspending and resuming does not work. I've never tried to restart BOINC and see if that made a difference.
ID: 95868 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org