Message boards : Number crunching : CPU App Performance
Author | Message |
---|---|
PappaLitto Joined: 14 Nov 17 Posts: 17 Credit: 28,135,915 RAC: 1,451 |
Hello, I was wondering whether the same apps (Rosetta 4.07 and Rosetta Mini) perform better on Linux than on Windows. Was the app written for Linux and somehow ported to Windows, or does it use a hypervisor and VM to run on Windows? Or is the application actually written to run on both Linux and Windows? |
Paul Joined: 29 Oct 05 Posts: 193 Credit: 66,348,082 RAC: 8,353 |
I have Linux, Windows and Macs running the client. All of them work great. The Linux client might be a bit more efficient. I have never benchmarked two identical computers long term so I can't provide a true comparison. Hope you enjoy crunching for this project. Thx! Paul |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I run Rosetta on two dedicated i7-3770 machines that are identical in hardware, and the cores are fully utilized. The 12 hour work units run on a Win7 64-bit machine: https://boinc.bakerlab.org/results.php?hostid=3381276 The 24 hour work units run on an Ubuntu 16.04 machine: https://boinc.bakerlab.org/results.php?hostid=3285911 Since the Ubuntu machine has more than twice the credits per work unit of the Windows machine, it seems that Ubuntu is somewhat more efficient (to the extent that BOINC credits can be trusted). But the companion projects may have something to do with it also. The Win7 machine also runs CPDN, while the Ubuntu machine also runs GPUGrid/QC. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Since the Ubuntu machine has more than twice the credits per work unit of the Windows machine, it seems that Ubuntu is somewhat more efficient (to the extent that BOINC credits can be trusted)." Actually, the Windows work units have been more consistent than the Ubuntu ones. The Windows average about 294 PPD, while the Ubuntu ones are 492 PPD for the last 14 work units of each. Maybe that is just BOINC credits jumping around, but Windows looks relatively good. I will increase the Windows run time to 24 hours for further comparisons. |
PappaLitto Joined: 14 Nov 17 Posts: 17 Credit: 28,135,915 RAC: 1,451 |
"Since the Ubuntu machine has more than twice the credits per work unit of the Windows machine, it seems that Ubuntu is somewhat more efficient (to the extent that BOINC credits can be trusted)." This is comparing a 12 hour Work Unit with a 24 hour one. I would hope it would be double the credit. Also, I noticed that credit seems to have no direct correlation with CPU time. Work Units (WUs) with similar credit have vastly different CPU times. I did a personal study of Windows vs Linux but came to no specific conclusion, as it looked like the Windows WUs were only slightly slower on average. "Actually, the Windows work units have been more consistent than the Ubuntu ones. The Windows average about 294 PPD, while the Ubuntu ones are 492 PPD for the last 14 work units of each." What do you mean by more consistent? Are there fewer fluctuations in credit? Is the CPU time the same on Windows and Linux when you're comparing the credit? |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Since the Ubuntu machine has more than twice the credits per work unit of the Windows machine, it seems that Ubuntu is somewhat more efficient (to the extent that BOINC credits can be trusted)." "More consistent" means less jumping around. The initial comparison was done with 12 hour times for the Windows machine, and 24 hours for the Ubuntu machine. So ideally the Ubuntu machine should get twice the credits. Since it got MORE than twice the credits, it would seem to be more efficient. But it is probably better to use the same 24 hour times for both, so I increased it for the Windows machine. I think my later results also show there is not much difference between Windows and Linux; it would take a lot of data to really see the difference. So I am no longer doing that comparison. But I have found that more significant effects relate to work unit size, other projects being run at the same time as Rosetta, and number of CPU cores used. https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12544 Those effects seem to be generally the same for Windows as for Linux. |
rjs5 Joined: 22 Nov 10 Posts: 273 Credit: 23,028,238 RAC: 7,099 |
"Since the Ubuntu machine has more than twice the credits per work unit of the Windows machine, it seems that Ubuntu is somewhat more efficient (to the extent that BOINC credits can be trusted)." Rosetta researchers chose one of the Rosetta models to investigate. The different models have different execution characteristics. That affects the CREDITS that Rosetta computes. If you have not reached the TARGET CPU TIME when you complete a DECOY, then Rosetta will run another loop. That affects the CREDITS awarded. If you are running other PROJECTS, then the interaction will affect the number of DECOYS you can complete in the TARGET CPU TIME, which in turn messes with CREDITS. Different projects affect Rosetta differently. IMO, Rosetta has a big code footprint and is sensitive to the ICACHE size, DCACHE size and memory speed. It is VERY! hard to get consistent Rosetta results for comparison. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"It is VERY! hard to get consistent Rosetta results for comparison." No doubt, for the reasons you mention. But the big hits in credits seem to correlate pretty well with "large" memory footprints, which means cache I think. I just wonder whether the developers take that into account? Who knows what type of Xeon chip with how much cache they were using. And even if it worked then, as the models grow in size, they may fall over an unanticipated ledge. I am not sure anyone is looking into that at the Rosetta level. Even we crunchers don't necessarily know about it, unless we stumble across it. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
However, I just got a large memory work unit that had normal output, and also a low output work unit with normal memory size. So the results are decidedly mixed. I am freeing up another core (running on only 5) to see if it helps the consistency any more. The cache is shared among not only BOINC but other desktop programs, such as a couple of daily backup programs, etc. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The credit system does somewhat reward the fact that some types of work take more resources to complete. The credit award is based on the historical credit claims for a given series of work units. So, presuming other machines are also encountering some degree of memory contention, these historical credit claims will be higher per model than other work unit types and series. This makes it very hard to get consistent credit results, but creates a credit system that automatically reflects the changing impacts of the various types of work. Rosetta Moderator: Mod.Sense |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Thanks, I figured it was somehow dependent on various factors that we don't necessarily see ourselves. But my guess (?) is that the big drops, down to around 160 points, are due to local factors on my machine. But it is still a test, so we will see. |
PappaLitto Joined: 14 Nov 17 Posts: 17 Credit: 28,135,915 RAC: 1,451 |
"The credit system does somewhat reward the fact that some types of work take more resources to complete. The credit award is based on the historical credit claims for a given series of work units. So, presuming other machines are also encountering some degree of memory contention, these historical credit claims will be higher per model than other work unit types and series. This makes it very hard to get consistent credit results, but creates a credit system that automatically reflects the changing impacts of the various types of work." Is there anything we as volunteers can optimize to increase throughput and performance? Large memory resources are being mentioned. Does this mean increasing RAM speed helps Rosetta performance? Does most of the data sit in L3 cache? Is it optimal to have only Rosetta running, or is there not much hit to have a lower memory intensive project running at the same time? |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
My opinion is in line with your statement: "...not much hit to have a lower memory intensive project running at the same time." There is no benefit to running only R@h work, beyond more CPU hours invested in Rosetta as compared to other projects. If you have available memory on the machine but your settings do not presently allow BOINC Manager to use it, I would revise the BOINC settings to allow use of more of that memory. The default BOINC settings are fairly conservative, assuming that you want things to run without impacting other use of the machine (and assuming that there are other users of the machine). If you know that other activities the machine has to do will still perform adequately, then allowing BOINC to use more of the memory of the machine may help efficiency a bit. Rosetta Moderator: Mod.Sense |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"If you have available memory on the machine but your settings do not presently allow BOINC Manager to use it, I would revise the BOINC settings to allow use of more of that memory." Certainly, but if you run out of main memory, I think you will have big problems. I have seen cases where BOINC has refused to let work units run at all because of a shortage of memory. In the case I am looking at here, I have plenty (24 GB) of main memory, and BOINC is allowed to use 75% while the computer is in use, or 90% while it is idle. That will do for Rosetta, and probably anything else that I know of these days. However, the slowdowns I have seen are (I suspect) due to the work units not fitting entirely into cache memory. In the case of the i7-4771, that is 8 MB, shared among 8 virtual cores. Then, if they don't fit, the work has to run out of main memory, which is much slower than the on-chip cache. That is what I suspect the problem to be, but can't really prove it. In fact, I was still seeing some slowdowns when running Rosetta on only 5 cores, so I have bitten the bullet and turned off hyper-threading entirely. I will run Rosetta on only 3 real cores for a while, to see if it makes any difference. That is probably not the optimum way to run in order to maximize output, but it is only a test at this point. |
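
Two back-of-envelope checks come up repeatedly in this thread: comparing credit across work units with different run times, and estimating how much shared cache each running task might get. The Python sketch below is one rough way to do both. The function names and the sample work-unit numbers are hypothetical and are not taken from the hosts linked above. The even split of the 8 MB cache is a deliberate simplification, since a real shared cache is not divided evenly among tasks.

```python
from statistics import mean, pstdev


def credit_per_cpu_hour(credit, cpu_seconds):
    """Credit normalized by CPU time, so 12 h and 24 h work units compare fairly."""
    return credit / (cpu_seconds / 3600.0)


def summarize(work_units):
    """Return (mean, relative spread) of credit per CPU-hour for a list of work units.

    Relative spread is the population standard deviation divided by the mean,
    a simple stand-in for "how much the credit jumps around".
    """
    rates = [credit_per_cpu_hour(credit, secs) for credit, secs in work_units]
    avg = mean(rates)
    return avg, pstdev(rates) / avg


def cache_per_task_mb(shared_cache_mb, running_tasks):
    """Naive even split of a shared cache among running tasks (an assumption)."""
    return shared_cache_mb / running_tasks


# Hypothetical (credit, CPU seconds) pairs, NOT real results from the thread.
windows_wus = [(150.0, 12 * 3600), (140.0, 12 * 3600), (160.0, 12 * 3600)]
ubuntu_wus = [(320.0, 24 * 3600), (180.0, 24 * 3600), (400.0, 24 * 3600)]

for name, wus in (("Windows", windows_wus), ("Ubuntu", ubuntu_wus)):
    avg, spread = summarize(wus)
    print(f"{name}: {avg:.2f} credit per CPU-hour, relative spread {spread:.2f}")

# The 8 MB figure comes from the thread; 8, 5 and 3 are the task counts mentioned.
for tasks in (8, 5, 3):
    print(f"{tasks} tasks: about {cache_per_task_mb(8.0, tasks):.2f} MB of cache each")
```

Dividing by CPU time rather than counting work units removes the 12 hour versus 24 hour mismatch raised earlier, and the relative spread gives a number to attach to "more consistent". Neither calculation accounts for the way credit depends on the work unit series, so a small sample can still mislead.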
©2024 University of Washington
https://www.bakerlab.org