Questions and Answers : Unix/Linux : Lower Credit/time for Linux than Windows?
Author | Message |
---|---|
Zxian Send message Joined: 17 May 07 Posts: 18 Credit: 1,173,075 RAC: 0 |
If you have a look at my systems, I've got two computers that are pretty much the same - both based on the 965 chipset with E2160 CPUs. The main difference is that the system running Server 2003 has 2GB of DDR2-800 and the other (which runs Fedora Core 7) has 1GB of DDR2-667. (Windows Machine, FC7 Machine) I realize that the RAC of the FC7 computer has yet to reach its maximum, but if you look at the results for the two computers, the Windows system seems to be getting higher credit per unit of time than the Linux system. Has anyone else seen this kind of scenario? Both machines are running 24/7, and I find it hard to believe that the difference in RAM speeds would be the cause (I doubt R@H is pushing 6GB/s of RAM access). |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Your Windows machine has half the cache too, yet it gets about 3 credits per 1,000 seconds, compared to 2 credits per 1,000 seconds on your Linux box. Any idea why their benchmarks are not closer? Are the machines really running at the clock speeds you think they are? I've seen other posts about how Linux handles "nice" tasks, and also that power-saving modes can cause a machine to run at a slower, power-saving CPU speed when it has no higher-priority work to do. This thread discusses the topic; you may want to review it and post there with more questions. Have you been able to determine whether the Linux box is showing 100% CPU utilization for Rosetta? Rosetta Moderator: Mod.Sense |
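[Editor's note: a quick way to check the points raised above on the Linux box - the current clock speed, the cpufreq governor in use, and whether the Rosetta processes are actually getting a full core - is a few standard shell commands. This is a sketch assuming a cpufreq-enabled kernel; the /sys paths may differ slightly between distributions and kernel versions.]

    # Reported clock speed per core (drops below the rated speed when frequency scaling is active)
    grep "cpu MHz" /proc/cpuinfo

    # Current cpufreq governor and frequency for the first core, if cpufreq is enabled
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq

    # One-shot snapshot of CPU usage for the Rosetta processes
    top -b -n 1 | grep -i rosetta

If the governor reads "ondemand" or "powersave" while the science apps run niced, the reported MHz can sit well below the CPU's rated speed even though the processes show 100% utilization.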
Zxian Send message Joined: 17 May 07 Posts: 18 Credit: 1,173,075 RAC: 0 |
The Windows machine has the same amount of cache - it's the exact same CPU. I built both systems in the past couple of months, and used the E2160 for both. Perhaps the Windows machine is indicating the cache "per CPU", even though the L2 cache on the E2160 is shared. I've disabled SpeedStep on both systems, since they're running R@H anyways, and never get a chance to slow down. Therefore, they're both running at 1.8GHz. This is confirmed by CPU-Z on the Windows machine, and from the information stored in /proc/cpuinfo on the Linux machine. Both top and Process Monitor indicate that there are two processes (Rosetta_something) that each use 100% of a core. |
DJStarfox Send message Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
The Windows machine has the same amount of cache - it's the exact same CPU. I built both systems in the past couple of months, and used the E2160 for both. Perhaps the Windows machine is indicating the cache "per CPU", even though the L2 cache on the E2160 is shared. SpeedStep, PowerNow, or CPU frequency scaling could possibly (depending on settings) run the CPU at less than full speed in Linux. Since you just disabled it on both systems, let's see if the numbers change after a few WUs are completed on each machine. I'm curious to know whether there's a discrepancy between the operating systems. |
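[Editor's note: for anyone who wants frequency scaling out of the picture entirely while crunching, the governor can be pinned to "performance" at runtime. This is a sketch only; it assumes the sysfs cpufreq interface is present, needs root, and does not persist across reboots unless made permanent in the distribution's cpuspeed/cpufreq configuration.]

    # Show which governors are available, then force "performance" on every core (run as root)
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
    for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance > "$g"
    done

    # Confirm the cores are now held at their full rated frequency
    grep "cpu MHz" /proc/cpuinfo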
Zxian Send message Joined: 17 May 07 Posts: 18 Credit: 1,173,075 RAC: 0 |
I can say with certainty that the CPU on the Linux machine is indeed working at full capacity. That isn't the issue at hand; it's just strange for me to see the granted credit be this low for this system. On my Windows system, 3 hours of work gives about 35-40 credits on average, while the Linux system is granted 20-30. Does anyone know whether there are significant compiler optimizations present in the Windows executable that aren't in the Linux binary? |
DJStarfox Send message Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
In the same vein, I'd like to know why *every* work unit has more claimed credit than granted credit. I think it's CPU-specific, but I'm not sure. I'm running an AMD Opteron 248 HE. |
Zxian Send message Joined: 17 May 07 Posts: 18 Credit: 1,173,075 RAC: 0 |
That could have to do with the lower measured benchmarks. If they're reading lower than normal, and the WU comes out to an average amount, the system probably thinks that it's done a "good" job, and will report a higher estimated credit (since it did well). |
DJStarfox Send message Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
That could have to do with the lower measured benchmarks. If they're reading lower than normal, and the WU comes out to an average amount, the system probably thinks that it's done a "good" job, and will report a higher estimated credit (since it did well). I'll put my Opterons against any 2.2 GHz system and keep up with the best crunchers in terms of speed. It's mildly disappointing when my benchmarks rank lower than what my CPU is capable of. Even if future BOINC versions are better, I don't want to upgrade BOINC until it's proven to be stable for me; even the version I compiled myself isn't stable when started by the manager. BOINC 5.8.16 32-bit: measured floating point speed 1847.61 million ops/sec, measured integer speed 3032.67 million ops/sec. BOINC 5.10.20 64-bit (self-compiled): measured floating point speed 2137.64 million ops/sec, measured integer speed 5680.08 million ops/sec. Both BOINC versions exhibit the "granted lower credit than claimed" behavior on the same system. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
The credit claim is based upon the machine's benchmarks. The credit granted is based upon the average claims of all the other people working on the same type of task. Since the benchmarks are not designed to represent the type of work Rosetta performs, they do not accurately predict the amount of work the machine will be able to produce. Since your credit claim is higher than the credit granted, it is an indication that the benchmarks predicted you could complete more work than Rosetta actually found to be the case. If you take two machines, one with a benchmark of 1,000 million floating point operations per second and another with a benchmark of 2,000, you might reasonably expect the second machine to get twice as much credit per hour of crunching. But it proves not to be that simple. What if the second machine has less memory? What if its L2 cache is smaller? You see, there's more to getting work done on a machine than just the GHz of the CPU. This is why Rosetta adopted the credit system it has: it grants credit based on the actual work returned to the project. Windows users also tend to find AMD CPUs granted less than claimed, so it is not specific to Linux; in the end, it tends to be due to a smaller L2 cache than similar-GHz CPUs from other vendors. Rosetta Moderator: Mod.Sense |
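[Editor's note: as a rough illustration of why the claim and the grant can diverge, here is a back-of-the-envelope calculation of a benchmark-based claim. The formula and the scale factor (about 100 cobblestones per day for a host averaging 1,000 million ops/sec across the two benchmarks) are assumptions based on the classic BOINC credit scheme of that era, not something stated in this thread, so treat the numbers as indicative only.]

    # Hypothetical example: 3 hours (10,800 s) of CPU time on a host whose benchmarks
    # average 2,440 million ops/sec (the 32-bit Opteron figures quoted above).
    # claimed ~= cpu_seconds * avg_benchmark_gflops * 100 / 86400   (assumed scale factor)
    echo "10800 * 2.44 * 100 / 86400" | bc -l      # roughly 30.5 credits claimed

The grant, by contrast, is set from the average claim of every host that ran the same type of task, so a host that benchmarks well but crunches Rosetta slowly (small L2 cache, slow memory) will routinely claim more than it is granted.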
DJStarfox Send message Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
The credit system seems to work fine; I'm just trying to understand it. I hope the AMD vs. Intel credit differences are small or zero (even when comparing the same L2 cache sizes). If having a big L2 is what helps Rosetta, then that's fine. My system has 64KB/64KB instruction/data L1 cache and 1024KB of L2 cache per CPU. Even the top Intel computers ( https://boinc.bakerlab.org/rosetta/top_hosts.php ) don't have more than 1024KB of L2 cache per core. If AMD sucks compared to Intel, then that's fine by me, although I would have to question the Rosetta application code. Both of the systems linked in the original post are Intel-based computers, so explaining the discrepancy with cache size differences between Intel and AMD doesn't add up. The issue brought up at the beginning of this thread was comparing application performance on Windows vs. Linux, and Zxian and I are still curious about the discrepancy. Rosetta has a reputation for neglecting its Unix application and working a lot more on the Windows one. So, I just wanted to: 1) bring Zxian's findings to the developers' attention, and 2) make sure there's nothing specific about our systems that may be slowing the Rosetta application down. |
DJStarfox Send message Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
If you have a look at my systems, I've got two computers that are pretty much the same - both based on the 965 chipset with E2160 CPUs. The main difference is that the system running Server 2003 has 2GB of DDR2-800 and the other (which runs Fedora Core 7) has 1GB of DDR2-667. Zxian, what speed is the CPU in the FC7 machine? It's not listed in your computer info. Also, since the FC7 machine only has half the memory, have you verified that each machine isn't competing with other tasks it has to do? That can make a big difference in the performance of BOINC apps. |
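[Editor's note: a quick, hedged way to check whether anything else is competing with the BOINC processes on the Linux box, using only standard tools, nothing Rosetta-specific.]

    # Load average and uptime; a load well above the core count suggests contention
    uptime

    # Snapshot of the busiest processes by CPU; anything non-BOINC near the top is competition
    top -b -n 1 | head -n 20

    # Memory pressure: heavy swap activity on a 1GB box will noticeably slow the science app
    free -m
    vmstat 1 5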
Zxian Send message Joined: 17 May 07 Posts: 18 Credit: 1,173,075 RAC: 0 |
The CPU speed of the FC7 machine is at 1.8GHz now. I was digging through various configuration files and found out that despite everything being "as it should", the system was still only sitting at 1.2GHz when Rosetta was running. The GNOME CPU frequency monitor showed one speed, while /proc/cpuinfo showed another - very strange. It's running at 1.8GHz now, so we'll see if the credit scores go up accordingly. *crosses fingers* |
DJStarfox Send message Joined: 19 Jul 07 Posts: 145 Credit: 1,250,162 RAC: 0 |
The CPU speed of the FC7 machine is at 1.8GHz now. I was digging through various configuration files and found out that despite everything being "as it should", the system was still only sitting at 1.2GHz when Rosetta was running. The GNOME CPU frequency monitor showed one speed, while /proc/cpuinfo showed another - very strange. Shoot! You may not even have an issue! Make sure your /etc/sysconfig/cpuspeed file has the line: IGNORE_NICE=0. I'll check back in a few days to see if that helped you. |
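[Editor's note: for reference, the relevant part of /etc/sysconfig/cpuspeed on a Fedora Core 7 era system looks roughly like the sketch below. IGNORE_NICE=0 is the setting quoted above; the GOVERNOR line and the service restart command are assumptions about that config's other common options and may differ between cpuspeed versions.]

    # /etc/sysconfig/cpuspeed (excerpt, hypothetical values)
    # Count niced processes (such as BOINC science apps) as load, so the CPU scales up for them
    IGNORE_NICE=0
    # Optionally pin the governor so the CPU never scales down while crunching
    GOVERNOR=performance

    # Apply the change
    service cpuspeed restart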
Zxian Send message Joined: 17 May 07 Posts: 18 Credit: 1,173,075 RAC: 0 |
Yup - that line was there. I had checked that last week when I first realized this problem. |
Akshay Naik Send message Joined: 15 Dec 07 Posts: 1 Credit: 511 RAC: 0 |
If you have a look at my systems, I've got two computers that are pretty much the same - both based on the 965 chipset with E2160 CPUs. The main difference is that the system running Server 2003 has 2GB of DDR2-800 and the other (which runs Fedora Core 7) has 1GB of DDR2-667. It's the same with SETI@home. I pasted the following from W.C.G.: Points are calculated in a two-step process which attempts to give a consistent number of points for similar amounts of research computation. First, the computational power/speed of the computer is determined by periodically running a benchmark calculation. Then, based on the central processing unit (CPU) time spent computing the research result for a work unit, the benchmark result is used to convert the time spent on a work unit into points. This adjusts the point value so that a slow computer or a fast computer would produce about the same number of points for calculating the research result for the same work unit. This value is the number of point credits "claimed" by the client. |