Limited thread usage under Linux?

Message boards : Number crunching : Limited thread usage under Linux?


Kevin N. Carpenter

Joined: 6 Apr 20
Posts: 6
Credit: 6,614,362
RAC: 0
Message 93748 - Posted: 7 Apr 2020, 18:39:46 UTC

Hi -

I've recently brought two dual-Xeon Linux-based servers online for Rosetta@home. One supports 56 threads (28 cores, hyperthreaded), the other supports 32 threads (16 cores, hyperthreaded). Both machines have 192GB or more of memory.

I'm getting work on both machines, but not seeing more than 6 threads running on either. Sometimes fewer.

One Intel i5 (quad core) is only running with 2 threads.

My Xeon single-chip quad core is running full out with 4 threads.

Am I doing something wrong?
Mike Davis

Joined: 24 Mar 20
Posts: 2
Credit: 56,119
RAC: 0
Message 93753 - Posted: 7 Apr 2020, 19:09:01 UTC - in response to Message 93748.  
Last modified: 7 Apr 2020, 19:09:31 UTC

Do you have your BOINC computing preferences (in BOINC Manager) set to use only 50% of the CPUs on those systems, but 100% on the Xeon quad core? I assume it's the same deal under Linux, anyway.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 93756 - Posted: 7 Apr 2020, 19:25:56 UTC - in response to Message 93753.  

Also, do you have "Hyper-Threading Technology" enabled in the BIOS?

And maybe there are limits on the server, though I can run 16 thus far.
https://boinc.berkeley.edu/forum_thread.php?id=12877#90743
Kevin N. Carpenter

Joined: 6 Apr 20
Posts: 6
Credit: 6,614,362
RAC: 0
Message 93762 - Posted: 7 Apr 2020, 20:03:24 UTC - in response to Message 93756.  
Last modified: 7 Apr 2020, 20:17:32 UTC

I will triple check local settings.

Hyperthreading is enabled. But even if it weren't, I should still be getting 28 and 16 threads, not 6.

Also worth noting: my Windows box is running full out at 24 threads, so I don't think it's my project settings. That said, I just updated them to:

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 93765 - Posted: 7 Apr 2020, 20:09:55 UTC - in response to Message 93748.  

I would also suggest that you examine how much disk BOINC is allowed to use. Something like 1GB per task plus 10GB might be a good starting point for disk space. (can't wait to see your RAC when you get 'em humming!)
Rosetta Moderator: Mod.Sense
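As a rough illustration of that rule of thumb, the sketch below applies "1GB per task plus 10GB" to the two thread counts mentioned in this thread. The function name and its parameters are hypothetical, and the figures are only a starting point, not a measured requirement.

```python
def suggested_disk_gb(tasks, per_task_gb=1.0, base_gb=10.0):
    """Rule-of-thumb disk allowance: a fixed base plus a per-task amount."""
    return tasks * per_task_gb + base_gb

# The two servers described in this thread run 56 and 32 threads.
for threads in (56, 32):
    print(f"{threads} threads: about {suggested_disk_gb(threads):.0f} GB")
```

By this estimate both servers would want well over the 16GB that a small /var partition can offer.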
Bryn Mawr

Joined: 26 Dec 18
Posts: 392
Credit: 12,098,140
RAC: 5,589
Message 93770 - Posted: 7 Apr 2020, 20:19:25 UTC

You say that you’re receiving work. Does that imply that you have WUs ready to run that are not starting, or are you running all of the WUs that you’ve received?
entity

Joined: 8 May 18
Posts: 19
Credit: 5,935,755
RAC: 8,074
Message 93778 - Posted: 7 Apr 2020, 21:22:17 UTC - in response to Message 93770.  

I would stop and restart BOINC, then look at the log to see how many CPUs BOINC detected, and also look for any messages explaining why it is being limited. I would also check whether any local preferences are overriding the global preferences from the website.
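One way to check for local overrides is to read them out of the client's override file. The sketch below is only an illustration: it assumes the overrides are stored as XML in a file such as global_prefs_override.xml in the BOINC data directory, and that the tag names shown are the ones in use. The sample XML is made up for this example, so compare it against your own file before relying on it.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the tag names are assumptions about the override file's layout.
SAMPLE_OVERRIDE = """<global_preferences>
   <max_ncpus_pct>50.000000</max_ncpus_pct>
   <disk_max_used_gb>100.000000</disk_max_used_gb>
   <disk_max_used_pct>50.000000</disk_max_used_pct>
</global_preferences>"""

# Settings that commonly limit how many tasks can run at once.
KEYS = ("max_ncpus_pct", "disk_max_used_gb", "disk_max_used_pct", "disk_min_free_gb")

def read_overrides(xml_text, keys=KEYS):
    """Return the numeric override values that are present in the XML text."""
    root = ET.fromstring(xml_text)
    found = {}
    for key in keys:
        node = root.find(key)
        if node is not None and node.text:
            found[key] = float(node.text)
    return found

for name, value in read_overrides(SAMPLE_OVERRIDE).items():
    print(f"{name} = {value}")
```

To inspect a real file, read its contents into a string and pass that to read_overrides instead of the sample.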
Kevin N. Carpenter

Joined: 6 Apr 20
Posts: 6
Credit: 6,614,362
RAC: 0
Message 93780 - Posted: 7 Apr 2020, 21:26:39 UTC - in response to Message 93765.  

I would also suggest that you examine how much disk BOINC is allowed to use. Something like 1GB per task plus 10GB might be a good starting point for disk space. (can't wait to see your RAC when you get 'em humming!)


Set to use 100GB max, so should be good. Ah! Wait a minute... on Linux BOINC defaults to running in /var. My /var partitions are typically only about 16GB. Moving to a much larger filesystem to see if that helps.
Kevin N. Carpenter

Joined: 6 Apr 20
Posts: 6
Credit: 6,614,362
RAC: 0
Message 93784 - Posted: 7 Apr 2020, 22:02:39 UTC - in response to Message 93780.  

** SOLVED **

I didn't realize how much disk space Rosetta required - most BOINC applications are pretty light.

Once I moved the BOINC run time directory off of /var/lib/boinc to a larger drive, all processes started up. The kicker was the "use at most 50%" of disk space setting. On a typical 16GB /var partition, that limited it to about 8GB of usage - minus what was already in use. Turns out that was the limitation.
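The arithmetic behind that limit can be sketched as follows. This is a simplified model, assuming the client uses the smallest of the absolute cap, the percentage cap, and the space actually free. The function name, its parameters, and the free-space figures are hypothetical, and the real BOINC client may apply further rules.

```python
def boinc_disk_budget_gb(partition_gb, free_gb, max_used_gb=100.0, max_used_pct=50.0):
    """Rough disk budget: the tightest of the three limits wins (simplified model)."""
    pct_cap_gb = partition_gb * max_used_pct / 100.0
    return max(0.0, min(max_used_gb, pct_cap_gb, free_gb))

# A 16 GB /var with (hypothetically) 12 GB free: the 50% rule caps BOINC
# at 8 GB, even though the preferences allow up to 100 GB.
small_var = boinc_disk_budget_gb(partition_gb=16, free_gb=12)

# A dedicated 400 GB volume (hypothetical size) with 390 GB free:
# here the 100 GB absolute cap becomes the binding limit.
dedicated = boinc_disk_budget_gb(partition_gb=400, free_gb=390)

print(small_var, dedicated)
```

With roughly 1GB needed per task, a budget of about 8GB is consistent with only a handful of tasks starting.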

This is truly a heartwarming sight: [image] and [image]

Thanks for helping track this down!

P.S. On the 32-thread box, BOINC is using about 29GB of disk space (easy to tell, since I gave it a dedicated LVM partition). The drive (a mediocre 5400 RPM 4TB Seagate) seems to be handling the I/O load just fine:



cpuserv / # iostat -hm /dev/vg/boinc
Linux 5.4.28-gentoo (cpuserv)   04/07/20        _x86_64_        (32 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.6%   24.8%    1.4%    0.8%    0.0%   72.3%

      tps    MB_read/s    MB_wrtn/s    MB_dscd/s    MB_read    MB_wrtn    MB_dscd Device
    22.60         0.6k         2.4M         0.0k       8.3M      33.2G       0.0k dm-14

Kevin N. Carpenter

Joined: 6 Apr 20
Posts: 6
Credit: 6,614,362
RAC: 0
Message 93885 - Posted: 8 Apr 2020, 17:18:49 UTC - in response to Message 93784.  

Just one follow-up for those running dozens of threads:

The disk subsystem does matter. On my 32-thread box I routinely saw a half-dozen or more threads NOT at 100%. I moved the BOINC work directory from a single drive to a six-member mdadm RAID 6 array and that number dropped to 2 threads, and those threads were running at around 98% vs. the 86-95% previously seen.
Tom M

Joined: 20 Jun 17
Posts: 87
Credit: 15,096,189
RAC: 46,569
Message 94062 - Posted: 10 Apr 2020, 11:14:40 UTC - in response to Message 93784.  

+1
Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel.....




©2024 University of Washington
https://www.bakerlab.org