I was using numactl with the --physcpubind option. The manual says:
--physcpubind=cpus, -C cpus
Only execute process on cpus. Etc...
Let's say I have a NUMA system with 3 NUMA nodes, each with 4 cores. NUMA node 0 has cores 0, 1, 2, 3; NUMA node 1 has 4, 5, 6, 7; and NUMA node 2 has 8, 9, 10, 11. Suppose I run the program as follows:
export OMP_NUM_THREADS=6
numactl --physcpubind=0,1,4,5,8,9 ./program
I.e. I run my program with 6 threads and request that they be placed on CPU cores 0,1,4,5,8,9. Suppose that at some point during the run threads 0-5 are on cores 0,1,4,5,8,9 respectively (setup1). Is it possible that at some other point thread 0 is running on core 9, for example, and so forth? In other words, can threads migrate between the listed CPU cores, or does each thread get bound to one core (as in setup1)? Thanks.
The --physcpubind option of numactl should be an interface to the sched_setaffinity system call, which sets the cpuset of the process (the set of CPUs it is allowed to run on, i.e. its affinity mask) at the moment the process starts. Each thread has its own cpuset, but all threads inherit their cpuset from the parent process.
So threads are allowed to run on any CPU in the cpuset, and migration is allowed between any CPUs in the cpuset. In your example, thread 0 may well end up on core 9 at some point.
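To see the inheritance for yourself, here is a minimal sketch (Python 3 on Linux, which exposes the same system call as os.sched_getaffinity). The function names thread_affinity and affinities_of_new_threads are mine, just for illustration. Run it under numactl --physcpubind=0,1,4,5,8,9 and every thread should report the full list, not a single core.

```python
import os
import threading

def thread_affinity():
    """Return the set of CPUs the calling thread is allowed to run on."""
    # pid 0 means "the calling thread" for the underlying sched_getaffinity call.
    return os.sched_getaffinity(0)

def affinities_of_new_threads(n_threads):
    """Start n_threads threads and collect the affinity mask each one sees."""
    seen = [None] * n_threads

    def worker(i):
        seen[i] = thread_affinity()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen

if __name__ == "__main__":
    parent = thread_affinity()
    print("parent mask:", sorted(parent))
    for i, mask in enumerate(affinities_of_new_threads(6)):
        note = "(same as parent)" if mask == parent else "(differs from parent)"
        print(f"thread {i} mask:", sorted(mask), note)
```

Since every thread holds the whole mask, the scheduler is free to move any of them between the listed cores.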
Any thread can call sched_setaffinity or pthread_setaffinity_np (the Linux-specific pthreads variant that changes the affinity of a single thread) to narrow or even expand its cpuset.
If you want to bind threads to CPUs, call sched_setaffinity or pthread_setaffinity_np directly in every thread, or, in the case of OpenMP, set the affinity through the OpenMP library (see "OpenMP and CPU affinity"), e.g. with this command (OpenMP 3.1+):
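As a rough illustration of narrowing the cpuset per thread, here is a sketch in Python 3 on Linux, where os.sched_setaffinity(0, ...) wraps the same system call and acts on the calling thread only. The helper name pin_threads_one_per_cpu and the "thread i takes the i-th allowed CPU" policy are my own assumptions, not something numactl or OpenMP does for you.

```python
import os
import threading

def pin_threads_one_per_cpu(n_threads):
    """Start n_threads threads, pin each to one CPU, and return
    (allowed CPUs of the creating thread, mask of each worker after pinning)."""
    allowed = sorted(os.sched_getaffinity(0))  # cpuset inherited from the process
    masks = [None] * n_threads

    def worker(i):
        cpu = allowed[i % len(allowed)]        # wrap around if threads > CPUs
        # pid 0 = calling thread: only this thread's mask is narrowed.
        os.sched_setaffinity(0, {cpu})
        masks[i] = os.sched_getaffinity(0)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return allowed, masks

if __name__ == "__main__":
    allowed, masks = pin_threads_one_per_cpu(6)
    print("allowed CPUs:", allowed)
    for i, mask in enumerate(masks):
        print(f"thread {i} pinned to:", sorted(mask))
```

After pinning, each worker's mask holds a single CPU, so that thread can no longer migrate. The creating thread keeps its original mask.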
export OMP_PROC_BIND=true
I guess that the OpenMP library will select CPUs in a round-robin manner from the cpuset of the process as it is at the time the library is initialized.
For older versions of libgomp (the OpenMP support library used by GCC) you can pass the allowed set of CPUs with the command:
export GOMP_CPU_AFFINITY=0-1,4-5,8-9
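To make the list syntax concrete, here is a small Python model of how I understand the libgomp documentation for GOMP_CPU_AFFINITY: entries are single CPUs, ranges M-N, or strided ranges M-N:S, thread i is bound to the i-th CPU in the list, and assignment starts over from the beginning once the list is exhausted. This is only a sketch of the documented behaviour, not libgomp's code, and parse_cpu_list and placement are names I made up.

```python
import re

def parse_cpu_list(spec):
    """Expand a GOMP_CPU_AFFINITY-style string into an ordered list of CPU numbers."""
    cpus = []
    for entry in re.split(r"[,\s]+", spec.strip()):
        if not entry:
            continue
        rng, _, stride = entry.partition(":")   # optional ":S" stride
        first, _, last = rng.partition("-")     # optional "-N" upper bound
        start = int(first)
        stop = int(last) if last else start
        step = int(stride) if stride else 1
        cpus.extend(range(start, stop + 1, step))
    return cpus

def placement(spec, n_threads):
    """CPU for each thread index, wrapping around when threads outnumber CPUs."""
    cpus = parse_cpu_list(spec)
    return [cpus[i % len(cpus)] for i in range(n_threads)]

if __name__ == "__main__":
    print(placement("0-1,4-5,8-9", 6))
    print(placement("0-1,4-5,8-9", 8))
```

With the six CPUs from the question and OMP_NUM_THREADS=6, this model gives each thread its own core, which is the "setup1" layout.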
PS: to check your threads' placement, start top, enable the "Last CPU used" field with the f and j keys, and turn on thread display with H.
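If you would rather check placement from a script, the same "last CPU" value is available as field 39 (processor) of /proc/<pid>/task/<tid>/stat on Linux. A minimal Python 3 sketch, with last_cpu_per_thread being a helper name of my own:

```python
import os

def last_cpu_per_thread(pid="self"):
    """Map thread id -> CPU the thread last ran on, read from /proc (Linux only)."""
    placement = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # the thread exited while we were scanning
        # The command name (field 2) is in parentheses and may contain spaces,
        # so split after the last ')'. The remainder starts at field 3, which
        # puts field 39 (processor) at index 36.
        fields = stat.rsplit(")", 1)[1].split()
        placement[int(tid)] = int(fields[36])
    return placement

if __name__ == "__main__":
    for tid, cpu in sorted(last_cpu_per_thread().items()):
        print(f"thread {tid} last ran on CPU {cpu}")
```

Pass the PID of your OpenMP program instead of "self" and call it repeatedly. If the reported CPU for a thread changes between calls, that thread has migrated.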