Reputation: 6464
I have a system running Linux kernel 4.19.71 with an Intel Xeon Platinum 8160 CPU. It has 24 physical cores and, with 2 threads per core, 48 logical cores. I'm experimenting with virtualization (qemu and kvm) and would like to isolate a set of cores from the OS and the hypervisor, so that those cores run exclusively application code. So I added the isolcpus= kernel boot parameter:
isolcpus=1-23,25-47
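For reference, this is roughly how I made the parameter persistent, assuming a GRUB-based setup (the file and the regeneration command differ between distributions):
# editor /etc/default/grub    # append isolcpus=1-23,25-47 to GRUB_CMDLINE_LINUX
# update-grub                 # regenerate grub.cfg (grub2-mkconfig -o ... on some distros)
# reboot
After rebooting, cat /proc/cmdline shows the active parameters.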
However, I'm still seeing that some kernel threads are scheduled on the cores I'm isolating, e.g.:
# ps -A -L -o pid,nlwp,tid,c,psr,comm |sort -n -k 5 | grep 27
148 1 148 0 27 kworker/27:0-mm_percpu_wq
149 1 149 0 27 kworker/27:0H-events_highpri
267 1 267 0 27 kworker/27:1-mm_percpu_wq
799 1 799 0 27 kworker/27:1H-events_highpri
...
#
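As a quick sanity check (a sketch, using the kworker PID 148 from the output above), a single thread's allowed CPUs can also be inspected directly:
# taskset -pc 148
# grep Cpus_allowed_list /proc/148/status
Both report the thread's CPU affinity; for the per-CPU kworkers above they should show the thread pinned to CPU 27.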
The 5th column is the processor (core) id, in this case 27, which according to the isolcpus= setting above should not be disturbed by the kernel, yet kworker threads are running there.
Does this mean there are exceptions and the kernel is still allowed to schedule tasks on the isolated cores, or am I missing something obvious?
Thanks.
Upvotes: 6
Views: 1432
Reputation: 17
I am also working on this issue and I haven't figured out a way to prevent those kernel threads from being scheduled on the isolated CPU set.
Judging from the Red Hat documentation, it doesn't seem to be feasible either.
Isolating CPUs
You can isolate one or more CPUs from the scheduler with the isolcpus boot parameter. This prevents the scheduler from scheduling any user-space threads on this CPU.
I have been using a combination of isolcpus and cset shield in order to prevent the majority of the kernel's housekeeping threads from being scheduled on my isolated CPUs.
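As a rough sketch of the invocation (CPUs 9-11 are the ones shielded in the experiments below, and my_workload is just a placeholder):
$ sudo cset shield --cpu=9-11 --kthread=on   # create the shield and move movable kernel threads out of it
$ sudo cset shield --exec my_workload        # run the workload inside the shield
$ sudo cset shield                           # with no arguments, prints the current shield state
As far as I can tell, --kthread=on only moves the kernel threads that are not pinned to a specific CPU; the per-CPU ones (kworker/N:*, ksoftirqd/N, migration/N) stay where they are, which is what shows up in the traces below.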
I have used perf sched in order to record the context switches on my CPUs and perf sched map in order to visualize them.
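Roughly like this (a sketch; I believe -C restricts the recording to the shielded CPUs, and the sleep duration is arbitrary):
$ sudo perf sched record -C 9-11 -- sleep 30   # record scheduler events on CPUs 9-11 for 30 s
$ sudo perf sched map > exp_1.sch              # render the per-CPU context-switch map to a file
The grep '=>' below just pulls out the lines where perf sched map assigns a new shortname to a task.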
In the first experiment, I used only cset shield:
$ grep -e '=>' exp_1.sch
*A0 445210.783227 secs A0 => kworker/11:1-ev:165
*. 445210.783275 secs . => swapper:0
*B0 445210.783304 secs B0 => kworker/u24:4-e:130904
*C0 445210.783420 secs C0 => WORKER2:160974
. *D0 C0 445210.783844 secs D0 => kworker/10:0-ev:1672
*E0 . C0 445210.784703 secs E0 => WORKER0:160969
*F0 . C0 445210.789628 secs F0 => kworker/9:1-eve:163
E0 *G0 . 445210.802886 secs G0 => WORKER1:160973
E0 *H0 . 445210.811638 secs H0 => ksoftirqd/10:76
E0 *I0 . 445210.939469 secs I0 => kworker/u24:2-e:158157
*J0 G0 . 445211.527639 secs J0 => ksoftirqd/9:70
E0 G0 *K0 445212.087622 secs K0 => ksoftirqd/11:82
E0 *L0 . 445212.347277 secs L0 => kworker/10:1H-k:277
*M0 I0 C0 445213.321971 secs M0 => kworker/u24:1-e:160121
E0 *N0 . 445214.463593 secs N0 => migration/10:75
*O0 N0 . 445214.463597 secs O0 => migration/9:69
O0 N0 *P0 445214.463598 secs P0 => migration/11:81
*Q0 G0 M0 445225.372366 secs Q0 => kworker/9:1H-kb:330
Here you may see my workload threads (WORKER{0,1,2}), the kworker threads (kworker/{9,10,11}) corresponding to CPUs 9-11, and the rest: ksoftirqd/{9,10,11}, migration/{9,10,11}, kworker/u24 and the "idle" thread swapper.
In the second experiment, I used cset shield together with isolcpus:
$ grep -e '=>' exp_2.sch
*A0 1033.342241 secs A0 => WORKER0:3646
A0 *B0 1033.342675 secs B0 => kworker/11:1-ev:165
A0 *. 1033.342694 secs . => swapper:0
A0 *C0 . 1033.343470 secs C0 => WORKER1:3647
A0 C0 *D0 1033.344634 secs D0 => WORKER2:3648
A0 *E0 D0 1033.346306 secs E0 => kworker/10:1-ev:164
*F0 . D0 1033.364736 secs F0 => kworker/9:1-eve:163
A0 *G0 . 1036.433541 secs G0 => migration/10:75
*H0 G0 . 1036.433541 secs H0 => migration/9:69
A0 G0 *I0 1036.433548 secs I0 => migration/11:81
In this case, you see only the WORKER{0,1,2}, kworker/{9,10,11}, migration/{9,10,11} and the swapper tasks.
Upvotes: 1