wangt13
wangt13

Reputation: 1215

Ratio of sched_rt_runtime_us to sched_rt_period_us is NOT shown in 'top' in Linux

I am checking the impact of Linux's sched_rt_runtime_us.
My understanding of the Linux RT scheduling is sched_rt_period_us defines scheduling period of RT process, and sched_rt_runtime_us defines how much the RT process can run within that period.

In my Linux-4.18.20, the kernel.sched_rt_period_us = 1000000, kernel.sched_rt_runtime_us = 950000, so in each second, 95% time is used by RT process, 5% is for SCHED_OTHER processes.

By changing the kernel.sched_rt_runtime_us, the CPU usage of RT process shown in top should be proportional with sched_rt_runtime_us/sched_rt_period_us.
But my testing does NOT get the expected results, and what I got is as follows,

                                               %CPU
kernel.sched_rt_runtime_us = 50000  
2564 root      rt   0    4516    748    684 R  19.9  0.0   0:37.82 testsched_top  
kernel.sched_rt_runtime_us = 100000  
2564 root      rt   0    4516    748    684 R  40.5  0.0   0:23.16 testsched_top  
kernel.sched_rt_runtime_us = 150000  
2564 root      rt   0    4516    748    684 R  60.1  0.0   0:53.29 testsched_top
kernel.sched_rt_runtime_us = 200000  
2564 root      rt   0    4516    748    684 R  80.1  0.0   1:24.96 testsched_top

The testsched_top is a SCHED_FIFO process with priority 99, and it is running in an isolated CPU.
The cgroup is configured in grub.cfg as cgroup_disable=cpuset,cpu,cpuacct to disable CPU related stuff.

I don't know why this happens, is there anything missing or wrong in my testing and understanding of Linux SCHED_FIFO scheduling?

N.B.: I am running this in Ubuntu VM, which is configured with 8 vCPUs, in which 4-7 are isolated to run RT processes. The host is Intel X86_64 with 6Cores (12 Threads), and there is NO other VMs running in the host. The above testsched_top was copied from https://viviendolared.blogspot.com/2017/03/death-by-real-time-scheduling.html?m=0, it sets priority 99 for SCHED_FIFO and loops indefinitely in one isolated CPU. I checked that isolated CPU usage, and got above results. –

Upvotes: 0

Views: 2144

Answers (1)

wangt13
wangt13

Reputation: 1215

I think I got the answer, and thank Rachid for the question.

In short, the kernel.sched_rt_period_us is the sum of RT time slice in a group of CPUs.
For example, in my 8vCPU VM configuration, CPU4-7 are isolated for running specific processes. So the kernel.sched_rt_period_us should be evenly divided among these 4 isolated CPUs, which means kernel.sched_rt_period_us/4 = 250000 is 100% CPU quota for each CPU in the isolated group. Setting kernel.sched_rt_period_us to 250000 makes the SCHED_FIFO process take all of the CPU. Accordingly, 25000 means 10% CPU usage for the CPU, 50000 means 20%, etc.

This is validated when CPU6 and CPU7 are isolated, in this case, 500000 can make the CPU to be 100% used by SCHED_FIFO process, 250000 makes 50% CPU usage.

Since these two kernel parameters are global ones, which means if the SCHED_FIFO process is put into the CPU0-5, 1000000/6 = 166000 should be the 100% quota for each CPU, 83000 makes 50% CPU usage, I also validated this.

Here is the snapshot of top,

%Cpu4  : 49.7 us,  0.0 sy,  0.0 ni, 50.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16422956 total, 14630144 free,   964880 used,   827932 buff/cache
KiB Swap:  1557568 total,  1557568 free,        0 used. 15245156 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3748 root      rt   0    4516    764    700 R  49.5  0.0  30:21.03 testsched_top

Upvotes: 1

Related Questions