Pringles

Reputation: 327

sched_setaffinity() for SCHED_DEADLINE

Is there a way to use deadline scheduling and also set the CPU affinity of a process in Linux? I'm running a 4.16 kernel. Below is my test code:

#define _GNU_SOURCE
#include "include/my_sched.h"
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

int main() {
    struct sched_attr attr;
    int x = 0;
    int ret;
    unsigned int flags = 0;
    long int tid = gettid();

    printf("deadline thread started [%ld]\n", tid);

    /* Set scheduling properties */
    attr.size = sizeof(attr);
    attr.sched_flags = 0;
    attr.sched_nice = 0;
    attr.sched_priority = 0;

    /* This creates a 100ms/300ms reservation */
    attr.sched_policy = SCHED_DEADLINE;
    attr.sched_runtime = 100 * 1000 * 1000;
    attr.sched_period = attr.sched_deadline = 300 * 1000 * 1000;

    ret = sched_setattr(0, &attr, flags);
    if (ret != 0) {
        done = 0;
        perror("sched_setattr");
        printf("exit!\n");
        exit(-1);
    }

    /* Set CPU affinity */
    cpu_set_t  mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    ret = sched_setaffinity(0, sizeof(mask), &mask);

    if (ret != 0) {
        done = 0;
        perror("sched_setaffinity");
        printf("exit!\n");
        exit(-1);
    }


    return 0;
}

When I compile the program above and run it with sudo, I get this error:

sched_setaffinity: Device or resource busy

If I swap the order of sched_setattr() and sched_setaffinity(), I get a different error:

sched_setattr: Operation not permitted

This happens even when I run it with sudo.

Is there a problem with my code? Why can't I use sched_setaffinity() and sched_setattr() with deadline scheduling in the same program?

For anyone who wants to compile and try the program, below is the code for the header it includes (include/my_sched.h):

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <linux/unistd.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <sys/syscall.h>
#include <pthread.h>

#define gettid() syscall(__NR_gettid)

#define SCHED_DEADLINE  6

/* XXX use the proper syscall numbers */
#ifdef __x86_64__
#define __NR_sched_setattr      314
#define __NR_sched_getattr      315
#endif

#ifdef __i386__
#define __NR_sched_setattr      351
#define __NR_sched_getattr      352
#endif

#ifdef __arm__
#define __NR_sched_setattr      380
#define __NR_sched_getattr      381
#endif

static volatile int done;

struct sched_attr {
    __u32 size;

    __u32 sched_policy;
    __u64 sched_flags;

    /* SCHED_NORMAL, SCHED_BATCH */
    __s32 sched_nice;

    /* SCHED_FIFO, SCHED_RR */
    __u32 sched_priority;

    /* SCHED_DEADLINE (nsec) */
    __u64 sched_runtime;
    __u64 sched_deadline;
    __u64 sched_period;
};

int sched_setattr(pid_t pid,
        const struct sched_attr *attr,
        unsigned int flags)
{
    return syscall(__NR_sched_setattr, pid, attr, flags);
}

int sched_getattr(pid_t pid,
        struct sched_attr *attr,
        unsigned int size,
        unsigned int flags)
{
    return syscall(__NR_sched_getattr, pid, attr, size, flags);
}

What I've done:

I dug into the kernel code that performs the corresponding sanity checks. The following snippets come from kernel/sched/core.c:

/*
 * Don't allow tasks with an affinity mask smaller than
 * the entire root_domain to become SCHED_DEADLINE. We
 * will also fail if there's no bandwidth available.
 */
if (!cpumask_subset(span, &p->cpus_allowed) ||
    rq->rd->dl_bw.bw == 0) {
    task_rq_unlock(rq, p, &rf);
    return -EPERM;
}

... 

/*
 * Since bandwidth control happens on root_domain basis,
 * if admission test is enabled, we only admit -deadline
 * tasks allowed to run on all the CPUs in the task's
 * root_domain.
 */
#ifdef CONFIG_SMP
if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
    rcu_read_lock();
    if (!cpumask_subset(task_rq(p)->rd->span, new_mask)) {
        retval = -EBUSY;
        rcu_read_unlock();
        goto out_free_new_mask;
    }
    rcu_read_unlock();
}
#endif

These two sections correspond to the two errors I mentioned earlier. Can anybody explain what the comments mean? I know what scheduling domains and root domains are, but I don't see how a SCHED_DEADLINE task differs from tasks under other scheduling policies with respect to scheduling domains. Why doesn't it make sense to bind a SCHED_DEADLINE task to a specific core?

Upvotes: 2

Views: 2154

Answers (1)

Pringles

Reputation: 327

After some digging, I finally understand why setting affinity causes a problem for the SCHED_DEADLINE scheduler.

For anyone who just wants to bind EDF tasks to a subset of cores, see section 5 of https://elixir.bootlin.com/linux/v4.17-rc3/source/Documentation/scheduler/sched-deadline.txt#L634. You can use cpusets, through the cgroup utilities, to assign cores to EDF tasks.

Disallowing sched_setaffinity() for SCHED_DEADLINE tasks is a conservative way to keep the kernel's schedulability test efficient. Consider the following scenario: the machine has 9 cores, task 1 wants affinity to cores 1, 2, 4, 5, and task 2 wants affinity to cores 3, 4, 5, 6, 7, 8.

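[Figure: affinity masks of task 1 (cores 1, 2, 4, 5) and task 2 (cores 3 to 8) on a 9-core machine, overlapping on cores 4 and 5]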

It turns out that if the affinity masks of different SCHED_DEADLINE tasks partially overlap, the schedulability test becomes an NP-hard problem, so there is no efficient way to decide whether the task set is schedulable. State-of-the-art schedulability tests for arbitrary core affinities have exponential computational complexity, and adding such a test to the kernel would drastically degrade its performance.

However, I'd argue that this is conservative from the kernel's point of view: an administrator who knows what they are doing could intentionally map all SCHED_DEADLINE tasks to a specific subset of cores. In the following example, cores 1, 2, 4, 5 are dedicated to EDF tasks, and the effect is no different from the cpuset approach. In general, sched_setaffinity() is able to emulate global, clustered, and partitioned job-level fixed-priority scheduling.

[Figure: all SCHED_DEADLINE tasks mapped to cores 1, 2, 4, 5]

Reference:

Gujarati, A., Cerqueira, F., & Brandenburg, B.B. (2014). Multiprocessor real-time scheduling with arbitrary processor affinities: from practice to theory. Real-Time Systems, 51, 440-483.

Upvotes: 4
