Reputation: 53
I logged back into my server nodes after a couple of months away, and now the slurmd daemon won't start on any of them. slurmctld is working fine. I have a cgroup.conf file in the Slurm configuration directory. Here is the config file:
#CgroupAutomount=yes
ConstrainCores=no
ConstrainRAMSpace=no
I get the same error whether the plugin is set to cgroup/v2 or the plugin line is commented out and only CgroupAutomount=yes is set.
Here is the error output:
Couldn't find the specified plugin name for cgroup/v2 looking at all files
slurmd[587248]: slurmd: error: cannot find cgroup plugin for cgroup/v2
slurmd[587248]: slurmd: error: cannot create cgroup context for cgroup/v2
slurmd[587248]: slurmd: error: Unable to initialize cgroup plugin
slurmd[587248]: slurmd: error: slurmd initialization failed
I previously had the cgroup plugin set to v1, but was getting this error:
slurmd[1535]: slurmd: CPU frequency setting not configured for this node
slurmd[1535]: slurmd: error: unable to open '/sys/fs/cgroup/freezer//tasks' for reading : No such file or directory
slurmd[1535]: slurmd: error: cgroup namespace 'freezer' not mounted. aborting
slurmd[1535]: slurmd: error: unable to create freezer cgroup namespace
slurmd: error: Couldn't load specified plugin name for proctrack/cgroup: Plugin init() callback failed
slurmd[1535]: slurmd: error: cannot create proctrack context for proctrack/cgroup
slurmd[1535]: slurmd: error: slurmd initialization failed
So I switched to v2, hence my current error. Any suggestions or help is appreciated.
Update: I changed the config file to:
CgroupPlugin=cgroup/v1
CgroupAutomount=yes
ConstrainCores=no
ConstrainRAMSpace=no
CgroupMountpoint=/sys/fs/cgroup
Now the daemon starts and stays active; however, there are still some errors related to the freezer cgroup.
Upvotes: 2
Views: 1110
Reputation: 370
This is because the cgroup/v2 plugin library (cgroup_v2.so) is not present. Check the Slurm plugin directory, for example /usr/lib64/slurm/cgroup*
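To check this on a node, something like the sketch below may help. It reports which cgroup hierarchy the node has mounted and whether the matching Slurm plugin file exists. The helper names cgroup_version and has_cgroup_plugin are made up for this example, and /usr/lib64/slurm is only an assumed plugin directory; use whatever PluginDir points to on your install.

```shell
#!/bin/sh
# Map the filesystem type of /sys/fs/cgroup to a cgroup version.
# "cgroup2fs" means the unified v2 hierarchy; "tmpfs" is what a v1
# (or hybrid) layout reports. Anything else is reported as unknown.
cgroup_version() {
    case "$1" in
        cgroup2fs) echo v2 ;;
        tmpfs)     echo v1 ;;
        *)         echo unknown ;;
    esac
}

# Succeed if the cgroup plugin for the given version (v1 or v2)
# exists as a regular file in the given plugin directory.
has_cgroup_plugin() {
    [ -f "$1/cgroup_$2.so" ]
}

# Plugin directory: first argument, else an ASSUMED default.
# Check PluginDir in your own configuration before trusting it.
plugin_dir="${1:-/usr/lib64/slurm}"

# stat -f -c %T prints the filesystem type (GNU coreutils).
fs_type="$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
echo "node cgroup hierarchy: $(cgroup_version "$fs_type")"

for v in v1 v2; do
    if has_cgroup_plugin "$plugin_dir" "$v"; then
        echo "cgroup_$v.so: present in $plugin_dir"
    else
        echo "cgroup_$v.so: missing from $plugin_dir"
    fi
done
```

If the node reports v2 but cgroup_v2.so is missing, that would match the "cannot find cgroup plugin for cgroup/v2" error in the question. A node on the unified v2 hierarchy also has no separate freezer mount, which would be consistent with the freezer errors seen under cgroup/v1. As far as I know, the v2 plugin is only built when the dbus development headers are available at build time, so it may be worth checking how your Slurm packages were built.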
Upvotes: 0