kvaps

Reputation: 2987

How to assign a namespace to certain nodes?

Is there any way to configure nodeSelector at the namespace level?

I want to run a workload only on certain nodes for this namespace.

Upvotes: 35

Views: 53394

Answers (3)

nyxgear

Reputation: 169

To dedicate nodes exclusively to a namespace, you also have to prevent other resources from being scheduled on those nodes.

This can be achieved by a combination of a namespace node selector and a taint, both injected via the admission controller when you create resources in the namespace. This way you don't have to manually label and add tolerations to each resource; it is sufficient to create them in the namespace.

Properties objectives:

  • the node selector forces the namespace's resources to be scheduled only on the selected nodes
  • the taint denies scheduling on the selected nodes for any resource not in the namespace
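For illustration, the combined effect is roughly equivalent to every pod in the namespace being created with the following fields already set (a sketch: the pod name and container are hypothetical, the label and taint keys match the ones used below):

# Effective scheduling constraints injected by the admission controller
# for pods created in the dedicated namespace (not written by the user):
apiVersion: v1
kind: Pod
metadata:
  name: example-pod          # hypothetical name
  namespace: gpu-namespace
spec:
  nodeSelector:                                # forces scheduling onto the dedicated nodes
    project.example.com/GPUsNodePool: "true"
  tolerations:                                 # allows scheduling despite the NoSchedule taint
  - key: "project.example.com/GPUsNodePool"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: app
    image: nginx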

Configuration of nodes/node pool

Add a taint to the nodes you want to dedicate to the namespace:

kubectl taint nodes project.example.com/GPUsNodePool=true:NoSchedule -l=nodesWithGPU=true

This example adds the taint to the nodes that already carry the label nodesWithGPU=true. You can also taint nodes individually by name: kubectl taint node my-node-name project.example.com/GPUsNodePool=true:NoSchedule

Add a label:

kubectl label nodes project.example.com/GPUsNodePool=true -l=nodesWithGPU=true

The same can be done with, for example, Terraform and AKS. The node pool configuration:

resource "azurerm_kubernetes_cluster_node_pool" "GPUs_node_pool" {
   name                  = "gpusnp"
   kubernetes_cluster_id = azurerm_kubernetes_cluster.cluster_name.id
   vm_size               = "Standard_NC12" # https://azureprice.net/vm/Standard_NC12
   node_taints = [
       "project.example.com/GPUsNodePool=true:NoSchedule"
   ]
   node_labels = {
       "project.example.com/GPUsNodePool" = "true"
   }
   node_count = 2
}

Namespace creation

Then create the namespace with instructions for the admission controller:

apiVersion: v1
kind: Namespace
metadata:
  name: gpu-namespace
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: "project.example.com/GPUsNodePool=true"  # note: the format must be "label-key=label-value"
    scheduler.alpha.kubernetes.io/defaultTolerations: '[{"operator": "Equal", "value": "true", "effect": "NoSchedule", "key": "project.example.com/GPUsNodePool"}]'
    project.example.com/description: 'This namespace is dedicated only to resources that need a GPU.' 

Done! Create resources in the namespace, and the admission controller, together with the scheduler, will do the rest.


Testing

Create a sample pod with no label or toleration, but in the namespace:

kubectl run test-dedicated-ns --image=nginx --namespace=gpu-namespace

# list pods in the namespace
kubectl get po -n gpu-namespace

# get node name 
kubectl get po test-dedicated-ns -n gpu-namespace -o jsonpath='{.spec.nodeName}'

# check running pods on a node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node>

Upvotes: 5

kvaps

Reputation: 2987

To achieve this you can use the PodNodeSelector admission controller.

First, you need to enable it in your kube-apiserver:

  • Edit /etc/kubernetes/manifests/kube-apiserver.yaml:
    • find --enable-admission-plugins=
    • add PodNodeSelector parameter
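
After the edit, the flag in the static pod manifest should look something like this (the exact plugin list depends on your cluster; NodeRestriction here is just an example of an already-enabled plugin):

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-apiserver
    - --enable-admission-plugins=NodeRestriction,PodNodeSelector
    # ...other flags unchanged...

The kubelet watches the static manifest directory, so saving the file restarts the kube-apiserver automatically.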

Now you can specify the scheduler.alpha.kubernetes.io/node-selector annotation on your namespace, for example:

apiVersion: v1
kind: Namespace
metadata:
  name: your-namespace
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: env=test
spec: {}
status: {}

After these steps, all pods created in this namespace will have this section automatically added to their spec:

nodeSelector:
  env: test

More information about PodNodeSelector can be found in the official Kubernetes documentation: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#podnodeselector


kubeadm users

If you deployed your cluster using kubeadm and you want to make this configuration persistent, you have to update your kubeadm config file:

kubectl edit cm -n kube-system kubeadm-config

specify extraArgs with custom values under the apiServer section:

apiServer: 
  extraArgs: 
    enable-admission-plugins: NodeRestriction,PodNodeSelector

then update your kube-apiserver static manifest on all control-plane nodes:

# Kubernetes 1.22 and later:
kubectl get configmap -n kube-system kubeadm-config -o=jsonpath="{.data.ClusterConfiguration}" > kubeadm-config.yaml

# Before Kubernetes 1.22:
# "kubeadm config view" was deprecated in 1.19 and removed in 1.22
# Reference: https://github.com/kubernetes/kubeadm/issues/2203
kubeadm config view > kubeadm-config.yaml

# Regenerate the manifest with the file produced by either of the commands above
kubeadm init phase control-plane apiserver --config kubeadm-config.yaml

kubespray users

You can just set the kube_apiserver_enable_admission_plugins variable among your api-server configuration variables:

 kube_apiserver_enable_admission_plugins:
   - PodNodeSelector

Upvotes: 55

Nicolas Pepinster

Reputation: 6221

I totally agree with the @kvaps answer, but something is missing: you need to add the label to your node:

kubectl label node <yournode> env=test

That way, pods created in the namespace with scheduler.alpha.kubernetes.io/node-selector: env=test will be schedulable only on nodes carrying the env=test label.
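
To verify, you can check which nodes carry the label and where a test pod lands (the pod name selector-test and namespace your-namespace are placeholders matching the earlier example):

# confirm which nodes carry the label
kubectl get nodes -l env=test

# create a pod in the annotated namespace and inspect the injected selector
kubectl run selector-test --image=nginx -n your-namespace
kubectl get pod selector-test -n your-namespace -o jsonpath='{.spec.nodeSelector}'

If no node has the label, the pod stays Pending with a FailedScheduling event.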

Upvotes: 5
