Reputation: 2987
Is there any way to configure nodeSelector at the namespace level?
I want to run a workload only on certain nodes for this namespace.
Upvotes: 35
Views: 53394
Reputation: 169
To dedicate nodes to only host resources belonging to a namespace, you also have to prevent the scheduling of other resources on those nodes.
It can be achieved by combining a taint on the nodes with a node selector and a toleration that admission controllers inject when you create resources in the namespace (PodNodeSelector for the node selector, PodTolerationRestriction for the default tolerations). In this way, you don't have to manually add a node selector and tolerations to each resource; it is sufficient to create them in the namespace.
Steps:
Add a taint to the nodes you want to dedicate to the namespace:
kubectl taint nodes project.example.com/GPUsNodePool=true:NoSchedule -l=nodesWithGPU=true
This example adds the taint to the nodes that already have the label nodesWithGPU=true. You can also taint nodes individually by name:
kubectl taint node my-node-name project.example.com/GPUsNodePool=true:NoSchedule
Add a label:
kubectl label nodes project.example.com/GPUsNodePool=true -l=nodesWithGPU=true
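To confirm that the taint and the label landed on the intended nodes, something like the following may help. It is only a sketch: show_dedicated_nodes is a hypothetical helper name, not a kubectl feature, and the selector is the label used in the commands above.

```shell
# Hypothetical helper (not part of kubectl): list the nodes matching a label
# selector together with the keys of the taints set on them.
show_dedicated_nodes() {
  selector="$1"   # e.g. project.example.com/GPUsNodePool=true
  kubectl get nodes -l "$selector" \
    -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Usage (needs access to a cluster):
# show_dedicated_nodes project.example.com/GPUsNodePool=true
```

Each dedicated node should show the taint key in the TAINTS column; a node listed without it would still accept pods from other namespaces.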
The same is done if, for example, you use Terraform and AKS. The node pool configuration:
resource "azurerm_kubernetes_cluster_node_pool" "GPUs_node_pool" {
  name                  = "gpusnp"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.clustern_name.id
  vm_size               = "Standard_NC12" # https://azureprice.net/vm/Standard_NC12
  node_taints = [
    "project.example.com/GPUsNodePool=true:NoSchedule"
  ]
  node_labels = {
    "project.example.com/GPUsNodePool" = "true"
  }
  node_count = 2
}
Then create the namespace with instructions for the admission controllers:
apiVersion: v1
kind: Namespace
metadata:
  name: gpu-namespace
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: "project.example.com/GPUsNodePool=true" # poorly documented: format has to be of "selector-label=label-val"
    scheduler.alpha.kubernetes.io/defaultTolerations: '[{"operator": "Equal", "value": "true", "effect": "NoSchedule", "key": "project.example.com/GPUsNodePool"}]'
    project.example.com/description: 'This namespace is dedicated only to resources that need a GPU.'
Done! Create resources in the namespace and the admission controller together with the scheduler will do the rest.
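If you need more than one dedicated namespace, the Namespace manifest can be generated instead of copied by hand. A minimal sketch, assuming the taint and the label share the same key and value as in this answer; render_dedicated_namespace is a hypothetical helper name:

```shell
# Hypothetical helper: print a Namespace manifest that pins the namespace to
# nodes labeled KEY=VALUE and tolerates the matching KEY=VALUE:NoSchedule taint.
render_dedicated_namespace() {
  ns="$1"; key="$2"; value="$3"
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${ns}
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: "${key}=${value}"
    scheduler.alpha.kubernetes.io/defaultTolerations: '[{"operator": "Equal", "value": "${value}", "effect": "NoSchedule", "key": "${key}"}]'
EOF
}

# Usage (needs access to a cluster):
# render_dedicated_namespace gpu-namespace project.example.com/GPUsNodePool true | kubectl apply -f -
```

Printing the manifest rather than applying it directly lets you review or diff it before it reaches the cluster.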
Create a sample pod with no nodeSelector or toleration, but inside the namespace:
kubectl run test-dedicated-ns --image=nginx --namespace=gpu-namespace
# list the pods in the namespace
kubectl get po -n gpu-namespace
# get node name
kubectl get po test-dedicated-ns -n gpu-namespace -o jsonpath='{.spec.nodeName}'
# check running pods on a node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node>
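The manual checks above can be folded into one pass/fail test. This is a sketch under assumptions: pod_on_labeled_node is a hypothetical helper name, and it only checks that the pod's node carries the expected label, not that the taint is in place.

```shell
# Hypothetical helper: succeed only if the pod was scheduled on a node that
# matches the given label selector.
pod_on_labeled_node() {
  pod="$1"; ns="$2"; selector="$3"
  # Name of the node the pod was scheduled on (empty while the pod is Pending).
  node=$(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}') || return 1
  [ -n "$node" ] || return 1
  # "kubectl get nodes -o name" prints one "node/<name>" line per node.
  kubectl get nodes -l "$selector" -o name | grep -qxF "node/${node}"
}

# Usage (needs access to a cluster):
# pod_on_labeled_node test-dedicated-ns gpu-namespace project.example.com/GPUsNodePool=true && echo OK
```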
Upvotes: 5
Reputation: 2987
To achieve this you can use the PodNodeSelector admission controller.
First, you need to enable it in your kube-apiserver: edit /etc/kubernetes/manifests/kube-apiserver.yaml, find the --enable-admission-plugins= parameter, and add PodNodeSelector to it.
Now you can specify the scheduler.alpha.kubernetes.io/node-selector option in the annotations for your namespace, for example:
apiVersion: v1
kind: Namespace
metadata:
  name: your-namespace
  annotations:
    scheduler.alpha.kubernetes.io/node-selector: env=test
spec: {}
status: {}
After these steps, all the pods created in this namespace will have this section automatically added:
nodeSelector:
  env: test
More information about PodNodeSelector can be found in the official Kubernetes documentation:
https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#podnodeselector
If you deployed your cluster using kubeadm and you want to make this configuration persistent, you have to update your kubeadm config file:
kubectl edit cm -n kube-system kubeadm-config
Specify extraArgs with custom values under the apiServer section:
apiServer:
  extraArgs:
    enable-admission-plugins: NodeRestriction,PodNodeSelector
Then update your kube-apiserver static manifest on all control-plane nodes:
# Kubernetes 1.22 and later (extract only the ClusterConfiguration document):
kubectl get configmap -n kube-system kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > kubeadm-config.yaml
# Before Kubernetes 1.22:
# "kubeadm config view" was deprecated in 1.19 and removed in 1.22
# Reference: https://github.com/kubernetes/kubeadm/issues/2203
kubeadm config view > kubeadm-config.yaml
# Update the manifest with the file generated by any of the above lines
kubeadm init phase control-plane apiserver --config kubeadm-config.yaml
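After regenerating the manifest, it may be worth checking on each control-plane node that the plugin really ended up in the flag. A rough sketch, assuming the flag sits on a single line in the form --enable-admission-plugins=A,B without quotes; has_admission_plugin is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if PLUGIN is listed in the
# --enable-admission-plugins flag of the given kube-apiserver manifest.
has_admission_plugin() {
  manifest="$1"; plugin="$2"
  # Cut the comma-separated list out of the flag, put one plugin per line,
  # then look for an exact match.
  grep -- '--enable-admission-plugins=' "$manifest" \
    | sed 's/.*--enable-admission-plugins=//' \
    | tr ',' '\n' \
    | grep -qxF "$plugin"
}

# Usage (on a control-plane node):
# has_admission_plugin /etc/kubernetes/manifests/kube-apiserver.yaml PodNodeSelector && echo enabled
```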
If you deployed your cluster with Kubespray, you can just use the kube_apiserver_enable_admission_plugins variable in your API server configuration variables:
kube_apiserver_enable_admission_plugins:
  - PodNodeSelector
Upvotes: 55
Reputation: 6221
I totally agree with @kvaps's answer, but one thing is missing: you also need to add the label to your node:
kubectl label node <yournode> env=test
That way, pods created in the namespace annotated with scheduler.alpha.kubernetes.io/node-selector: env=test will be schedulable only on nodes with the env=test label.
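Since a selector that matches no node leaves every pod in the namespace Pending, a quick guard before creating workloads can save some debugging. A minimal sketch; require_labeled_node is a hypothetical helper name:

```shell
# Hypothetical helper: fail if no node carries the label the namespace selects on.
require_labeled_node() {
  selector="$1"   # e.g. env=test
  # Count the "node/<name>" lines; tr strips padding some wc versions add.
  count=$(kubectl get nodes -l "$selector" -o name | wc -l | tr -d ' ')
  if [ "$count" -eq 0 ]; then
    echo "no node matches '$selector'; pods in the namespace would stay Pending" >&2
    return 1
  fi
  echo "$count node(s) match '$selector'"
}

# Usage (needs access to a cluster):
# require_labeled_node env=test
```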
Upvotes: 5