Rajesh

Kubernetes nodeAffinity and podAntiAffinity not able to deploy pods as desired

I am experimenting with a 2-node cluster for MongoDB on EKS (I will scale it up later once things stabilize). The two nodes run in two different AWS availability zones. The descriptor is as follows:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
  labels:
    name: mongo-repl
spec:
  serviceName: mongodb-service
  replicas: 2
  selector:
    matchLabels:
      app: mongod
      role: mongo
      environment: test
  template:
    metadata:
      labels:
        app: mongod
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a
                - ap-south-1b
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mongod
              - key: role
                operator: In
                values:
                - mongo
              - key: environment
                operator: In
                values:
                - test
            topologyKey: kubernetes.io/hostname
      containers:
        .....

The objective here is to NOT schedule another pod on a node where a pod with the labels app=mongod,role=mongo,environment=test is already running.
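
To see how the pods actually land on the nodes, I check with something along these lines (the label selector just reuses the labels from the pod template above, and -o wide shows which node each pod is placed on):

kubectl get pods -l app=mongod,role=mongo,environment=test -o wide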

When I deploy this spec, only one mongo pod gets created, on one node.

ubuntu@ip-192-170-0-18:~$ kubectl describe statefulset mongod
Name:               mongod
Namespace:          default
CreationTimestamp:  Sun, 16 Feb 2020 16:44:16 +0000
Selector:           app=mongod,environment=test,role=mongo
Labels:             name=mongo-repl
Annotations:        <none>
Replicas:           2 desired | 2 total
Update Strategy:    OnDelete
Pods Status:        1 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=mongod
           environment=test
           role=mongo
  Containers:

kubectl describe pod mongod-1

Node:           <none>
Labels:         app=mongod
                controller-revision-hash=mongod-66f7c87bbb
                environment=test
                role=mongo
                statefulset.kubernetes.io/pod-name=mongod-1
Annotations:    kubernetes.io/psp: eks.privileged
Status:         Pending
....
....
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  42s (x14 over 20m)  default-scheduler  0/2 nodes are available: 1 Insufficient pods, 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules.

0/2 nodes are available: 1 Insufficient pods, 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules.

I am unable to figure out what is conflicting in the affinity specs. I would really appreciate some insight here!
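
As an aside, I believe the "1 Insufficient pods" part of the message refers to the node's pod capacity rather than to affinity -- on EKS the per-node pod limit is tied to the instance type's ENI/IP limits, and small instance types such as t3.small can only hold a small number of pods. The allocatable pod count per node can be checked with something like:

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.pods}{"\n"}{end}'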


Edit on Feb 21: added information on the new error below.

Based on the suggestions, I have now scaled up the worker nodes and started receiving a clearer error message --

Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  51s (x554 over 13h)  default-scheduler  0/2 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had volume node affinity conflict.

So the main issue now (after scaling up the worker nodes) turns out to be --

1 node(s) had volume node affinity conflict
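
To see where the conflict comes from, I compare the zone each pre-created PV is pinned to with the zone of the node each pod lands on, roughly like this (db-volume-0 and db-volume-1 are the static PVs defined below):

kubectl get pods -o wide
kubectl describe pv db-volume-0 db-volume-1 | grep -A 4 "Node Affinity"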

Posting all of my configuration artifacts again below:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
  labels:
    name: mongo-repl
spec:
  serviceName: mongodb-service
  replicas: 2
  selector:
    matchLabels:
      app: mongod
      role: mongo
      environment: test
  template:
    metadata:
      labels:
        app: mongod
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a
                - ap-south-1b
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mongod
              - key: role
                operator: In
                values:
                - mongo
              - key: environment
                operator: In
                values:
                - test
            topologyKey: kubernetes.io/hostname
      containers:
      - name: mongod-container
        .......
      volumes:
      - name: mongo-vol
        persistentVolumeClaim:
          claimName: mongo-pvc

PVC --

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  storageClassName: gp2-multi-az
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

PVs --

apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: db-volume-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-multi-az
  awsElasticBlockStore:
    volumeID: vol-06f12b1d6c5c93903
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
        #- key: topology.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1a

apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: db-volume-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-multi-az
  awsElasticBlockStore:
    volumeID: vol-090ab264d4747f131
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
        #- key: topology.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1b

Storage Class --

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-multi-az
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: gp2
  fsType: ext4
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - ap-south-1a
    - ap-south-1b

I don't want to opt for dynamic PV provisioning.
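
Since the volumes are pre-created, I also verify which of the two static PVs the claim actually bound to, with something along these lines:

kubectl get pvc mongo-pvc
kubectl get pv db-volume-0 db-volume-1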

As per @rabello's suggestion, adding the outputs below --

kubectl get pods --show-labels
NAME       READY   STATUS    RESTARTS   AGE   LABELS
mongod-0   1/1     Running   0          14h   app=mongod,controller-revision-hash=mongod-5b4699fd85,environment=test,role=mongo,statefulset.kubernetes.io/pod-name=mongod-0
mongod-1   0/1     Pending   0          14h   app=mongod,controller-revision-hash=mongod-5b4699fd85,environment=test,role=mongo,statefulset.kubernetes.io/pod-name=mongod-1

kubectl get nodes --show-labels
NAME                                           STATUS   ROLES    AGE   VERSION              LABELS
ip-192-170-0-8.ap-south-1.compute.internal     Ready    <none>   14h   v1.14.7-eks-1861c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/nodegroup-image=ami-07fd6cdebfd02ef6e,eks.amazonaws.com/nodegroup=trl_compact_prod_db_node_group,failure-domain.beta.kubernetes.io/region=ap-south-1,failure-domain.beta.kubernetes.io/zone=ap-south-1a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-192-170-0-8.ap-south-1.compute.internal,kubernetes.io/os=linux
ip-192-170-80-14.ap-south-1.compute.internal   Ready    <none>   14h   v1.14.7-eks-1861c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/nodegroup-image=ami-07fd6cdebfd02ef6e,eks.amazonaws.com/nodegroup=trl_compact_prod_db_node_group,failure-domain.beta.kubernetes.io/region=ap-south-1,failure-domain.beta.kubernetes.io/zone=ap-south-1b,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-192-170-80-14.ap-south-1.compute.internal,kubernetes.io/os=linux

Answers (1)

Jeremy Cowan

EBS volumes are zonal. They can only be accessed by pods that are located in the same AZ as the volume. Your StatefulSet allows pods to be scheduled in multiple zones (ap-south-1a and ap-south-1b), so, given your other constraints, the scheduler may be attempting to schedule a pod on a node in a different AZ than its volume. I would try confining your StatefulSet to a single AZ, or using an operator to install Mongo.
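
For example, a minimal sketch of the relevant fragment, assuming you pin everything to ap-south-1a (the zone of db-volume-0), would narrow the nodeAffinity values to that single zone:

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a   # single zone; the EBS-backed PVs used by the pods must be in this zone too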
