Adelin

Reputation: 18931

How to define resource limits and calculate consumption when deploying applications in Kubernetes?

I am trying to deploy a simple zookeeper ensemble following the tutorial from the official Kubernetes website. The tutorial states that I need

a cluster with at least four nodes and each node requires at least 2 CPUs and 4 GiB of memory.

I ignored this and created a cluster of 3 nodes of type n1-standard-1 (1 vCPU, 3.73 GB memory). When I tried to apply the following .yaml file:

apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
    - port: 2888
      name: server
    - port: 3888
      name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
    - port: 2181
      name: client
  selector:
    app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: kubernetes-zookeeper
          imagePullPolicy: Always
          image: "k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10"
          resources:
            requests:
              memory: "1Gi"
              cpu: "0.5"
          ports:
            - containerPort: 2181
              name: client
            - containerPort: 2888
              name: server
            - containerPort: 3888
              name: leader-election
          command:
            - sh
            - -c
            - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                - "zookeeper-ready 2181"
            initialDelaySeconds: 10
            timeoutSeconds: 5
          livenessProbe:
            exec:
              command:
                - sh
                - -c
                - "zookeeper-ready 2181"
            initialDelaySeconds: 10
            timeoutSeconds: 5
          volumeMounts:
            - name: datadir
              mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
    - metadata:
        name: datadir
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi

And of course, I got the error PodUnschedulable
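The reason shows up in the events of the pending pods (the message below is a typical example; the exact wording depends on the cluster):

kubectl describe pod zk-0
# The Events section ends with a scheduling failure similar to:
#   Warning  FailedScheduling  0/3 nodes are available: 3 Insufficient cpu.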

In this file, I could not find anything saying that I need a cluster of 4 nodes with 2 CPUs and 4 GiB of RAM each. So: where does this requirement come from, and how do I calculate the resources my deployment needs?

Upvotes: 2

Views: 772

Answers (2)

Abdennour TOUMI

Reputation: 93163

By default, a Kubernetes node does not come empty. It is already running processes before your apps' workload is even scheduled (you can inspect this yourself, as shown right after this list):

  • kubelet is running (on each node)
  • kube-proxy is running as a DaemonSet (on each node)
  • the container runtime (e.g. Docker) is running on each node
  • other DaemonSets can be running (like the aws-node DaemonSet in the case of EKS).
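A quick way to see this baseline overhead on your own cluster:

kubectl get pods -n kube-system -o wide   # system pods and the node each one runs on
kubectl describe node <node-name>         # the "Allocated resources" section shows what they already request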

We are discussing worker nodes here, not masters.

With all of that in mind, you will end up choosing a reasonable amount of resources for each node.

Not all nodes have to be the same size. You decide which size you need according to the type of your apps:

  • If your apps eat more memory than CPU (like Java apps), a node with [2 CPUs, 8 GB] is a better fit than one with [4 CPUs, 8 GB].

  • If your apps eat more CPU than memory (like ML workloads), choose the opposite: compute-optimized instances.

  • The golden rule 🏆 is to look at the total capacity of the cluster rather than at the individual capacity of each node.

This means 3 large nodes might be better than 4 medium nodes, in terms of cost but also in terms of making the best use of capacity.
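Applied to the question's ZooKeeper manifest (requests of 0.5 CPU and 1 GiB per pod), a rough capacity check looks like this:

per pod:        0.5 CPU, 1 GiB   (from resources.requests)
3 replicas:     1.5 CPU, 3 GiB   in total
anti-affinity:  one pod per node, so every node individually needs
                0.5 CPU and 1 GiB free after system overhead

An n1-standard-1 exposes less than 1 allocatable CPU once kubelet, kube-proxy and the other system pods are accounted for, which can easily leave less than the 0.5 CPU each pod requests; that is why the pods in the question stayed unschedulable.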


In conclusion, each node should have:

  • no less than 2 CPUs
  • no less than 4 GB of memory

Otherwise, you should expect capacity issues.


Now we have reached half of the answer: identifying the capacity of the cluster.

The second half is about assigning resources to each app (pod).

This falls under another question: how much does your app consume?

To answer this question, you need to monitor your app with monitoring tools like Prometheus + Grafana.
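Even before a full monitoring stack is in place, metrics-server (if installed) gives a quick view of actual consumption:

kubectl top nodes   # current CPU/memory usage per node
kubectl top pods    # current CPU/memory usage per pod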

Once you have insight into the average consumption, it is time to set resource limits for your app (its pods).

Limits might throttle the app; that is why you also need to set up horizontal auto-scaling alongside them (a minimal sketch follows this list):

  • Run the pods inside a Deployment to manage replicas and rollouts.
  • HPA, or Horizontal Pod Autoscaler: monitors the pods of the Deployment, then scales out/in according to thresholds (CPU, memory).
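A minimal HPA sketch (the Deployment name my-app is a placeholder; this assumes metrics-server is installed and the autoscaling/v2beta2 API is available):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # scale out when average CPU usage exceeds 70% of the pods' requests
          averageUtilization: 70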

As a conclusion for this part, we can say:

- Measure: measure first to come up with initial resources.requests and resources.limits.

- Measure: measure again after running the app to identify the resources it really needs.

- Measure: keep measuring.

Upvotes: 3

Arghya Sadhu

Reputation: 44559

The resources section in the deployment YAML defines the resource requirements of a container in the pod.

resources:
  requests:
    memory: "1Gi"
    cpu: "0.5"

requests means a node needs to have at least 1 GiB of memory and 0.5 CPU available for one of the replica pods to be schedulable on it.

There is another concept, limits, which defines the maximum resources a pod is allowed to consume; going over the memory limit gets the container killed, while CPU usage above the limit is throttled.

resources:
  requests:
    memory: "1Gi"
    cpu: "0.5"
  limits:
    memory: "2Gi"
    cpu: "2"

Although you have 3 nodes, keep in mind that master nodes are not schedulable for workloads by default.
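You can verify this by looking at a node's taints (the taint shown in the comment is the common default; the exact key varies by distribution and version):

kubectl describe node <master-node> | grep Taints
# typically: Taints: node-role.kubernetes.io/master:NoSchedule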

You can understand the resource availability of a node with kubectl describe node <node-name>, checking the Capacity and Allocatable sections:

Capacity:
  cpu:                4
  ephemeral-storage:  8065444Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16424256Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  7433113179
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16321856Ki
  pods:               110

Regarding calculating the resource requirements (requests and limits) of a pod, there is no silver bullet. It depends on the application and should be determined and tuned by profiling it. But it is a recommended best practice to define requests and limits when deploying a pod.
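As a purely hypothetical illustration: if profiling showed a steady state of roughly 0.3 CPU and 800 MiB with occasional spikes, a reasonable starting point could be:

resources:
  requests:
    memory: "1Gi"     # observed steady state plus some headroom
    cpu: "300m"       # observed average CPU usage
  limits:
    memory: "1536Mi"  # hard ceiling; exceeding it gets the container OOM-killed
    cpu: "1"          # CPU above the limit is throttled, not killed

Then revisit these numbers as real usage data accumulates.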

Upvotes: 1
