Reputation: 2649
I've been doing a lot of digging on Kubernetes, and I'm liking what I see a lot! One thing I've been unable to get a clear idea about is what the exact distinctions are between the Deployment and StatefulSet resources, and in which scenarios you would use each (or whether one is generally preferred over the other).
Upvotes: 214
Views: 135305
Reputation: 10358
Comparing StatefulSets with Deployments
Feature | StatefulSets | Deployment |
---|---|---|
State | Stateful | Stateless |
Definition | Stateful applications typically involve a database (such as Cassandra, MongoDB, or MySQL) or a message queue (such as Kafka or RabbitMQ), and read from and/or write to it. | “Stateless” means that no past data or state is stored or needs to persist when a new container is created. Frontend components usually have completely different scaling requirements than backends, so we tend to scale them individually. Backends such as databases are also usually much harder to scale than (stateless) frontend web servers. |
Behaviour | When a stateful pod instance dies (or the node it’s running on fails), the pod instance needs to be resurrected on another node, and the new instance gets the same name, network identity, and state as the one it’s replacing. | Pod replicas managed by a Deployment are mostly stateless; they can be replaced with a completely new pod replica at any time. |
Pod Mechanism | Pods created by the StatefulSet aren’t exact replicas of each other. Each can have its own set of volumes (in other words, storage, and thus persistent state), which differentiates it from its peers. | When a Deployment replaces a pod, the new pod is a completely new pod with a new hostname and IP. |
Upvotes: 3
Reputation: 181
What is StatefulSet?
It's a Kubernetes component that is used specifically for stateful applications.
What is a Stateful application?
Any application that stores data to keep track of its state; in other words, an application that tracks state by saving information in some storage.
Examples of stateful applications include all kinds of databases.
Deployment == Stateless
What is a Stateless application?
Applications that do not need to keep a record of previous requests; each interaction is handled as completely new and isolated, based only on the information that comes with it.
Deployment of Stateful and Stateless applications
Stateful applications are deployed using the StatefulSet component of Kubernetes.
Stateless applications are deployed using the Deployment component of Kubernetes.
Just like a Deployment, a StatefulSet makes it possible to replicate application pods, that is, to run multiple replicas of them.
You can also configure storage for both of them in the same way.
Let's discuss an example
Let's say we have one MongoDB pod that handles requests from a NodeJs application pod, which is deployed using a Deployment. Suppose we scale the NodeJs application from 1 pod to 3 so it can handle more client requests and, in parallel, scale the MongoDB pod so it can handle more NodeJs requests.
Scaling the NodeJs application is pretty straightforward: the pods are identical and interchangeable, so scaling up the Deployment is easy. The Deployment creates the pods in random order, and they get random hashes at the end of the pod name: NodeApp-f5cdee, NodeApp-fasx34, NodeApp-ax7jds.
The Deployment gets one Service (svc) that load-balances each request to any of the pods.
When you delete or scale down the Deployment, its pods are deleted in random order, at the same time.
The MongoDB pod replicas deployed using a StatefulSet, by contrast, cannot be created and deleted at the same time or in any order, and cannot be randomly addressed. The reason is that the replica pods of a StatefulSet are not identical: each has its own additional identity.
Note-: Giving each pod its own required identity is what makes the difference between a StatefulSet and a Deployment.
A StatefulSet maintains a sticky identity for each pod, so the pods are created from the same specification but are not interchangeable!
Each pod has a persistent identifier across any rescheduling, which means that when a pod dies it is replaced by a new pod that keeps the same identity.
But why is this identity necessary?
A single MongoDB pod handles both reading and writing the data. If you add a second MongoDB pod, it cannot act in the same way, because allowing multiple MongoDB instances to change the data independently would lead to data inconsistency.
So instead, there is a mechanism that decides that only one pod is allowed to write or change the data, while all the MongoDB instances can read it. The pod that is allowed to change the data is called the master and the others are called slaves.
Note-: Master and slaves don't use the same physical storage even though they use the same data.
In a StatefulSet, every pod has its own identifier and gets a fixed, ordered name, which is not the case for a Deployment.
If we create a StatefulSet with 3 replicas, it will create pods named MongoDB-0, MongoDB-1, MongoDB-2. Here the first one is the master and the rest are slaves.
Note-: The StatefulSet will not create the next pod if the previous pod is not already up and running. Deletion follows the same rule, but in reverse order.
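The naming and ordering rules described above can be sketched in a few lines of Python. This is only an illustration of the default ordered behaviour, not how the controller is implemented; the function names are made up for this example, and the lowercase name mongodb is used because Kubernetes object names are lowercase.

```python
def creation_order(statefulset_name: str, replicas: int) -> list[str]:
    """Pod names in the order they are created: <name>-0, <name>-1, ..."""
    return [f"{statefulset_name}-{ordinal}" for ordinal in range(replicas)]


def deletion_order(statefulset_name: str, replicas: int) -> list[str]:
    """Pods are removed in reverse ordinal order, highest ordinal first."""
    return list(reversed(creation_order(statefulset_name, replicas)))


if __name__ == "__main__":
    # Hypothetical StatefulSet called "mongodb" with 3 replicas.
    print(creation_order("mongodb", 3))
    print(deletion_order("mongodb", 3))
```

The point of the sketch is that the names are predictable from the StatefulSet name and the replica count alone, which is what lets other components address a specific replica.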
So finally we can say that a StatefulSet application has 2 characteristics
When pods restart, the IP address will change but the name and endpoints stay the same.
Upvotes: 3
Reputation: 3579
TL;DR
A Deployment is a resource for deploying a stateless application. If it uses a PVC, all replicas will use the same volume and none of them will have its own state.
A StatefulSet is used for stateful applications. Each replica of the pod will have its own state and will use its own volume.
A DaemonSet is a controller similar to a ReplicaSet that ensures that the pod runs on all the nodes of the cluster. If a node is added to or removed from the cluster, the DaemonSet automatically adds or deletes the pod.
I have written about the detailed differences between Deployments, StatefulSets and DaemonSets, and how to deploy a sample application using these resources, in "K8s: Deployments vs StatefulSets vs DaemonSets".
Upvotes: 140
Reputation: 916
Use 'StatefulSet' with stateful distributed applications that require each node to have a persistent state. A StatefulSet provides the ability to configure an arbitrary number of nodes for a stateful application/component through a configuration (replicas = N).
There are two kinds of stateful distributed applications: Master-Master and Master-Slave. All nodes in a Master-Master configuration and Slave nodes in a Master-Slave configuration can make use of a StatefulSet.
Examples:
Master-Slave -> Datanodes (slaves) in a Hadoop cluster
Master-Master -> Database nodes (master-master) in a Cassandra cluster
Each Pod (replica/node) in a StatefulSet has a Unique and Stable network identity. For example in a Cassandra StatefulSet with name as 'cassandra' and number of replica nodes as N, each Cassandra pod (node) has:
Refer: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
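As a rough illustration of the stable network identity mentioned above, the sketch below builds the per-pod DNS names that the linked StatefulSet documentation describes, in the form pod-name.service-name.namespace.svc.cluster-domain. The function name, the "default" namespace, and the assumption that the governing headless Service is also called cassandra are choices made for this example only; cluster.local is the usual default cluster domain but can differ per cluster.

```python
def pod_dns_names(
    statefulset_name: str,
    service_name: str,
    replicas: int,
    namespace: str = "default",
    cluster_domain: str = "cluster.local",
) -> list[str]:
    """Stable DNS name for each pod of a StatefulSet behind a headless Service."""
    names = []
    for ordinal in range(replicas):
        pod_name = f"{statefulset_name}-{ordinal}"
        names.append(f"{pod_name}.{service_name}.{namespace}.svc.{cluster_domain}")
    return names


if __name__ == "__main__":
    # Assumed setup: StatefulSet "cassandra" governed by a headless Service "cassandra".
    for name in pod_dns_names("cassandra", "cassandra", replicas=3):
        print(name)
```

Because these names depend only on the ordinal and not on the pod's IP, a replacement pod for cassandra-1 is reachable under the same name as the pod it replaced.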
'Deployment' on the other hand is suitable for stateless applications/services where the nodes do not require any special identity. A load balancer can reach any node that it chooses. All nodes are equal. A Deployment is useful for creating any number of arbitrary nodes, through a configuration (replicas = N).
Upvotes: 22
Reputation: 121
The difference between StatefulSet and deployment
A StatefulSet can be thought of as a special kind of Deployment. Each pod in a StatefulSet has a stable, unique network identifier that can be used to discover other members of the cluster. If the name of the StatefulSet is Kafka, then the first pod is called Kafka-0, the second Kafka-1, and so on. The start and stop sequence of the pod replicas is controlled by the StatefulSet: when the nth pod is operated on, the first n-1 pods are already running and ready. The pods in a StatefulSet use stable persistent storage volumes, implemented by PVs or PVCs, to store pod state data. When a pod is deleted, the storage volume associated with the StatefulSet is not deleted by default (for data safety). A StatefulSet is also used in conjunction with a headless Service and declares that it belongs to that headless Service.
Upvotes: 12
Reputation: 13867
Deployments and ReplicationControllers are meant for stateless usage and are rather lightweight. StatefulSets are used when state has to be persisted. Therefore the latter use volumeClaimTemplates / claims on persistent volumes to ensure they can keep the state across component restarts.
So if your application is stateful or if you want to deploy stateful storage on top of Kubernetes use a StatefulSet.
If your application is stateless or if state can be built up from backend-systems during the start then use Deployments.
Further details about running stateful applications can be found in the 2016 Kubernetes blog entry about stateful applications.
Upvotes: 184
Reputation: 6004
Deployment - You specify a PersistentVolumeClaim that is shared by all pod replicas. In other words, a shared volume.
The backing storage obviously must have the ReadWriteMany or ReadOnlyMany accessMode if you have more than one replica pod.
StatefulSet - You specify volumeClaimTemplates so that each replica pod gets a unique PersistentVolumeClaim associated with it. In other words, no shared volume.
Here, the backing storage can have the ReadWriteOnce accessMode.
A StatefulSet is useful for running things in a cluster, e.g. a Hadoop cluster or a MySQL cluster, where each node has its own storage.
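To make the shared-versus-per-replica distinction concrete, here is a small Python sketch of which claim each replica ends up using. It assumes the usual naming of claims created from volumeClaimTemplates (template name, then StatefulSet name, then ordinal); the function names and the example names shared-data, data and mysql are hypothetical.

```python
def deployment_claims(claim_name: str, replicas: int) -> list[str]:
    """Every Deployment replica references the same, pre-existing claim."""
    return [claim_name for _ in range(replicas)]


def statefulset_claims(template_name: str, statefulset_name: str, replicas: int) -> list[str]:
    """Each StatefulSet replica gets its own claim, named per ordinal."""
    return [
        f"{template_name}-{statefulset_name}-{ordinal}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    # Deployment: one shared claim, so the storage must allow access from many pods.
    print(deployment_claims("shared-data", 3))
    # StatefulSet: one claim per replica, so single-writer storage is enough.
    print(statefulset_claims("data", "mysql", 3))
```

With the Deployment, all three entries are the same claim, which is why the access mode matters; with the StatefulSet, each replica has a distinct claim that follows it across restarts.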
Upvotes: 164