Reputation: 428
I want to tightly couple two pods so that they run on the same node: even if one pod is accidentally deleted, it should be rescheduled onto the node where the other app resides. This should hold for both apps. Suppose I have two pods, P1 and P2. If pod P1 is running on node N1, then pod P2 should also run on node N1. I have implemented this using the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: first-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: first-app
  template:
    metadata:
      labels:
        app: first-app
    spec:
      containers:
      - name: first-app
        imagePullPolicy: IfNotPresent
        image: image1
      nodeSelector:
        nodegroup: etl
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - first-app
            topologyKey: "kubernetes.io/hostname"
and for the second app:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: second-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: second-app
  template:
    metadata:
      labels:
        app: second-app
    spec:
      containers:
      - name: second-app
        imagePullPolicy: IfNotPresent
        image: image1
      nodeSelector:
        nodegroup: etl
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - second-app
            topologyKey: "kubernetes.io/hostname"
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - first-app
            topologyKey: "kubernetes.io/hostname"
With this, second-app will always be assigned to the same node where first-app is running. Now suppose both apps are running and first-app is accidentally removed from the node and assigned to another node; then second-app should also be evicted from that node and assigned to the node where first-app now runs.
Another solution might be: when first-app is removed from its node, it gets scheduled to the same node where second-app is running, and vice versa.
But I am not sure how to implement this.
Upvotes: 0
Views: 2445
Reputation: 16140
From a service perspective, this is a strange requirement, and as such, you may not be doing the right thing. It violates some of the fundamental ideas of service-oriented architecture, where the level of abstraction is the service and the 'stuff' underneath is all automated, so we're not supposed to care about it. You may actually want to put all of the containers from both pods into a single new pod, which would significantly simplify this problem.
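For illustration, a minimal sketch of that single-pod alternative, reusing the image and nodegroup from the question (the combined names are made up):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: combined-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: combined-app
  template:
    metadata:
      labels:
        app: combined-app
    spec:
      nodeSelector:
        nodegroup: etl
      # Both containers run in one pod, so they are co-scheduled and
      # co-rescheduled by construction; no affinity rules are needed.
      containers:
      - name: first-app
        image: image1
        imagePullPolicy: IfNotPresent
      - name: second-app
        image: image1
        imagePullPolicy: IfNotPresent

However...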
There are a number of ways to achieve this, but the simplest may be with labels. I'll leave the implementation to you, but basically you can have an init container in both pods label the node, and have the deployment for both pods select the node with the label (nodeSelector in your yaml). Removing the label when both pods die can be a little more complicated, but still isn't too hard and may not even be necessary.
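A rough sketch of one workable split of that idea (first-app labels its node, second-app selects the label), assuming a ServiceAccount called node-labeler, made up for this example, bound to a ClusterRole that allows patching nodes, and a made-up label key app-group. In first-app's pod template spec:

      serviceAccountName: node-labeler    # assumed: allowed to patch nodes
      initContainers:
      - name: label-node
        image: bitnami/kubectl:latest
        command:
        - sh
        - -c
        # Label the node this pod landed on so the other app can find it
        - kubectl label node "$NODE_NAME" app-group=etl-pair --overwrite
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName    # downward API: the node's name

and in second-app's pod template spec:

      nodeSelector:
        app-group: etl-pair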
Another way is to use a mutating web hook (an admission controller) to find the node the other pod is running on and inject the nodeName property into your pod descriptor. This is also hard, but it means that you don't have to worry about cleaning up labels.
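For orientation, a sketch of the registration side only; the webhook server itself (the Service pod-colocator serving /mutate below is hypothetical) is the hard part, and would have to look up the other app's node and return a JSONPatch setting spec.nodeName:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: colocate-pods
webhooks:
- name: colocate.example.com           # made-up webhook name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                # don't block pod creation if the hook is down
  clientConfig:
    service:
      name: pod-colocator              # hypothetical webhook server
      namespace: default
      path: /mutate
    # caBundle: <CA for the server's TLS certificate goes here>
  rules:
  - operations: ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  objectSelector:
    matchLabels:
      app: second-app                  # only mutate the second app's pods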
There are other techniques you could use, such as building a simple operator that keeps the pods together, but these tend to get complicated quickly. I'd strongly advise you to take another look at your requirement before doing any of this. It's probably more complicated than you think, because pods are often automatically rescheduled to other nodes or scaled across multiple nodes, making the whole thing a headache.
Upvotes: 2
Reputation: 44687
Changing requiredDuringSchedulingIgnoredDuringExecution to requiredDuringSchedulingRequiredDuringExecution would evict the second app when the first app gets removed from a node. But this feature is not yet implemented. So I think the option here is to write a custom scheduler which implements this:
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
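If you go that route, handing a pod to the custom scheduler is a single field in the pod spec; the scheduler name below is hypothetical, and the scheduler itself must be deployed separately as the linked page shows:

apiVersion: v1
kind: Pod
metadata:
  name: second-app
spec:
  # Pods with schedulerName set are skipped by the default scheduler and
  # handed to the named scheduler instead.
  schedulerName: colocation-scheduler   # hypothetical custom scheduler
  containers:
  - name: second-app
    image: image1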
Upvotes: 1