Zee

Reputation: 172

Multiple Kubernetes pods sharing the same hostPath/PVC duplicate output

I have a small problem and need to know the best way to approach/solve it.

I have deployed a few pods on Kubernetes and so far I have enjoyed learning about and working with it. I set up the PersistentVolume, PersistentVolumeClaim, etc., and can see my data on the host, as I need those files for further processing.

Now the issue is that two pods (2 replicas) sharing the same volume claim are writing to the same location on the host. That is expected, but unfortunately it causes the data to be duplicated in the output file.

What I need is a way to avoid this duplication, e.g. by having each replica write to its own file.

Please note that I have a single-node deployment, which is why I'm using hostPath at the moment.

Creating the PV:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: ls-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/ls-data/my-data2"

Creating the PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ls-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi

How I use my PV inside my deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: logstash
  namespace: default
  labels:
    component: logstash
spec:
  replicas: 2
  selector:
    matchLabels:
      component: logstash
#omitted 
        ports:
        - containerPort: 5044
          name: logstash-input
          protocol: TCP
        - containerPort: 9600
          name: transport
          protocol: TCP
        volumeMounts:
        - name: ls-pv-store
          mountPath: "/logstash-data"
      volumes:
      - name: ls-pv-store
        persistentVolumeClaim:
          claimName: ls-pv-claim

Upvotes: 4

Views: 9973

Answers (1)

Janos Lenart

Reputation: 27100

Depending on what exactly you are trying to achieve, you could use StatefulSets instead of Deployments. Each Pod spawned from the StatefulSet's Pod template can have its own separate PersistentVolumeClaim, created from the volumeClaimTemplates (see the link for an example). You will need a StorageClass set up for this.
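A minimal sketch of that approach, reusing the labels from the question, might look like the following. The image, the headless Service name, and the assumption that the "manual" StorageClass (or matching pre-created PVs) exists are illustrative, not from the original manifests:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: logstash
spec:
  serviceName: logstash            # a headless Service with this name is assumed to exist
  replicas: 2
  selector:
    matchLabels:
      component: logstash
  template:
    metadata:
      labels:
        component: logstash
    spec:
      containers:
      - name: logstash
        image: logstash            # placeholder image
        volumeMounts:
        - name: ls-data
          mountPath: /logstash-data
  volumeClaimTemplates:            # one PVC per Pod: ls-data-logstash-0, ls-data-logstash-1, ...
  - metadata:
      name: ls-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: manual
      resources:
        requests:
          storage: 100Gi
```

With this, each replica gets its own PVC and therefore its own directory on disk, so the two Pods never write to the same file.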

If you are looking for something simpler, you could write to /mnt/volume/$HOSTNAME from each Pod. This also ensures that they use separate files, as the hostnames of the Pods are unique.
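For example, the container section of the existing Deployment could create a per-Pod subdirectory on the shared volume at startup. The shell wrapper and the `--path.data` destination here are an illustration of the idea, not part of the original manifest:

```yaml
      containers:
      - name: logstash
        image: logstash            # placeholder image
        command: ["sh", "-c"]
        args:
        - |
          # $HOSTNAME is the Pod name, so each replica gets its own directory
          mkdir -p /logstash-data/$HOSTNAME
          exec logstash --path.data /logstash-data/$HOSTNAME
        volumeMounts:
        - name: ls-pv-store
          mountPath: /logstash-data
```

Both replicas still share the same PVC, but each one only ever writes under its own subdirectory.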

Upvotes: 6
