Lucas

Reputation: 589

access persistent volume in kubeflow from different components

I am adding a persistent volume claim to my Kubeflow pipeline components, and I would like to be able to access the volume from the different components so that I can store data in it and retrieve it from other components.

  1. Volume definition:
vop = kfp.dsl.VolumeOp(
        name="create-pvc",
        resource_name="my-pvc",
        modes=kfp.dsl.VOLUME_MODE_RWO,
        size=volume_size
    )
  2. Mount the volume to the components:
comp1.add_pvolumes({"/mnt": vop.volume})
comp2.add_pvolumes({"/mnt": comp1.pvolume})
  3. I create some data in comp1 and want to read it from comp2:
import os
import pickle

# data read from an external source
data = some_data_frame_read_from_gcs

# data pickled to the volume
path = os.path.abspath("data")
with open(path, 'wb') as f:
    pickle.dump(data, f)
print("PATH: {}".format(path))
# output path is: "/mnt/data"

# Now I try to read it from comp2
with open(path, 'rb') as f:
    df = pickle.load(f)

  4. This is the error shown in the Kubeflow pipeline:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/data'
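One thing worth noting about the snippet above: `os.path.abspath("data")` resolves against the current working directory, so it only points into the volume if the working directory happens to be `/mnt`. Building the path explicitly under the mount point avoids that ambiguity. A minimal sketch (the helper names and the `mount_dir` parameter are illustrative, not from the original post):

```python
import os
import pickle


def save_to_volume(data, mount_dir="/mnt", name="data"):
    # Build the path explicitly under the mount instead of relying on
    # os.path.abspath("data"), which resolves against the current
    # working directory and may not land inside the mounted volume.
    path = os.path.join(mount_dir, name)
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return path


def load_from_volume(mount_dir="/mnt", name="data"):
    # The reading component must rebuild the same path under the same
    # mount point; a `path` variable from another component's code does
    # not exist in this container.
    path = os.path.join(mount_dir, name)
    with open(path, "rb") as f:
        return pickle.load(f)
```

Each component runs in its own container, so both sides have to agree on the mount point and file name explicitly.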

Upvotes: 1

Views: 3007

Answers (1)

Ark-kun

Reputation: 6787

Is there a particular reason you want to use volumes? KFP has built-in data-passing mechanisms (see the "Data passing for Python" tutorial).

I'd advise using the built-in data-passing methods if your data is <10GB in size.

Volumes are less well supported and make components non-portable. Volumes are not supported on Vertex AI Pipelines, for example.
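With built-in data passing, each component writes to a file path that KFP supplies and the next component receives that file as an input; no volume is involved. A minimal sketch of the pattern (function names and the pipeline wiring are illustrative assumptions, using the KFP v1 SDK style):

```python
import pickle


def produce_data(output_path: str):
    # KFP passes a writable path via OutputPath; the component just
    # writes its result there.
    data = {"rows": [1, 2, 3]}  # placeholder for the real DataFrame
    with open(output_path, "wb") as f:
        pickle.dump(data, f)


def consume_data(input_path: str) -> int:
    # Downstream, KFP passes the produced file via InputPath.
    with open(input_path, "rb") as f:
        data = pickle.load(f)
    return len(data["rows"])


# In a pipeline these plain functions would be wrapped, roughly:
#   produce_op = kfp.components.create_component_from_func(produce_data)
#   consume_op = kfp.components.create_component_from_func(consume_data)
# and the DSL wires produce_op's output into consume_op's input, so KFP
# handles the storage and transfer between containers.
```

Because the functions only deal with file paths, they stay portable across backends (including Vertex AI Pipelines, where volumes are not available).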

Upvotes: 2
