user3197263

Reputation: 81

Mounting an existing volume from a Kubeflow Pipeline: dsl.VolumeOp creates a new volume instead of a PVC for the existing volume

I am new to Kubeflow and am trying to port an existing solution to run in Kubeflow Pipelines. The existing solution shares data between components via a mounted volume. I know this is not best practice for exchanging data between Kubeflow components, but this is a temporary proof of concept and I have no other choice.

I am having trouble accessing an existing volume from the pipeline. I am basically running the code from the Kubeflow documentation, but pointing it to an existing K8s Volume:

from kfp import dsl

def volume_op_dag():
    vop = dsl.VolumeOp(
        name="shared-cache",
        resource_name="shared-cache",
        size="5Gi",
        modes=dsl.VOLUME_MODE_RWO,
    )

The Volume shared-cache exists:

enter image description here

However, when I run the pipeline, a new volume is created:

enter image description here

What am I doing wrong? I obviously don't want to create a new volume every time I run the pipeline but instead mount an existing one.

Edit: Adding KubeFlow versions:

Upvotes: 3

Views: 1848

Answers (2)

muss

Reputation: 11

You can use an already existing volume as follows:

# Placeholder: the name of the PVC that already exists
volume_name = 'already_existing_volume name'

# Instead of this, which creates a new volume on every run:
task = create_step_prepare_data().add_pvolumes({data_path: vop.volume})

# use this, passing dsl.PipelineVolume(pvc=volume_name):
task = create_step_prepare_data().add_pvolumes({data_path: dsl.PipelineVolume(pvc=volume_name)})
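
To make the difference between the two variants concrete, here is a minimal sketch in plain Python dicts (no kfp required) of the two Kubernetes objects involved, as I understand them: a new PersistentVolumeClaim manifest of the kind a "create a volume" step submits, versus a pod-level volume entry that only points at a claim that already exists. The function names and the volume name `shared-cache-vol` are hypothetical, chosen for illustration; the field names follow the standard Kubernetes PVC and pod volume schemas.

```python
def new_claim_manifest(name, size, access_modes):
    """Build a PVC manifest: submitting this asks the cluster for a NEW claim."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": list(access_modes),
            "resources": {"requests": {"storage": size}},
        },
    }


def existing_claim_volume(volume_name, claim_name):
    """Build a pod volume entry: this only REFERENCES a claim by name."""
    return {
        "name": volume_name,  # hypothetical pod-local name for the volume
        "persistentVolumeClaim": {"claimName": claim_name},
    }


# Values taken from the question; "shared-cache-vol" is an assumed name.
created = new_claim_manifest("shared-cache", "5Gi", ["ReadWriteOnce"])
referenced = existing_claim_volume("shared-cache-vol", "shared-cache")
```

The second dict carries no size or access mode at all, since nothing is being provisioned; it just names the claim, which is roughly what passing pvc=volume_name expresses.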

Upvotes: 1

Alan

Reputation: 85

Have a look at the function kfp.onprem.mount_pvc. You can find values for the arguments pvc_name and volume_name with the console command kubectl -n <your-namespace> get pvc. To use it, write the component as if the volume were already mounted, then follow the example from the docs when binding it in the pipeline:

from kfp.onprem import mount_pvc

train = train_op(...)
train.apply(mount_pvc('claim-name', 'pipeline', '/mnt/pipeline'))

Also note that both the volume and the pipeline must be in the same namespace.
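
For intuition, the effect of the three arguments can be sketched without kfp as a small function over a pod spec dict. This is only an approximation of what I believe the helper does (add a volume that references the claim, then mount it in the container), not the library's actual implementation; `with_pvc_mounted`, the container name, and the image are hypothetical.

```python
import copy


def with_pvc_mounted(pod_spec, pvc_name, volume_name, mount_path):
    """Return a copy of pod_spec with an existing PVC mounted in every container."""
    spec = copy.deepcopy(pod_spec)  # leave the caller's dict untouched
    # Pod-level volume: points at the existing claim by name.
    spec.setdefault("volumes", []).append(
        {"name": volume_name, "persistentVolumeClaim": {"claimName": pvc_name}}
    )
    # Container-level mount: tied to the volume above through its name.
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(
            {"name": volume_name, "mountPath": mount_path}
        )
    return spec


# Hypothetical pod spec for a single training step.
pod_spec = {"containers": [{"name": "train", "image": "example/train:latest"}]}
mounted = with_pvc_mounted(pod_spec, "claim-name", "pipeline", "/mnt/pipeline")
```

Seen this way, pvc_name has to match a claim in the pod's namespace, while volume_name only links the volume entry to its mount inside the pod.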

Upvotes: 1
