Reputation: 11
I am having difficulty understanding how to pass a result from a container as an output artifact. I understand that we need to write the output to a file, but I need an example of how to do it.
https://www.kubeflow.org/docs/components/pipelines/sdk-v2/component-development/
This is the last part of the Python container program, where I save the URL of the model file on GCS to output.txt.
    with open('./output.txt', 'w') as f:
        logging.info(f"Model path url is in {'./output.txt'}")
        f.write(model_path)
This is the component .yaml file:
    name: Dummy Model Training
    description: Train a dummy model and save to GCS
    inputs:
    - name: input_url
      description: 'Input csv url.'
      type: String
    - name: gcs_url
      description: 'GCS bucket url.'
      type: String
    outputs:
    - name: gcs_model_path
      description: 'Trained model path.'
      type: String
    implementation:
      container:
        image: ${CONTAINER_REGISTRY}
        command: [
          python, ./app/trainer.py,
          --input_url, {inputValue: input_url},
          --gcs_url, {inputValue: gcs_url},
        ]
Upvotes: 1
Views: 1030
Reputation: 66
First of all, your dummy component is missing a reference to the output. You need to use {outputPath: <output_name>} or {outputUri: <output_name>} to pass it into the container, so that your container code can write data to this system-generated path or URI ("gs://..."). To fix your component YAML, it can be:
    name: Dummy Model Training
    description: Train a dummy model and save to GCS
    inputs:
    - name: input_url
      description: 'Input csv url.'
      type: String
    - name: gcs_url
      description: 'GCS bucket url.'
      type: String
    outputs:
    - name: gcs_model_path
      description: 'Trained model path.'
      type: String
    implementation:
      container:
        image: ${CONTAINER_REGISTRY}
        command: [
          python, ./app/trainer.py,
          --input_url, {inputValue: input_url},
          --gcs_url, {inputValue: gcs_url},
          --output_model_path, {outputPath: gcs_model_path}
        ]
Then your code should write to this passed-in path instead of './output.txt'.
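As a rough sketch of what the end of trainer.py could look like with that change: the flag names match the component YAML above, but the helper names (parse_args, write_output, main) and the "model.pkl" file name are made up for illustration, and the training/upload step is only a placeholder. Creating the parent directory first is a precaution, on the assumption that the system-generated path may point into a directory that does not exist yet.

```python
import argparse
import logging
import os


def parse_args(argv=None):
    """Parse the flags listed under `command` in the component YAML."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_url", required=True)
    parser.add_argument("--gcs_url", required=True)
    # Filled in by the pipeline system via {outputPath: gcs_model_path}.
    parser.add_argument("--output_model_path", required=True)
    return parser.parse_args(argv)


def write_output(output_path, value):
    """Write `value` to `output_path`, creating parent directories if needed."""
    parent = os.path.dirname(output_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(output_path, "w") as f:
        f.write(value)


def main(argv=None):
    logging.basicConfig(level=logging.INFO)
    args = parse_args(argv)

    # Placeholder: real code would train the model and upload it to GCS here.
    # "model.pkl" is a hypothetical file name used only for this sketch.
    model_path = args.gcs_url.rstrip("/") + "/model.pkl"

    # Write the output value to the passed-in path, not to './output.txt'.
    write_output(args.output_model_path, model_path)
    logging.info("Model path url written to %s", args.output_model_path)
    return model_path
```

In the actual script you would call main() under the usual `if __name__ == "__main__":` guard so that the container command runs it.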
Regarding how to consume the output in a downstream component, here's a simple yet runnable example that you can try out on Vertex Pipelines: https://github.com/kubeflow/pipelines/blob/bf2389a66c164457b0e10a820ba484992fd7dd1a/sdk/python/test_data/pipelines/two_step_pipeline.py
Upvotes: 0