Reputation: 684
I am exploring Vertex AI Pipelines for running machine learning training jobs. The Kubeflow Pipelines docs are clear about how to parameterize the commands/arguments of a container.
Is it also possible to pass an input to an environment variable or to the image name of a component? This Swagger schema for a component suggests that this can be done, but this example fails:
implementation:
  container:
    image: {concat: ["us.gcr.io/vcm-ml/emulator", {inputValue: tag}]}
    # command is a list of strings (command-line arguments).
    # The YAML language has two syntaxes for lists and you can use either of them.
    # Here we use the "flow syntax" - comma-separated strings inside square brackets.
    command: [
      python3,
      # Path of the program inside the container
      /pipelines/component/src/program.py,
      --input1-path,
      {inputPath: input_1},
      --param1,
      --output1-path,
    ]
    env:
      NAME: {inputValue: env}
inputs:
- {name: tag, type: String}
- {name: env, type: String}
- {name: input_1, type: String, description: 'Data for input_1'}
Is passing an {inputValue} to container.env or container.image supported? Alternatively, is it possible to add an environment variable or change the image name using the V2 Python DSL?
Upvotes: 0
Views: 1437
Reputation: 6811
Sorry for the confusion.
Unfortunately, the JSON schema is wrong here (i.e. it differs from the implementation). The same goes for image.
The env implementation uses a static map, and image is static as well.
In Kubeflow Pipelines (v1) there is a chance you might be able to set an environment variable to a dynamic value after you create the component instance, but this probably won't work in Vertex Pipelines.
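from kubernetes.client.models import V1EnvVar  # env var model from the Kubernetes Python client
my_task = my_op(...)
# Set MSG on the task's container from an upstream task's output (may not work on Vertex)
my_task.container.add_env_variable(V1EnvVar(name='MSG', value=task1.outputs["out1"]))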
If this does not work, you can create a GitHub issue in the KFP repo regarding env support.
For image, we usually advise having separate component files for different images.
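If the set of image tags is known when the pipeline is authored, the separate component definitions could be generated rather than written by hand. The following is only a sketch under assumptions: the component name, the `IMAGE_TAG` placeholder, the `component_text_for_tag` helper, and the colon between image name and tag are all hypothetical and not taken from the KFP docs or the question.

```python
# Hypothetical template: the name and the IMAGE_TAG placeholder are made up
# for illustration; the remaining fields mirror the question's component.
COMPONENT_TEMPLATE = """\
name: Emulator program
inputs:
- {name: input_1, type: String, description: 'Data for input_1'}
implementation:
  container:
    image: "us.gcr.io/vcm-ml/emulator:IMAGE_TAG"
    command:
    - python3
    - /pipelines/component/src/program.py
    - --input1-path
    - {inputPath: input_1}
"""


def component_text_for_tag(tag: str) -> str:
    """Return component YAML text with a static image for the given tag."""
    if not tag or any(ch.isspace() for ch in tag):
        raise ValueError("tag must be a non-empty string without whitespace")
    # Plain string replacement avoids clashing with the YAML's own braces.
    return COMPONENT_TEMPLATE.replace("IMAGE_TAG", tag)


# One static component definition per image tag (example tags are assumptions).
component_texts = {tag: component_text_for_tag(tag) for tag in ("v1", "v2")}

# With the KFP SDK installed, each text could then be loaded with
# kfp.components.load_component_from_text(component_texts["v1"]).
```

The image stays static in every generated definition, so this only helps when the tag is fixed at compile time, not when it should come from an upstream task.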
A workaround would be a small wrapper script that sets the variable:
inputs:
- {name: tag, type: String}
- {name: env, type: String}
- {name: input_1, type: String, description: 'Data for input_1'}
implementation:
  container:
    image: "us.gcr.io/vcm-ml/emulator"
    command:
    - sh
    - -ec
    - 'NAME="$0" "$@"' # Set NAME to the first arg and execute the rest
    - {inputValue: env}
    - python3
    # Path of the program inside the container
    - /pipelines/component/src/program.py
    - --input1-path
    - {inputPath: input_1}
    - --param1
    - --output1-path
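To see why the wrapper works, it may help to run the same `sh -ec` line locally. With `sh -c`, the first argument after the script string becomes `$0` and the remaining ones become `"$@"`, so the input value ends up in `NAME` for the wrapped command. The sketch below assumes a POSIX `sh` is on the PATH; `run_with_name` is a hypothetical helper, and a one-line Python child process stands in for `program.py`.

```python
import subprocess
import sys


def run_with_name(value: str) -> str:
    """Run a child process through the wrapper and return the NAME it sees."""
    result = subprocess.run(
        [
            "sh",
            "-ec",
            'NAME="$0" "$@"',  # same wrapper line as in the component
            value,             # becomes $0, i.e. the value assigned to NAME
            # Everything below becomes "$@": the wrapped command to execute.
            sys.executable,
            "-c",
            "import os; print(os.environ['NAME'])",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_with_name("hello"))
```

Because `$0` is quoted in the assignment, values containing spaces are passed through intact.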
Upvotes: 1