Simon Corcos

Reputation: 1022

Specify machine type for a single TFX pipeline component in Vertex AI

I'm using TFX to build an AI pipeline on Vertex AI. I followed this tutorial to get started, then adapted the pipeline to my own data, which has over 100M rows of time series. A couple of my components get killed midway because of memory issues, so I'd like to set the memory requirements for those components only. I use KubeflowV2DagRunner to orchestrate and launch the pipeline on Vertex AI with the following code:

runner = tfx.orchestration.experimental.KubeflowV2DagRunner(
    config=tfx.orchestration.experimental.KubeflowV2DagRunnerConfig(
        default_image='gcr.io/watch-hop/hop-tfx-covid:0.6.2'
    ),
    output_filename=PIPELINE_DEFINITION_FILE)

_ = runner.run(
    create_pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=PIPELINE_ROOT,
        data_path=DATA_ROOT, metadata_path=METADATA_PATH))

A similar question has been answered on Stack Overflow, which pointed me to a way to set memory requirements on AI Platform, but those configs no longer exist in KubeflowV2DagRunnerConfig, so I'm at a dead end.

Any help would be much appreciated.

**EDIT**

We define our components as Python functions with the @component decorator, so most of them are custom components. For training components, I know you can specify the machine type using the tfx.Trainer class, as explained in this tutorial, but my question is about custom components that don't do any training.
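For reference, here is a hedged sketch, based on the linked tutorial, of how the Trainer's machine type is set on Vertex; GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_REGION, module_file, and example_gen are placeholders, and the custom_config key names vary across TFX versions:

from tfx import v1 as tfx

# Vertex AI CustomJob spec; 'machine_spec' controls the machine type
# used for the training job (placeholder values).
vertex_job_spec = {
    'project': GOOGLE_CLOUD_PROJECT,
    'worker_pool_specs': [{
        'machine_spec': {'machine_type': 'n1-standard-4'},
        'replica_count': 1,
        'container_spec': {'image_uri': 'gcr.io/tfx-oss-public/tfx:1.0.0'},
    }],
}

trainer = tfx.extensions.google_cloud_ai_platform.Trainer(
    module_file=module_file,
    examples=example_gen.outputs['examples'],
    train_args=tfx.proto.TrainArgs(num_steps=100),
    eval_args=tfx.proto.EvalArgs(num_steps=5),
    custom_config={
        tfx.extensions.google_cloud_ai_platform.ENABLE_UCAIP_KEY: True,
        tfx.extensions.google_cloud_ai_platform.UCAIP_REGION_KEY: GOOGLE_CLOUD_REGION,
        tfx.extensions.google_cloud_ai_platform.TRAINING_ARGS_KEY: vertex_job_spec,
    })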

Upvotes: 2

Views: 1253

Answers (2)

crawler_in

Reputation: 41

An alternative to this solution is to use the Dataflow Beam runner, which allows Beam-based components to run on a Dataflow cluster via Vertex. I have yet to find a way to specify machine types for custom components.

Sample Beam pipeline args:

BIG_QUERY_WITH_DATAFLOW_RUNNER_BEAM_PIPELINE_ARGS = [
    '--project=' + GOOGLE_CLOUD_PROJECT,
    '--temp_location=' + GCS_LOCATION,
    '--runner=DataflowRunner',
]
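These args can then be attached to a Beam-powered component; a minimal sketch, assuming a BigQueryExampleGen component and a QUERY placeholder (pipeline-wide Beam args can instead be passed to tfx.dsl.Pipeline via beam_pipeline_args):

from tfx import v1 as tfx

# Route this component's Beam execution to Dataflow;
# with_beam_pipeline_args is available on TFX Beam-based components.
example_gen = tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
    query=QUERY,
).with_beam_pipeline_args(BIG_QUERY_WITH_DATAFLOW_RUNNER_BEAM_PIPELINE_ARGS)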

By now you would likely be migrating to Vertex AI anyway.

Upvotes: 0

Simon Corcos

Reputation: 1022

It turns out you can't at the moment, but according to this issue, the feature is coming.

An alternative solution is to convert your TFX pipeline to a Kubeflow pipeline. Vertex AI Pipelines supports Kubeflow pipelines, and with those you can set memory and CPU constraints at the component level:

from kfp.dsl import Dataset, Input, component, pipeline

@component
def MyComponent(input_data: Input[Dataset]):
    ...

@pipeline
def MyPipeline():
    task = MyComponent(input_data=...)
    task.set_memory_limit('64G')  # alternatively, set_memory_request(...)
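For completeness, a minimal sketch of compiling this pipeline and submitting it to Vertex AI Pipelines; the display name and package path are placeholders:

from kfp import compiler
from google.cloud import aiplatform

# Compile the Kubeflow pipeline to a Vertex-compatible job spec.
compiler.Compiler().compile(
    pipeline_func=MyPipeline,
    package_path='pipeline.json')

# Submit the compiled pipeline to Vertex AI Pipelines.
aiplatform.PipelineJob(
    display_name='my-pipeline',
    template_path='pipeline.json',
).run()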

Upvotes: 3
