G B

Reputation: 755

Getting access to PipelineOptions from a CombineFn in Google Cloud Dataflow

I need to instantiate and use a GcsUtil from within a CombineFn subclass, and it looks like I need to hand a PipelineOptions instance to the GcsUtilFactory. However, I cannot find a way to retrieve a PipelineOptions instance from inside a CombineFn (unlike in DoFns, where it is available).

Is there an API to retrieve the current pipeline's options at runtime? Keeping the options in a field of the CombineFn doesn't seem to work, and it blocks the pipeline upload to the Dataflow service.

Thanks! G

Upvotes: 0

Views: 159

Answers (1)

Ben Chambers

Reputation: 6130

Reading from GCS within the CombineFn is likely to be problematic. For instance, you wouldn't get any of the caching that side inputs give you.

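Depending on what kind of configuration you're trying to do, your best bet is probably to use a ParDo/DoFn before running the Combine. The DoFn can see the PipelineOptions, so anything that depends on them can be applied to each element there, leaving the CombineFn with nothing to look up.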
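To make the shape of that suggestion concrete, here is a minimal sketch in plain Java that deliberately does not use the Dataflow SDK. Every name in it (`PreCombineSketch`, `Options`, `applyOptions`, `combine`, the `multiplier` setting) is hypothetical and only stands in for the real pieces: `applyOptions` plays the role of the ParDo/DoFn that can see the options, and `combine` plays the role of a CombineFn that never needs them.

```java
import java.util.ArrayList;
import java.util.List;

final class PreCombineSketch {

    // Hypothetical stand-in for the pipeline options; "multiplier" is an
    // invented setting representing whatever configuration you need.
    static final class Options {
        final long multiplier;

        Options(long multiplier) {
            this.multiplier = multiplier;
        }
    }

    // Plays the role of the ParDo/DoFn: this is the step that has access to
    // the options, so all option-dependent work happens here, per element.
    static List<Long> applyOptions(List<Long> values, Options options) {
        List<Long> out = new ArrayList<>(values.size());
        for (long value : values) {
            out.add(value * options.multiplier);
        }
        return out;
    }

    // Plays the role of the CombineFn: a pure reduction over elements that
    // were already prepared, with no options and no I/O.
    static long combine(List<Long> prepared) {
        long sum = 0L;
        for (long value : prepared) {
            sum += value;
        }
        return sum;
    }
}
```

In an actual pipeline the first method would correspond to a DoFn applied with ParDo and the second to the function handed to Combine. The point of the split is only that the combining step stays free of configuration lookups.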

Separately, it probably does make sense for PipelineOptions to be made accessible from within the CombineFn. I've made a note of this, and we'll take a look.

Upvotes: 1
