Thompson Liu

Reputation: 145

Apache Beam with Spark runner: scheduling different stages on different servers

For a defined Beam pipeline, can we enforce the scheduling so that different pipeline stages are executed on different servers? For example:

    pipeline
      .apply(GenerateSequence.from(0).to(options.getExperimentCount()))
      .apply(ParDo.of(new Stage1()))
      .apply(ParDo.of(new Stage2()))
      .apply(
        "Write Results", 
        TextIO.write().withWindowedWrites().withNumShards(options.getNumShards()).to(options.getOutput())
      );

Is there a way to put stages 1 and 2 on different servers?

Upvotes: 0

Views: 105

Answers (1)

Alexey Romanenko

Reputation: 1443

No, it's not possible. Beam doesn't handle job scheduling; it's up to Spark or YARN to decide where and how to run the pipeline most efficiently.

Upvotes: 2
