Reputation: 11
Is there any way to set the degree of parallelism for a ParDo transform in Apache Beam using the Python SDK?
Code:
xmls = contracts | 'Get XML' >> beam.ParDo(get_xml())
Upvotes: 1
Views: 471
Reputation: 839
The Beam model shards the data, but it does not rely on a predetermined number of shards, so it exposes no interface for specifying one on a ParDo. This is what allows a Beam runner such as Cloud Dataflow to do liquid sharding and autoscaling.
Upvotes: 1