Reputation: 1855
I used Python to write my Apache Beam pipelines and noticed some limitations, for example in handling late data. Are there other limitations, or advantages, compared to Java?
Upvotes: 3
Views: 1729
Reputation: 22553
As of Fall 2019, the Python SDK can be considered to provide a subset of the Java SDK's features.
You have fewer I/O transforms available to you (that is, fewer built-in integrations with other systems such as data stores and message queues). The docs list which ones are supported in Java vs. Python here: https://beam.apache.org/documentation/io/built-in/
You also have fewer aggregation transforms to work with (for example, Min and Max are missing on the Python side), though this is improving as people contribute back to the project (see https://issues.apache.org/jira/browse/BEAM-6695).
In my personal experience, the lack of SQL database connectivity was the deal breaker that made me write my pipelines in Java (well, Kotlin, actually :)).
Upvotes: 6