Rim

Reputation: 1855

What's the difference between Python and Java when working with the Apache Beam framework?

I have been writing my Apache Beam pipelines in Python and have noticed some limitations, for example when dealing with late data. I want to know whether the Python SDK has other limitations, or any advantages, compared to the Java SDK.

Upvotes: 3

Views: 1729

Answers (1)

Robert Moskal

Reputation: 22553

As of Fall 2019, the Python SDK can be considered to provide a subset of the features of the Java one.

You have fewer I/O transforms available to you, which limits the possible integrations with other systems (data stores, message queues, etc.). The docs list the ones supported in Java vs. Python here: https://beam.apache.org/documentation/io/built-in/

You also have fewer aggregation transforms to work with (for example, Min and Max are missing on the Python side), though this is getting better as people contribute back to the community (see https://issues.apache.org/jira/browse/BEAM-6695).
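If a missing aggregation is the only blocker, the logic can be written by hand. Below is a minimal sketch, in plain Python so it runs without Beam installed, of the accumulator-style logic a Min combiner needs. The method names mirror the four methods of Beam's `CombineFn`; the names `MinFn` and `combine_locally` are hypothetical and not part of Beam, and `combine_locally` only simulates how a runner folds bundles and merges partial results.

```python
from functools import reduce


class MinFn:
    """Hypothetical combiner shaped like Beam's CombineFn (not a Beam class)."""

    def create_accumulator(self):
        # None marks "no element seen yet".
        return None

    def add_input(self, accumulator, element):
        # Fold one element into the running minimum.
        if accumulator is None:
            return element
        return min(accumulator, element)

    def merge_accumulators(self, accumulators):
        # Combine partial minimums computed on separate bundles.
        seen = [a for a in accumulators if a is not None]
        return min(seen) if seen else None

    def extract_output(self, accumulator):
        return accumulator


def combine_locally(fn, bundles):
    """Simulate a runner: fold each bundle on its own, then merge the partials."""
    partials = [
        reduce(fn.add_input, bundle, fn.create_accumulator())
        for bundle in bundles
    ]
    return fn.extract_output(fn.merge_accumulators(partials))
```

In an actual pipeline the same class would presumably subclass `apache_beam.CombineFn` and be applied with `beam.CombineGlobally(MinFn())`; that wiring is an assumption here and is not exercised by the sketch.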

In my personal experience, the lack of SQL database connectivity was the deal breaker that made me write my pipelines in Java (well, Kotlin, actually :)).

Upvotes: 6
