Reputation: 359
I have a use case where I want to implement an AWS Kinesis Data Application with Flink in Java. It will listen to multiple Kinesis streams via the Data Streams API. However, the analysis of those streams will be done in Python (since our data scientists prefer Python).
From this answer, there appears to be support for calling Python UDFs from Java. However, I want to be able to convert an incoming stream to a table, via
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
Table sessionsTable = tableEnv.fromDataStream(inputStream);
...and then have a Python processor that is invoked to process that stream.
I really have 3 questions here:
Upvotes: 1
Views: 519
Reputation: 43707
The starting point in the Flink documentation for learning about using Python with Tables and Datastreams is at https://ci.apache.org/projects/flink/flink-docs-stable/docs/dev/python/overview/.
The Python APIs only provide a subset of what's available from Java; you'll have to look and see if what you need is included.
Not sure about performance, but you can, for example, convert back and forth between Flink Tables and Pandas dataframes.
Upvotes: 0