Reputation: 83
I have a Flink application that relies on the Table API. I create a table from a Kafka topic. We also maintain an S3 object containing a list of IP addresses and some metadata.
I want to create a table on this S3 object as well. The S3 object path is static and does not change, but I can overwrite the object, and I want the table to be refreshed with the new data.
Basically, I have an in-memory collection read from the S3 object. How can I most efficiently create a table from it and join it with the Kafka table? The table should be refreshed whenever the S3 object is updated.
Upvotes: 0
Views: 960
Reputation: 43707
If you create a Table that is backed by the S3 object, using the FileSystem SQL Connector, it might do what you are looking for. Note, however, that file system sources are not fully developed, and you may run into limitations that affect your use case.
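Here is a minimal sketch of that approach, assuming a hypothetical Kafka topic `events`, a hypothetical S3 path `s3://my-bucket/ip-metadata/`, and made-up column names and formats; adjust the schemas and connector options to your setup. In streaming mode the filesystem table is typically read as a bounded input, which is one of the limitations mentioned above.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class FilesystemJoinSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // Hypothetical Kafka-backed table (topic, schema, and format are placeholders).
        tableEnv.executeSql(
            "CREATE TABLE events (" +
            "  ip STRING," +
            "  payload STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'events'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'" +
            ")");

        // Table backed by the static S3 path via the FileSystem SQL connector.
        tableEnv.executeSql(
            "CREATE TABLE ip_metadata (" +
            "  ip STRING," +
            "  label STRING" +
            ") WITH (" +
            "  'connector' = 'filesystem'," +
            "  'path' = 's3://my-bucket/ip-metadata/'," +
            "  'format' = 'csv'" +
            ")");

        // Regular join between the Kafka stream and the file-backed table.
        Table joined = tableEnv.sqlQuery(
            "SELECT e.ip, e.payload, m.label " +
            "FROM events AS e JOIN ip_metadata AS m ON e.ip = m.ip");

        joined.execute().print();
    }
}
```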
You could instead use StreamExecutionEnvironment#readFile (docs) and convert the DataStream that it produces into a Table. Note that if you read a file with readFile in FileProcessingMode.PROCESS_CONTINUOUSLY mode and then modify the file, the entire file will be re-ingested.
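A rough sketch of the readFile approach, assuming a hypothetical S3 path, a plain-text/CSV object, and a 60-second re-scan interval (all placeholders you would adapt):

```java
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class ReadFileRefreshSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        String s3Path = "s3://my-bucket/ip-metadata.csv"; // hypothetical path

        // Re-scan the path every 60 seconds; when the object is overwritten,
        // the whole file is re-ingested (not just the changed rows).
        DataStream<String> lines = env.readFile(
            new TextInputFormat(new Path(s3Path)),
            s3Path,
            FileProcessingMode.PROCESS_CONTINUOUSLY,
            60_000L);

        // Convert the DataStream into a Table and register it so it can be
        // joined against the Kafka-backed table via the Table API / SQL.
        Table ipMetadata = tableEnv.fromDataStream(lines);
        tableEnv.createTemporaryView("ip_metadata_raw", ipMetadata);

        // From here, join ip_metadata_raw with your Kafka table in a query;
        // printing it is just a way to verify the source refreshes.
        tableEnv.from("ip_metadata_raw").execute().print();
    }
}
```

Because the whole file is re-read on every change, downstream joins will see the refreshed rows again, so you may need to deduplicate or treat the metadata as an upsert stream depending on your join semantics.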
Upvotes: 1