Reputation: 457
I am writing a batch job with Apache Flink using the DataSet
API. I can read a text file using readTextFile(),
but this function reads just one file at a time.
I would like to consume all the text files in my directory one by one and process them, one by one, in the same function as a single batch job with the DataSet
API, if that is possible.
Another option is to implement a loop that runs multiple jobs, one per file, instead of a single job over multiple files. But I don't think that solution is the best.
Any suggestion?
Upvotes: 1
Views: 854
Reputation: 2921
If I read the documentation right, you can read an entire directory by passing its path to ExecutionEnvironment.readTextFile()
. You can find an example here: Word-Count-Batch-Example
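A minimal sketch of that approach, assuming a hypothetical input directory /path/to/my/directory (when readTextFile() is given a directory path, Flink enumerates the files inside it and reads them as one DataSet):

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class ReadDirectoryJob {
    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Passing a directory path makes Flink read every file in that directory
        // as a single DataSet of lines, so one job processes all files.
        DataSet<String> lines = env.readTextFile("/path/to/my/directory");

        // Example transformation applied uniformly across all files' lines.
        lines.map(String::toUpperCase)
             .print();
    }
}
```

By default only the files directly in the directory are read; to descend into nested subdirectories you can enable the "recursive.file.enumeration" configuration option on the input format.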
Upvotes: 1