Reputation: 237
I want to create an Airflow job to export an HDFS file stored in S3 to a local machine. Which Airflow operator could be used for this?
Upvotes: 1
Views: 977
Reputation: 5243
There is no particular Airflow operator that fully satisfies your needs, but as I see it there are two ways to potentially address this.
The first is the boto3 library. However, you will probably not find a method in the available list that does exactly what you are interested in.
The second: I recently discovered S3_to_hive_operator. After inspecting its structure and source code, I found that its execute() function calls the boto3 download_fileobj() method, which downloads a file from an S3 bucket to the local drive. You can therefore write a custom Airflow operator and supply it with a partially modified execute() function that relies on the corresponding S3_hook method.
Hope this is helpful for your research.
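To illustrate the download step, here is a minimal sketch in Python built around the boto3 download_fileobj() call mentioned above. The function name download_s3_object, its parameters, and the optional client argument are assumptions made for this example; they are not part of Airflow or of S3_to_hive_operator. It also assumes AWS credentials are already configured on the worker.

```python
from pathlib import Path


def download_s3_object(bucket, key, local_path, client=None):
    """Download s3://<bucket>/<key> to local_path and return that path.

    Hypothetical helper for illustration; `client` is any object exposing
    boto3's S3 client method download_fileobj(Bucket, Key, Fileobj).
    """
    if client is None:
        # Imported lazily so the helper can also be used with a stub client.
        import boto3

        client = boto3.client("s3")

    target = Path(local_path)
    # Create the destination directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)

    with target.open("wb") as fileobj:
        # download_fileobj streams the object into a binary file-like object.
        client.download_fileobj(bucket, key, fileobj)

    return str(target)
```

A function like this could be called from the execute() method of a custom operator, or passed as python_callable to a PythonOperator with the bucket, key, and local path supplied through op_kwargs. Note that the file lands on whichever machine runs the task, which may not be the machine you have in mind if workers are remote.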
Upvotes: 2