Reputation: 11
I'm trying to get HdfsSensor working. I have set up the HDFS connection and the file exists, but the sensor keeps poking for the file and never completes:
Poking for file hdfs://user/airflow/stamps/test/ds=2018-10-15/_SUCCESS
The code is as follows:
hdfs_sense_open = HdfsSensor(
    task_id='hdfs_sense_open',
    filepath='hdfs://user/airflow/stamps/test/ds=2018-10-15/_SUCCESS',
    hdfs_conn_id='hdfs_leo',
    dag=dag)
Actually, it works without the file name in the path. One more point: when you create the HDFS connection, you need to use the HDFS (NameNode RPC) port, not the WebHDFS port, i.e. 8020 (or possibly 9000 if it's localhost) rather than a WebHDFS port like 50070.
hdfs_sense_open = HdfsSensor(
    task_id='hdfs_sense_open',
    filepath='/user/airflow/stamps/test/ds=2018-10-15/',
    hdfs_conn_id='hdfs_leo',
    dag=dag)
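For the port remark above, here is a rough sketch of how the connection could be supplied through an environment variable. It assumes Airflow's `AIRFLOW_CONN_<CONN_ID>` convention for URI-style connections; the host name `namenode.example.com` is hypothetical, so substitute your own NameNode, and check the port against your cluster configuration.

```python
import os
from urllib.parse import urlparse

# Hypothetical NameNode host (an assumption, not from the question).
NAMENODE_HOST = "namenode.example.com"
# HDFS RPC port as suggested above; 50070 would be the WebHDFS port.
NAMENODE_RPC_PORT = 8020

# Build a URI-style connection string for the 'hdfs_leo' connection id.
conn_uri = "hdfs://{}:{}".format(NAMENODE_HOST, NAMENODE_RPC_PORT)

# Assumption: Airflow picks up connections from AIRFLOW_CONN_<CONN_ID>.
os.environ["AIRFLOW_CONN_HDFS_LEO"] = conn_uri

# Sanity check that the URI carries the RPC port, not the WebHDFS one.
parsed = urlparse(conn_uri)
print(parsed.hostname, parsed.port)
```

The same host and port can also be entered in the connection form in the Airflow UI; the environment variable is just one way to keep it in configuration.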
Thank you both so much for trying to help me out.
Upvotes: 1
Views: 5076
Reputation: 45361
Try setting the filepath without the protocol, like this:
hdfs_sense_open = HdfsSensor(
    task_id='hdfs_sense_open',
    filepath='/user/airflow/stamps/test/ds=2018-10-15/_SUCCESS',
    hdfs_conn_id='hdfs_leo',
    dag=dag)
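One possible reason the protocol gets in the way: under generic URI rules, the segment right after `hdfs://` is the authority (host), so `user` would be read as a host name rather than as the first directory. The sketch below only uses Python's standard `urllib.parse` to show that split; it does not call Airflow, the helper name `describe_hdfs_path` is made up for illustration, and whether the sensor parses the path this way is an assumption.

```python
from urllib.parse import urlparse


def describe_hdfs_path(filepath):
    """Split a path or URI into scheme, authority and path components."""
    parsed = urlparse(filepath)
    return {
        "scheme": parsed.scheme,
        "authority": parsed.netloc,
        "path": parsed.path,
    }


# Path from the question: 'user' lands in the authority slot.
with_protocol = describe_hdfs_path(
    "hdfs://user/airflow/stamps/test/ds=2018-10-15/_SUCCESS")

# Path from this answer: no scheme, no authority, full absolute path.
without_protocol = describe_hdfs_path(
    "/user/airflow/stamps/test/ds=2018-10-15/_SUCCESS")

print(with_protocol)
print(without_protocol)
```

With the plain absolute path, the host and port come only from the `hdfs_leo` connection, which keeps the DAG free of cluster-specific details.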
Upvotes: 1