Stefan Ss

Reputation: 65

Apache Crunch Job On AWS EMR using Oozie

Context:

Issue:

Things I've seen:

Example of Oozie pipeline:

Upvotes: 0

Views: 23

Answers (1)

Stefan Ss

Reputation: 65

The issue was with the fs.defaultFS Hadoop property. We were using viewfs, so the output paths given to Apache Crunch were prefixed with viewfs://, and because of this Crunch was not able to write to HDFS. We fixed it by setting fs.defaultFS to hdfs:// for the writing phase. The input is read from an S3 bucket that is mounted as /folder_name on HDFS, so for the reading phase the file paths had to be prefixed with viewfs://.
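
To make the two path conventions concrete, here is a minimal sketch of the scheme handling only, not our actual job code. The class name, the qualify helper, and the authorities namenode:8020 and cluster are hypothetical placeholders. It uses only java.net.URI, so it shows how the paths end up prefixed and does not touch Hadoop or Crunch itself.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, for illustration only.
class PathSchemes {

    // Returns the path re-qualified with the given scheme and authority,
    // dropping any scheme/authority the input already carried.
    static String qualify(String path, String scheme, String authority) {
        String plainPath = URI.create(path).getPath();
        try {
            return new URI(scheme, authority, plainPath, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot qualify path: " + path, e);
        }
    }

    public static void main(String[] args) {
        // Writing phase: output paths must resolve against hdfs://, not viewfs://.
        // "namenode:8020" is an assumed authority, not taken from the real cluster.
        String output = qualify("viewfs://cluster/output/run1", "hdfs", "namenode:8020");

        // Reading phase: the S3 bucket is mounted as /folder_name, so inputs keep viewfs://.
        // "cluster" is an assumed mount table name.
        String input = qualify("/folder_name/part-00000", "viewfs", "cluster");

        System.out.println("write to:  " + output);
        System.out.println("read from: " + input);
    }
}
```

In the real job the same distinction is made through fs.defaultFS in the Hadoop configuration for the writing phase, while the input paths are passed to Crunch with the viewfs:// prefix.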

Upvotes: 0
