Nazar Barabash

Reputation: 11

Hadoop distcp does not skip CRC checks

I am having trouble skipping CRC checks between the source and target paths when running distcp. I copy and decrypt files on demand, so their checksums differ, which is expected.

My command looks like the following:

hadoop distcp -skipcrccheck -update -direct sftp://path s3a://path

When hadoop distcp starts, it prints its configuration, which includes skipCRC=true.

However, the job fails with an error:

Hadoop version: Hadoop 3.2.1-amzn-5

Has anyone had any luck skipping CRC checks?

I updated EMR to 6.9.0 with Hadoop 3.3.3, which was supposed to help according to this Jira, but it didn't, and the job still fails on CRC validation.

Upvotes: 1

Views: 440

Answers (0)
