I am having trouble skipping CRC checks between the source and target paths when running distcp. I copy and decrypt files on demand, so their checksums differ, which is expected.
My command looks like this:
hadoop distcp -skipcrccheck -update -direct sftp://path s3a://path
When hadoop distcp starts, it prints its configuration, which includes skipCRC=true.
But the job fails with an error:
Hadoop version: Hadoop 3.2.1-amzn-5
Has anyone had any luck skipping CRC checks?
I updated EMR to 6.9.0 with Hadoop 3.3.3, which was supposed to help based on this Jira, but it didn't, and the job still fails on CRC validation.