xuanyue

Reputation: 1428

Make HDFS calculate checksum for a local file

I'm trying to calculate the checksum of a local file using hadoop fs -checksum, but it only returns NONE.

[centos@sandbox tmp]$ hadoop fs -checksum file:///user/centos//a.json
file:///user/centos/a.json NONE

I have tried using

hadoop fs -copyFromLocal a.json file:///user/centos/a.json

so that a .a.json.crc file is generated in the local folder /user/centos, but the checksum result is still NONE.

How can I make Hadoop calculate the checksum of a local file?

Upvotes: 1

Views: 1269

Answers (1)

Chris Nauroth

Reputation: 9844

hadoop fs -checksum currently does not have the capability to calculate a checksum on a file from the local file system. Potential workarounds are:

  • Apache JIRA HADOOP-12326 tracks supporting files on the local file system as a target of the hadoop fs -checksum command. If you really need the capability now, then you could potentially download the Hadoop source, apply the patch attached to HADOOP-12326, and create a custom build by following the directions in BUILDING.txt. Please be aware that the patch is not yet approved and committed by the Apache Hadoop community, so use at your own risk.
  • If you are simply looking for a way to carry CRC information with you when you copy a file out of HDFS onto the local file system, then you could pass the -crc argument to the get command.

Example:

hadoop fs -get -crc hello

ls -lrta 
...
-rw-r--r--   1 cnauroth                    cnauroth                       12 Jun 23 15:28 .hello.crc
-rw-r--r--   1 cnauroth                    cnauroth                        6 Jun 23 15:28 hello
...
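
If the goal is only to confirm that a local file has the same bytes as its copy in HDFS, another option is to hash the content on both sides with an ordinary digest tool instead of relying on hadoop fs -checksum. The sketch below is a rough illustration, not part of Hadoop: the function names local_md5, hdfs_md5 and same_content are made up for this example, and it assumes bash, md5sum and awk are available. Note that an MD5 of the file content is not the same value that hadoop fs -checksum reports for HDFS files, so the two digests are only comparable with each other.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Hadoop) for comparing file contents.

# Print the MD5 digest of a local file.
local_md5() (
  set -o pipefail
  md5sum "$1" | awk '{print $1}'
)

# Print the MD5 digest of a file's contents as read through Hadoop.
# The whole file is streamed to the client with `hadoop fs -cat`.
hdfs_md5() (
  set -o pipefail   # make a failed `hadoop fs -cat` fail the whole pipeline
  hadoop fs -cat "$1" | md5sum | awk '{print $1}'
)

# Return 0 if both files have identical content, 1 if they differ,
# and 2 if either digest could not be computed.
same_content() {
  local l r
  l=$(local_md5 "$1") || return 2
  r=$(hdfs_md5 "$2") || return 2
  [ "$l" = "$r" ]
}

# Example call (paths are placeholders):
#   same_content /user/centos/a.json hdfs:///user/centos/a.json && echo "match"
```

This reads the full HDFS file over the network, so it is best suited to small or medium files.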

Upvotes: 1
