MarcJohnson

Reputation: 712

AWS S3 - Etag Sha256 instead of Md5

I want to use SHA-256 for the checksum of my objects, but it looks like Amazon uses MD5 in the ETag.

Is there any workaround?

Upvotes: 9

Views: 22328

Answers (3)

nealmcb

Reputation: 13471

This is possible as of 2022-02-25. S3 now features a Checksum Retrieval function GetObjectAttributes:

New – Additional Checksum Algorithms for Amazon S3 | AWS News Blog

Checksum Retrieval – The new GetObjectAttributes function returns the checksum for the object and (if applicable) for each part.

This function supports SHA-1, SHA-256, CRC-32, and CRC-32C for checking the integrity of the transmission.
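To compare a local file against the SHA-256 value S3 reports, it helps to know that S3 returns these checksums base64-encoded rather than as hex. Below is a minimal sketch using only the JDK; the class name `S3ChecksumCompare` and the `reportedChecksum` parameter are hypothetical names for illustration, and fetching the value from S3 is left out. As far as I understand, for multipart uploads the reported value may be a composite of the per-part checksums, so this direct comparison is only expected to hold for single-part uploads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper: not part of the AWS SDK.
final class S3ChecksumCompare {

    /** Streams the input and returns its SHA-256 digest, base64-encoded. */
    static String sha256Base64(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[8192];
            int n;
            // Feed the digest chunk by chunk so large files need not fit in memory.
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
            return Base64.getEncoder().encodeToString(digest.digest());
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the local data hashes to the checksum S3 reported. */
    static boolean matches(InputStream localData, String reportedChecksum) {
        return sha256Base64(localData).equals(reportedChecksum);
    }
}
```

Pass in a `FileInputStream` for the local copy and the checksum string taken from the GetObjectAttributes response.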

I'm so glad that they now have alternatives to the sad choice of MD5, which is not optimal for anything in particular and was broken for other purposes long ago. See also related discussion of quirks with their MD5 approach at How to get the md5sum of a file on Amazon's S3.

[And while I'm discussing hashes for various purposes, note that a good one for hash-table lookups and other situations that need some basic randomness and security properties is HighwayHash: Fast strong hash functions: SipHash/HighwayHash]

Upvotes: 13

user818510

Reputation: 3622

Unfortunately, there's no direct way to make S3 use SHA-256 for the ETag. You could use S3 metadata as a workaround: calculate the SHA-256 checksum yourself and store it in user-defined S3 object metadata on each upload. User-defined metadata is just a set of key-value pairs you can assign to your object. You'll have to set the checksum when you PUT your object and compare it yourself on GET/HEAD object, since S3 does not validate user-defined metadata against the content.

More information is available in the S3 documentation:

AWS - Object Key and Metadata
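A rough sketch of the client-side half of this workaround, using only the JDK. The class name `Sha256Metadata` and the metadata key `sha256` are assumptions made for illustration (S3 transmits user-defined metadata as `x-amz-meta-*` headers, so the key would travel as `x-amz-meta-sha256`). The actual PUT and GET/HEAD calls are omitted; the map stands in for the metadata you would attach to and read back from the object.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: not part of the AWS SDK.
final class Sha256Metadata {

    // Assumed key name; any valid user-defined metadata key would do.
    static final String KEY = "sha256";

    /** Returns the SHA-256 digest of the data as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Builds the user-defined metadata to send along with the upload. */
    static Map<String, String> metadataFor(byte[] data) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(KEY, sha256Hex(data));
        return metadata;
    }

    /** Checks downloaded bytes against the checksum stored in the metadata. */
    static boolean verify(byte[] downloaded, Map<String, String> metadata) {
        String expected = metadata.get(KEY);
        return expected != null && expected.equalsIgnoreCase(sha256Hex(downloaded));
    }
}
```

Call `metadataFor` before the upload and `verify` after the download; a missing key is treated as a failed check.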

Upvotes: 6

meeza

Reputation: 704

See: How to calculate SHA-256 checksum of S3 file content

You can compute it in Java with the following steps. Note that this downloads the whole object in order to hash it:

  1. Get InputStream of the S3 Object

InputStream inputStream = amazonS3.getObject(bucket, file).getObjectContent();

  2. Use MessageDigest and DigestInputStream classes for the SHA-256 hash

    import java.io.InputStream;
    import java.security.DigestInputStream;
    import java.security.MessageDigest;

    // Pass "SHA-256" as the algorithm; `log` is the class's existing logger.
    public static String getHash(InputStream inputStream, String algorithm) {
        try (DigestInputStream digestInputStream =
                 new DigestInputStream(inputStream, MessageDigest.getInstance(algorithm))) {
            byte[] buffer = new byte[4096];
            long totalBytes = 0;
            int read;
            // Reading through the stream updates the digest as a side effect.
            while ((read = digestInputStream.read(buffer)) != -1) {
                totalBytes += read;
            }
            log.info("total bytes read: " + totalBytes);
            byte[] hash = digestInputStream.getMessageDigest().digest();
            StringBuilder sb = new StringBuilder();
            for (byte b : hash) {
                sb.append(String.format("%02x", b)); // lowercase hex
            }
            return sb.toString();
        } catch (Exception e) {
            log.error("Failed to compute " + algorithm + " hash", e);
        }
        return null;
    }
    

Upvotes: -2
