Pascal

Reputation: 315

MD5 in Objective-C not matching results

I am trying to hash a hex string using MD5. However, the output does not match the output I get for the same input in Python or JavaScript (Node.js).

The input string:

NSString *input = @"001856413a840871624d6c6f553885b5b16fae345c6bdd44afb26141483bff629df47dbd9ad0";


- (NSString *)md5:(NSString *)input {
// b2b22e766b849b8eb41b22ae85dc49b1
unsigned char md5Buffer[CC_MD5_DIGEST_LENGTH];
const char *cStr = [input UTF8String];
CC_MD5_CTX ctx;
CC_MD5_Init(&ctx);
CC_MD5_Update(&ctx, cStr, [input length]);
CC_MD5_Final(md5Buffer, &ctx);

NSMutableString *output = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];
for(int i = 0; i < CC_MD5_DIGEST_LENGTH; i++)
    [output appendFormat:@"%02x",md5Buffer[i]];

return output;

}

In Python I do:

binaryCredentialString = binascii.unhexlify(input)
# Now MD5 hash the binary string    
m = hashlib.md5()
m.update(binaryCredentialString)
# Hex encode the hash to obtain the final credential string
credential = m.hexdigest()

Python gives me the correct MD5 output; Objective-C does not. Why is this?

Any help is truly appreciated - Pascal

Upvotes: 1

Views: 799

Answers (2)

abarnert

Reputation: 365777

This is basically the same answer as dreamlax's, but you apparently didn't understand his answer. Hopefully, I can help (but if so, you should accept his answer).

Your problem is that you're doing an MD5 of the 76-character string @"001856413a840871624d6c6f553885b5b16fae345c6bdd44afb26141483bff629df47dbd9ad0" rather than the 38-byte binary data that string encodes.

In Python, what you feed into the md5 function is this:

binaryCredentialString = binascii.unhexlify(input)

But in ObjC, it's this:

const char *cStr = [input UTF8String];

So, you need an equivalent of unhexlify.
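To see the difference concretely, here is a small Python sketch that hashes the same input both ways. The variable names are mine, not from the question. The first digest is what you get from hashing the hex text itself (which is what the ObjC code effectively does), the second from hashing the decoded bytes:

```python
import binascii
import hashlib

hex_input = "001856413a840871624d6c6f553885b5b16fae345c6bdd44afb26141483bff629df47dbd9ad0"

# What the Objective-C code hashes: the 76 ASCII characters of the hex text.
text_bytes = hex_input.encode("ascii")

# What the Python code hashes: the 38 bytes that the hex text encodes.
raw_bytes = binascii.unhexlify(hex_input)

digest_of_text = hashlib.md5(text_bytes).hexdigest()
digest_of_raw = hashlib.md5(raw_bytes).hexdigest()

# The two inputs have different lengths and contents, so the digests differ.
print(len(text_bytes), digest_of_text)
print(len(raw_bytes), digest_of_raw)
```

If the ObjC output matches the first line rather than the second, the missing decode step is the whole problem.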

I'd handle this by adding methods -(NSString *)hexlify to NSData and -(NSData *)unhexlify to NSString via categories. Then you can just write this code, which is as readable as the Python:

- (NSString *)md5:(NSString *)input {
    // Decode the hex text to raw bytes first, as the Python code does.
    NSData *binaryCredentialString = [input unhexlify];
    NSMutableData *result = [NSMutableData dataWithLength:CC_MD5_DIGEST_LENGTH];
    CC_MD5([binaryCredentialString bytes],
           (CC_LONG)[binaryCredentialString length],
           [result mutableBytes]);
    return [result hexlify];
}

So, how do you write those categories? Using your code, and the reverse of it, something like this (untested, and needs some validation and error handling, but you should get the idea):

@implementation NSData(hexlify)
- (NSString *)hexlify {
    const unsigned char *buf = [self bytes];
    size_t len = [self length];
    NSMutableString *output = [NSMutableString stringWithCapacity:len * 2];
    for (size_t i = 0; i < len; i++)
        [output appendFormat:@"%02x", buf[i]];
    return [NSString stringWithString:output];
}
@end

static unsigned char unhexchar(char c) {
    if (c <= '9') return c - '0';
    if (c <= 'F') return c - 'A' + 10;
    return c - 'a' + 10;
}

@implementation NSString(hexlify)
- (NSData *)unhexlify {
    const char *cStr = [self UTF8String];
    size_t len = [self length];
    NSMutableData *output = [NSMutableData dataWithLength:len / 2];
    unsigned char *buf = [output mutableBytes];
    for (size_t i = 0; i < len / 2; i++)
        buf[i] = unhexchar(cStr[i*2]) * 16 + unhexchar(cStr[i*2+1]);
    return [NSData dataWithData:output];
}
@end

But I'm sure with a bit of searching, you can find a clean, tested, and optimized implementation of these two methods instead of writing them yourself. For example, see this question, this one, and various others at SO.
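As for the validation the unhexlify sketch leaves out: it would need to reject odd-length input and characters that are not hex digits. Here is roughly what that logic looks like, written in Python only because it is shorter to show; the function names unhex_char and unhexlify_checked are made up for this illustration, and in real Python code you would just call binascii.unhexlify:

```python
def unhex_char(c: str) -> int:
    """Return the value 0-15 of a single hex digit, or raise ValueError."""
    if "0" <= c <= "9":
        return ord(c) - ord("0")
    if "a" <= c <= "f":
        return ord(c) - ord("a") + 10
    if "A" <= c <= "F":
        return ord(c) - ord("A") + 10
    raise ValueError(f"not a hex digit: {c!r}")


def unhexlify_checked(text: str) -> bytes:
    """Decode a hex string to bytes, rejecting malformed input."""
    if len(text) % 2 != 0:
        raise ValueError("hex string must have an even number of digits")
    out = bytearray()
    # Each pair of digits becomes one byte: high nibble * 16 + low nibble.
    for i in range(0, len(text), 2):
        out.append(unhex_char(text[i]) * 16 + unhex_char(text[i + 1]))
    return bytes(out)
```

The same two checks (even length, every character in one of the three digit ranges) carry over directly to the ObjC category, where the method could return nil on bad input instead of raising.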

Upvotes: 3

dreamlax

Reputation: 95335

In Python you are decoding the hexadecimal back to binary, but in Objective-C you aren't. Decode your string into an NSMutableData instance and then provide [data bytes] (and [data length]) to the digest function.

Upvotes: 3
