e-nouri

Reputation: 2626

Why is this CRC not a 2 byte string?

Using the string "000016037000" and this CRC16 function, the result is not a 2-byte string. Why is that?

def _crc16(data, bits=8):
    """private method: for calculating the CRC
    based on the standard EN50463-4

    Arguments:
        Input: data in ASCII encoding !
        Output: CRC of the data
    """
    crc = 0xFFFF
    for l in list(data):
        """or exclusive
        bin: gives the binary representation
        int: cast the string to an int with the base2
        ord: gives the ASCII code for the character between (0..255)
        """
        crc = crc ^ int(bin(ord(l)), 2)
        for bit in range(0, bits):
            if (crc & 0x0001) == 0x0001:
                crc = ((crc >> 1) ^ 0xA001)
            else:
                crc = crc >> 1

    return _typecasting(crc)


def _typecasting(crc):
    """gives the msb and lsb"""
    msb = hex(crc >> 8)
    lsb = hex(crc & 0x00FF)

    return lsb + msb

data = "000016037000"

print _crc16(data)

This is the result: 0x00xfc, which becomes 0fc once the '0x' prefixes are stripped. A CRC16 is supposed to generate a 2-byte checksum; is it normal that the LSB is 0?

Upvotes: 0

Views: 1384

Answers (3)

Yann Vernier

Reputation: 15877

You seem to have quite a few unnecessary conversions, and the question is about one of the effects. I'll try to explain them in execution order.

for l in list(data):

Here you convert a string into a list of letters, each of which is a string in itself (Python doesn't use a char data type). This works, but only because a string can already be iterated character by character; the list() call is unnecessary, so just remove it.

    crc = crc ^ int(bin(ord(l)), 2)

As a side note, ord() actually gets us the ordinal number; it's not certain to be ASCII (in fact, no code >127 is in ASCII). Once we have this number, you converted it to a text representation in binary and back; being paired, both conversions are redundant.
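A quick way to convince yourself of both points is the small check below. It is only a sketch, written so that it also runs on Python 3 (the question uses Python 2): iterating over the string directly yields the same characters as list(data), and the bin()/int() round trip hands back the value ord() already produced.

```python
data = "000016037000"

# Iterating over the string yields the same one-character strings as list(data).
assert [c for c in data] == list(data)

for c in data:
    code = ord(c)                   # ordinal of the character, e.g. 48 for "0"
    round_trip = int(bin(code), 2)  # bin() gives e.g. "0b110000"; int(..., 2) parses it back
    assert round_trip == code       # the paired conversions change nothing
```

So the loop body can use crc ^ ord(l) directly.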

msb = hex(crc >> 8)
lsb = hex(crc & 0x00FF)
return lsb + msb

Each call to hex() converts into a hexadecimal representation. As with bin(), this is in the form of a Python numeric literal, so they're each prefixed with 0x. Concatenating them yields a somewhat strange format (though still recoverable, it doesn't resemble any common ones). At this point it might be nice to know what you were aiming for.

One guess is that you wanted a little-endian 16-bit unsigned integer in 4-digit hexadecimal format (a byte-oriented hex dump). We can express that using Python's standard library:

import binascii, struct
le16hex = binascii.b2a_hex(struct.pack('<H', crc))

Here < marks little-endian byte order, H marks an unsigned 16-bit value, and b2a_hex converts binary data to hex. If we just wanted a 4-digit hex value (which incidentally matches the big-endian form), we can use "%04x" % crc.
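To make the difference between the three forms concrete, here is a minimal sketch. The value 0xFC00 is an assumption, inferred from the 0x00xfc output shown in the question rather than recomputed here.

```python
import binascii
import struct

crc = 0xFC00  # assumed value, inferred from the output in the question

packed = struct.pack('<H', crc)     # two raw bytes, least significant byte first
le16hex = binascii.b2a_hex(packed)  # hex dump of those two bytes
be16hex = "%04x" % crc              # plain 4-digit hex, most significant byte first

print(len(packed))  # 2
print(le16hex)      # b'00fc' on Python 3, '00fc' on Python 2
print(be16hex)      # fc00
```

Only packed is a two-byte string; the two hex forms are four characters each.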

However, you also ask why the result is not a two byte string. That's because you requested it in hex; struct.pack above produces exactly a two byte string. Combined with your input of an even number of digits, I'm left to wonder if you mean to process binary data (as ord and struct do) or all hexadecimal (or even octal). A little more context is required to understand this.

As for the least significant byte being 0, that's an effect of this particular string; that it is only shown with one digit is because hex() doesn't produce more than necessary. The % formatting operation can produce specific numbers of digits.

Upvotes: 1

gog

Reputation: 11347

Yes, it's normal. It's just like the 0 in 10. BTW, your main loop is a bit too verbose; how about:

crc = 0xFFFF
for l in data:
    crc ^= ord(l)
    for bit in range(0, bits):
        if crc & 1:
            crc = (crc >> 1) ^ 0xA001
        else:
            crc >>= 1

The typecasting function, which appears to swap the lsb and msb, can be written more concisely as

def byteswap(crc):
    return (crc >> 8) | ((crc & 0x00FF) << 8)

Note that to avoid hassle, both functions should work with integers only; there is no need for hex or bin.
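Put together, an integer-only version might look like the sketch below. The name crc16 and the "123456789" test input are not from the question. That input is the conventional CRC check string, and 0x4B37 is the commonly published check value for the CRC-16/MODBUS parameter set, which uses the same initial value and reflected polynomial as this loop.

```python
def crc16(data, bits=8):
    """Integer-only CRC: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for ch in data:
        crc ^= ord(ch)                     # fold the next character into the low byte
        for _ in range(bits):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001  # low bit set: shift, then apply the polynomial
            else:
                crc >>= 1                  # low bit clear: shift only
    return crc


def byteswap(crc):
    """Swap the two bytes of a 16-bit value."""
    return (crc >> 8) | ((crc & 0x00FF) << 8)


crc = crc16("123456789")
print("%04x" % crc)            # 4b37
print("%04x" % byteswap(crc))  # 374b
```

Formatting happens only at the very end, when the number has to be shown or transmitted.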

Upvotes: 1

Barmar

Reputation: 780673

The hex() function returns a string with 0x on the front of it. So in your typecasting function, you have:

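lsb = "0x0"
msb = "0xfc"

When you concatenate them, you get 0x at the front and also in the middle. You should remove the 0x from msb before concatenating:

return lsb + msb[2:]

Then you'll get 0x0fc. Note that hex() does not pad to two digits, so the low byte still shows as a single 0; a fixed-width 0x00fc needs zero-padded formatting such as "%02x".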
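Since hex() never pads with leading zeros, a fixed-width result is easier to get from a format specifier than from slicing. A minimal sketch follows; typecasting is a hypothetical rewrite of the question's _typecasting, and 0xFC00 is an assumed value inferred from the output in the question.

```python
def typecasting(crc):
    """Return the low byte then the high byte, each as exactly two hex digits."""
    lsb = crc & 0x00FF
    msb = crc >> 8
    return "%02x%02x" % (lsb, msb)


crc = 0xFC00  # assumed value, inferred from the output in the question

print(hex(crc & 0x00FF))  # 0x0   (no padding)
print(hex(crc >> 8))      # 0xfc
print(typecasting(crc))   # 00fc  (always four digits)
```

Prepend "0x" to the format string if the prefix is wanted in the output.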

Upvotes: 1
