jasonmclose
jasonmclose

Reputation: 1695

crc16 algorithm from C++ to bash

I am trying to implement a CRC16 checksum in bash. I'm porting from an existing piece of C++ code. I'm almost there, but I am getting different answers.

I don't quite see why the checksums between the C++ code and the bash script are different.

Another set of eyes would be a big help.

Here is the C++ code:

uint16_t Encoder::checksum(std::string thestring)
{
    uint8_t d, e, f;
    uint16_t c, r, crccalc;
    c = 0xffff;

    for (unsigned int i = 0; i < thestring.length(); i++)
    {
        d = thestring[i];
        e = c ^ d;
        f = e ^ (e << 4);
        r = (c >> 8) ^ (f << 8) ^ (f << 3) ^ (f >> 4);
        c = r;
    }
    c ^= 0xffff;
    crccalc = c;
    return crccalc;
}

And here is my bash code:

function calc_crc16()
{
    string=$1
    while read -d "" -n 1 ; do astring+=( "$reply" ) ; done <<< "$string"
    
    cnt=${#astring[@]}
    c=0xffff
    
    for ((x=0;x<$cnt;x++)); do
        char=${astring[$x]}
        e=$(($c ^ $char))
        s=$(($e << 4))
        f=$(($e ^ $s))
        t1=$(($c >> 8))
        t2=$(($f << 8))
        t3=$(($f << 3))
        t4=$(($f >> 4))
        r1=$(($t1 ^ $t2 ^ $t3 ^ $t4))
        c=$r1
    done
    c=$c ^ 0xffff
    echo "checksum = $c"
}

Is it going to have something to do with the size of the ints? I'm guessing there's not much I can do about that in bash.

I'm getting an actual number, but it doesn't match the C++, which I know works correctly. does anyone see anything where I may be screwing things up?

Upvotes: 1

Views: 5487

Answers (4)

Max
Max

Reputation: 1

shorter (and faster) version
changes:
- the while loop at the beginning is not needed
- r1 is not needed
- cnt is not needed
- use capital letters for bash variables
- one less "\" (backslash) in char=$(printf …
- removed leading 0s for 0xFF

function calc_crc16() {
  CRC=0xFFFF

  for ((X=0; X<${#1}; X++)); do
    CHAR=$(printf '%d' "'${1:$X:1}")
    E=$(((CRC ^ CHAR) & 0xFF))
    S=$(((E << 4)     & 0xFF))
    F=$(((E ^ S)      & 0xFF))
    CRC=$(((CRC >> 8) ^ (F << 8) ^ (F << 3) ^ (F >> 4)))
  done

  let CRC^=0xFFFF
  printf "0x%X\n" $CRC
}

Upvotes: 0

jasonmclose
jasonmclose

Reputation: 1695

just for future reference, here is the awk script that i came up with.

this works just as fast as the C++ code i have, which is basically instantaneous. the bash takes about 10 seconds to run for the same string. the awk is much faster.

function awk_calc_crc16()
{
    output=$(echo $1 | awk 'function ord(c){return chmap[c];}
    BEGIN{c=65535; for (i=0; i < 256; i++){ chmap[sprintf("%c", i)] = i;}}
    {
        split($0, chars, "");
        for(i = 1; i <= length(chars); i++)
        {
            cval=ord(chars[i])
            e=and(xor(c, ord(chars[i])), 0x00FF);
            s=and(lshift(e, 4), 0x00FF);
            f=and(xor(e, s), 0x00FF);
            r=xor(xor(xor(rshift(c, 8), lshift(f, 8)), lshift(f, 3)), rshift(f, 4));
            c=r;
        }
    }
    END{c=xor(c, 0xFFFF); printf("%hu", c);}')
    echo $output;
}

Upvotes: 3

jasonmclose
jasonmclose

Reputation: 1695

ok. with Sorpigal's help, i've got a working version.

i suspect this could all be done within an awk script, which may run a lot faster. i may try that next.

thank you all for the help. i don't mean to steal the solution here, but i worked on it, and figure it is worth putting up.

anyway, here is a working version:

function calc_crc16()
{
    while read -r -d "" -n 1 ; do astring+=( "$REPLY" ) ; done <<< "$1"

    cnt=${#1}
    c=65535

    for ((x=0;x<$cnt;x++)); do
        char=$(printf '%d' \'"${1:$x:1}")
        e=$(((c ^ char) & 0x00FF))
        s=$(((e << 4) & 0x00FF))
        f=$(((e ^ s) & 0x00FF))
        r1=$(((c >> 8) ^ (f << 8) ^ (f << 3) ^ (f >> 4)))
        c=$r1
    done
    c=$((c ^ 0xffff))
    echo "checksum = $c"
}

Upvotes: 3

sorpigal
sorpigal

Reputation: 26096

First problem is near the top

while read -d "" -n 1 ; do astring+=( "$reply" ) ; done <<< "$string"

$reply is wrong, since you didn't specify a variable name for read the name is $REPLY.

Next error is at the end

c=$c ^ 0xffff

This should be

c=$(($c ^ 0xffff))

At least this way it will run without errors, correctness and appropriateness is something else.

Correctness issues: What if the input string has a space? This will break horribly. Always quote variable exapnsions

Change

char=${astring[$x]}

to

char="${astring[$x]}"

Strangely this rule is different inside $(()) constructs. Your bit operations should reference variables without any $ in these cases

e=$(( c ^ char ))
s=$(( e << 4 ))
f=$(( e ^ s ))
t1=$(( c >> 8 ))
t2=$(( f << 8 ))
t3=$(( f << 3 ))
t4=$(( f >> 4 ))
r1=$(( t1 ^ t2 ^ t3 ^ t4))

And later

c=$(( c ^ 0xffff ))

This will cause variables to be expanded and whitespace not to blow things up.

In general you should also pass -r to read, see help read for what it does.

Why make an extra copy of $1 before processing it into an array? Using

while read -d "" -n 1 ; do astring+=( "$REPLY" ) ; done <<< "$1"

is sufficient.

It's probably not necessary to turn your input into an array before processing. Instead you can slice characters out of the string in your loop, which is closer to what the C++ version is doing. Replace

char="${astring[$x]}"

with

char="${1:$x:1}"

This is operating directly on the function paramter; since we're no longer making a copy of that we also need to get $cnt a different way

cnt=${#1}

But you really have even bigger problems than this, like the fact that a character isn't an integer in bash. To convert you must use the following syntax:

printf '%d' \'a

where a is the character to convert. Inserting this into the context of the script it would be

char=$(printf '%d' \'"${1:$x:1}")

Now we're getting somewhere, but I really must ask you to consider whether all of this is really worth it. Even if you can make it work, what do you gain?

Upvotes: 4

Related Questions