shva
shva

Reputation: 539

Understanding two different ways of implementing CRC generation with LFSR

There are two ways of implementing CRC generation with linear feedback shift registers (LFSR), as shown in this figure CRC LFSR. The coefficients of generator polynomial in this picture are 100111, and the red "+" circles are exclusive-or operators. The initialization register values are 00000 for both.

For example, if the input data bit stream is 10010011, both A and B will give CRC checksum of 1010. The difference is A finishes with 8 shifts, while B with 8+5=13 shifts because of the 5 zeros appended to the input data. I can understand B very easily since it closely mimics the modulo-2 division. However, I can not understand mathematically how A can give the same result with 5 less shifts. I heard people were talking A took advantage of the pre-appending zeros, but I didn't get it. Can anyone explain it to me? Thanks!

Upvotes: 7

Views: 14304

Answers (3)

Dave Null
Dave Null

Reputation: 1

Simpler intuitive (I hope) non-math explanation ...

Those diagrams of shift register implementations are exactly the same, i.e. one of them is "wrong" (not what you were trying to demonstrate, but I understand what your intention was).

Consider only diagram (B) for the following, but either will work for this.

[This would be easier to understand using CRC-8: C(x) = x^8 + x^2 + x^1 + x^0]

The LFSR is 5 bits long, and the message is 8 bits long, so it will take 8 shifts to get the message completely shifted-in. The last 3 bits of the message don't cause feedback until they are shifted-out, so an additional 5 zero bits have to be shifted-in to cause feedback. Voila, 13 shifts!

To answer why in one algorithm's case more shifts are needed, the difference is the initial LFSR value used!

More Shifts Algorithm (as above)

Set the initial LFSR value (say to 0, but it doesn't matter). The message is then shifted-in and it will have to be followed by 5 extra zero bits to get feedback on all the message bits.

Less Shifts Algorithm

Set the initial LFSR value to be XORed with 5 bits of the message, then the step of shifting-in the message is already (mostly) complete, effectively 5 shifts already done!

Now shift in the 3 remaining message bits followed by 5 zero bits = 8 shifts!

The message is longer than the LFSR, so 3 zero bits can be appended to the initial LFSR value before being XORed with the message.

Setting the LFSR value in this way is doing two things at once: setting the LFSR initial value and effectively shifting-in 5 bits of the message.

In both cases, effectively 13 bits are shifted!

Examples follow for CCITT CRC-8 (in Perl, should be able to translate to C or anything):

#!env perl

use strict;

my @crc8_tbl = (
    0x00, 0x07, 0x0E, 0x09, 0x1C, 0x1B, 0x12, 0x15,
    0x38, 0x3F, 0x36, 0x31, 0x24, 0x23, 0x2A, 0x2D,
    0x70, 0x77, 0x7E, 0x79, 0x6C, 0x6B, 0x62, 0x65,
    0x48, 0x4F, 0x46, 0x41, 0x54, 0x53, 0x5A, 0x5D,
    0xE0, 0xE7, 0xEE, 0xE9, 0xFC, 0xFB, 0xF2, 0xF5,
    0xD8, 0xDF, 0xD6, 0xD1, 0xC4, 0xC3, 0xCA, 0xCD,
    0x90, 0x97, 0x9E, 0x99, 0x8C, 0x8B, 0x82, 0x85,
    0xA8, 0xAF, 0xA6, 0xA1, 0xB4, 0xB3, 0xBA, 0xBD,
    0xC7, 0xC0, 0xC9, 0xCE, 0xDB, 0xDC, 0xD5, 0xD2,
    0xFF, 0xF8, 0xF1, 0xF6, 0xE3, 0xE4, 0xED, 0xEA,
    0xB7, 0xB0, 0xB9, 0xBE, 0xAB, 0xAC, 0xA5, 0xA2,
    0x8F, 0x88, 0x81, 0x86, 0x93, 0x94, 0x9D, 0x9A,
    0x27, 0x20, 0x29, 0x2E, 0x3B, 0x3C, 0x35, 0x32,
    0x1F, 0x18, 0x11, 0x16, 0x03, 0x04, 0x0D, 0x0A,
    0x57, 0x50, 0x59, 0x5E, 0x4B, 0x4C, 0x45, 0x42,
    0x6F, 0x68, 0x61, 0x66, 0x73, 0x74, 0x7D, 0x7A,
    0x89, 0x8E, 0x87, 0x80, 0x95, 0x92, 0x9B, 0x9C,
    0xB1, 0xB6, 0xBF, 0xB8, 0xAD, 0xAA, 0xA3, 0xA4,
    0xF9, 0xFE, 0xF7, 0xF0, 0xE5, 0xE2, 0xEB, 0xEC,
    0xC1, 0xC6, 0xCF, 0xC8, 0xDD, 0xDA, 0xD3, 0xD4,
    0x69, 0x6E, 0x67, 0x60, 0x75, 0x72, 0x7B, 0x7C,
    0x51, 0x56, 0x5F, 0x58, 0x4D, 0x4A, 0x43, 0x44,
    0x19, 0x1E, 0x17, 0x10, 0x05, 0x02, 0x0B, 0x0C,
    0x21, 0x26, 0x2F, 0x28, 0x3D, 0x3A, 0x33, 0x34,
    0x4E, 0x49, 0x40, 0x47, 0x52, 0x55, 0x5C, 0x5B,
    0x76, 0x71, 0x78, 0x7F, 0x6A, 0x6D, 0x64, 0x63,
    0x3E, 0x39, 0x30, 0x37, 0x22, 0x25, 0x2C, 0x2B,
    0x06, 0x01, 0x08, 0x0F, 0x1A, 0x1D, 0x14, 0x13,
    0xAE, 0xA9, 0xA0, 0xA7, 0xB2, 0xB5, 0xBC, 0xBB,
    0x96, 0x91, 0x98, 0x9F, 0x8A, 0x8D, 0x84, 0x83,
    0xDE, 0xD9, 0xD0, 0xD7, 0xC2, 0xC5, 0xCC, 0xCB,
    0xE6, 0xE1, 0xE8, 0xEF, 0xFA, 0xFD, 0xF4, 0xF3,
);

sub crc8fast
{
    return $crc8_tbl[($_[0] ^ $_[1]) & 0xFF];
}

sub crc8slow
{
    my @c = ();
    my $r = 0;

    #
    # Input 8 data bits
    #

    for (my $i = 0; $i < 8; $i++) {
            push @c, ((($_[0] ^ $_[1]) & (1 << $i)) ? 1: 0);
    }

    #
    # Append 8 zero bits
    #

    for (my $i = 0; $i < 8; $i++) {
            #
            # C(x) = x^8 + x^2 + x^1 + x^0
            #

            my $out = $c[7];
            $c[7] = $c[6];
            $c[6] = $c[5];
            $c[5] = $c[4];
            $c[4] = $c[3];
            $c[3] = $c[2];
            $c[2] = $c[1] ^ $out;
            $c[1] = $c[0] ^ $out;
            $c[0] = 0 ^ $out;
    }

    #
    # Collect result bits
    #

    for (my $i = 0; $i < 8; $i++) {
            $r <<= 1;
            $r += $c[7 - $i];
    }

    return $r;
}

sub crc8slower
{
    my @c = ();
    my @d = ();
    my $r = 0;

    for (my $i = 0; $i < 8; $i++) {
            push @c, 0;
    }

    for (my $i = 0; $i < 8; $i++) {
            push @d, ((($_[0] ^ $_[1]) & (1 << $i)) ? 1: 0);
    }

    #
    # Input 8 data bits
    #

    for (my $i = 0; $i < 8; $i++) {
            #
            # C(x) = x^8 + x^2 + x^1 + x^0
            #

            my $out = $c[7];
            $c[7] = $c[6];
            $c[6] = $c[5];
            $c[5] = $c[4];
            $c[4] = $c[3];
            $c[3] = $c[2];
            $c[2] = $c[1] ^ $out;
   }

    #
    # Append 8 zeros
    #

    for (my $i = 0; $i < 8; $i++) {
            my $out = $c[7];
            $c[7] = $c[6];
            $c[6] = $c[5];
            $c[5] = $c[4];
            $c[4] = $c[3];
            $c[3] = $c[2];
            $c[2] = $c[1] ^ $out;
            $c[1] = $c[0] ^ $out;
            $c[0] = 0 ^ $out;
    }

    #
    # Collect result bits
    #

    for (my $i = 0; $i < 8; $i++) {
            $r <<= 1;
            $r += $c[7 - $i];
    }

#   printf("### 0x%02X 0x%02X = 0x%02X\n", $_[0], $_[1], $r);

    return $r;
}

if (1) {
    printf("uint8_t crc8_tbl[] = {\n");
    for (my $i = 0; $i < 256; $i++) {
            printf(" ") unless (($i % 8) == 0);
            printf("\t") if (($i % 8) == 0);
            printf("0x%02X,", crc8slower(0x00, $i));
            printf("\n") if (($i % 8) == 7);
    }

my $c;

$c = 0xFF;
$c = crc8slower($c, 0x01);
$c = crc8slower($c, 0x02);
$c = crc8slower($c, 0x04);
$c = crc8slower($c, 0x08);
$c = crc8slower($c, 0x10);
$c = crc8slower($c, 0x20);
$c = crc8slower($c, 0x40);
$c = crc8slower($c, 0x80);
printf("crc8slower: 0x%02X\n", $c);
die(sprintf("Error: Wrong result. Expected 0x3A but calculated 0x%02X\n", $c)) unless ($c == 0x3A);

$c = 0xFF;
$c = crc8slow($c, 0x01);
$c = crc8slow($c, 0x02);
$c = crc8slow($c, 0x04);
$c = crc8slow($c, 0x08);
$c = crc8slow($c, 0x10);
$c = crc8slow($c, 0x20);
$c = crc8slow($c, 0x40);
$c = crc8slow($c, 0x80);
printf("crc8slow: 0x%02X\n", $c);
die(sprintf("Error: Wrong result. Expected 0x3A but calculated 0x%02X\n", $c)) unless ($c == 0x3A);

$c = 0xFF;
$c = crc8fast($c, 0x01);
$c = crc8fast($c, 0x02);
$c = crc8fast($c, 0x04);
$c = crc8fast($c, 0x08);
$c = crc8fast($c, 0x10);
$c = crc8fast($c, 0x20);
$c = crc8fast($c, 0x40);
$c = crc8fast($c, 0x80);
printf("crc8fast: 0x%02X\n", $c);
die(sprintf("Error: Wrong result. Expected 0x3A but calculated 0x%02X\n", $c)) unless ($c == 0x3A);

# # # #

Upvotes: 0

moibrahim
moibrahim

Reputation: 89

You can say that architecture (A) is implementing the modulo division by aligning MSB of the polyn with MSB of Message, so it is implementing something like the following (in my example I have another crc polyn actually):

enter image description here

But in Architecture (B), you can say we try to predict the MSB of the Message, so we align MSB of CRC polyn with MSB-1 of the message, something like the following:

enter image description here

I can recommend details about this operation in this tutorial

Upvotes: 0

Kanad
Kanad

Reputation: 61

Here is my quick understanding.

Let M(x) be the input message of order m (i.e. has m+1 bits) and G(x) be the CRC polynomial of order n. CRC result for such a message is given by

C(x) = (M(x) * xn) % G(x)

This is what the circuit B is implementing. The additional 5 cycles it takes is because of the xn operation.

Instead of following this approach, circuit A tries to do something smarter. Its trying to solve the question

If C(x) is the CRC of M(x), what would be the CRC for message {M(x), D}

where D is the new bit. So its trying to solve one bit at a time instead of entire message as in case of circuit b. Hence circuit A will take just 8 cycles for a 8 bit message.

Now since you already understand why circuit B looks the way it does, lets look at circuit A. The math, specifically for your case, for the effect of adding bit D to message M(x) on CRC is as below

Let current CRC C(x) be {c4, c3, c2, c1, c0} and new bit that is shifted in be D
NewCRC = {M(x), D}*x5) % G(x) = (({M(x), 0} * x5) % G(x)) XOR ((D * x5) % G(x))
which is ({c3, c2, c1, c0, 0} XOR {0, 0, c4, c4, c4}) XOR ({0, 0, D, D, D})
which is {c3, c2, c1^c4^D, c0^c4^D, c4^D}

i.e. the LFSR circuit A.

Upvotes: 6

Related Questions