oxidworks

Reputation: 1642

How to calculate crc32 checksum from a string on linux bash

I used crc32 to calculate checksums from strings a long time ago, but I cannot remember how I did it.

echo -n "LongString" | crc32    # no output

I found a solution [1] to calculate them with Python, but is there not a direct way to calculate that from a string?

# signed
python -c 'import binascii; print binascii.crc32("LongString")'
python -c 'import zlib; print zlib.crc32("LongString")'
# unsigned
python -c 'import binascii; print binascii.crc32("LongString") % (1<<32)'
python -c 'import zlib; print zlib.crc32("LongString") % (1<<32)'

[1] How to calculate CRC32 with Python to match online results?

Upvotes: 21

Views: 67098

Answers (8)

pixelbeat

Reputation: 31748

cksum in GNU coreutils >= 9.6 supports the -a crc32b option. Testing it here gives:

$ printf '123456789' | cksum -a crc32b | cut -d ' ' -f1
3421780262

$ printf '123456789' | cksum --raw -a crc32b | basenc --base16
CBF43926

$ printf '123456789' | cksum --raw -a crc32b | basenc --base2msbf
11001011111101000011100100100110

Upvotes: 0

Léa Gris

Reputation: 19625

Here is a pure Bash implementation:

#!/usr/bin/env bash

declare -i -a CRC32_LOOKUP_TABLE

__generate_crc_lookup_table() {
  local -i -r LSB_CRC32_POLY=0xEDB88320 # The CRC32 polynomial, LSB order
  local -i index byte lsb
  for index in {0..255}; do
    ((byte = 255 - index))
    for _ in {0..7}; do # 8-bit lsb shift
      ((lsb = byte & 0x01, byte = ((byte >> 1) & 0x7FFFFFFF) ^ (lsb == 0 ? LSB_CRC32_POLY : 0)))
    done
    ((CRC32_LOOKUP_TABLE[index] = byte))
  done
}
__generate_crc_lookup_table
typeset -r CRC32_LOOKUP_TABLE

crc32_string() {
  [[ ${#} -eq 1 ]] || return
  local -i i byte crc=0xFFFFFFFF index
  for ((i = 0; i < ${#1}; i++)); do
    byte=$(printf '%d' "'${1:i:1}") # Get byte value of character at i
    ((index = (crc ^ byte) & 0xFF, crc = (CRC32_LOOKUP_TABLE[index] ^ (crc >> 8)) & 0xFFFFFFFF))
  done
  echo $((crc ^ 0xFFFFFFFF))
}

printf 'The CRC32 of: %s\nis: %08x\n' "${1}" "$(crc32_string "${1}")"

# crc32_string "The quick brown fox jumps over the lazy dog"
# yields 414fa339

Testing:

bash ./crc32.sh "The quick brown fox jumps over the lazy dog"
The CRC32 of: The quick brown fox jumps over the lazy dog
is: 414fa339

For the glory of it, here is a POSIX shell grammar version:

Since POSIX shell has no arrays, it uses a hard-coded argument list as the lookup table.

#!/usr/bin/env sh

__generate_crc_lookup_args() {
  lsbCrc32Poly=3988292384 # 0xEDB88320 The CRC32 polynomial, LSB order
  i=0
  while [ "$i" -le 255 ]; do
    byte=$((255 - i))
    for _ in 0 1 2 3 4 5 6 7; do # 8-bit lsb shift
      lsb=$((byte & 0x01))
      byte=$(( ((byte >> 1) & 2147483647) ^ (lsb == 0 ? lsbCrc32Poly : 0) ))
    done
    i=$((i + 1))
    if [ $((i % 8)) -eq 0 ]
    then printf '%10d \\\n' "$byte"
    else printf '%10d ' "$byte"
    fi
  done
  printf \\n
}

# Uncomment to see generated CRC32 lookup arguments that have been hardcoded in crc32Lookup
#__generate_crc_lookup_args && exit

getArg(){ shift "$1"; printf %s\\n "$2";}

crc32Lookup() {

  # CRC32 LOOKUP ARGS
  getArg "$1" \
            0 1996959894 3993919788 2567524794  124634137 1886057615 3915621685 2657392035 \
    249268274 2044508324 3772115230 2547177864  162941995 2125561021 3887607047 2428444049 \
    498536548 1789927666 4089016648 2227061214  450548861 1843258603 4107580753 2211677639 \
    325883990 1684777152 4251122042 2321926636  335633487 1661365465 4195302755 2366115317 \
    997073096 1281953886 3579855332 2724688242 1006888145 1258607687 3524101629 2768942443 \
    901097722 1119000684 3686517206 2898065728  853044451 1172266101 3705015759 2882616665 \
    651767980 1373503546 3369554304 3218104598  565507253 1454621731 3485111705 3099436303 \
    671266974 1594198024 3322730930 2970347812  795835527 1483230225 3244367275 3060149565 \
    1994146192   31158534 2563907772 4023717930 1907459465  112637215 2680153253 3904427059 \
    2013776290  251722036 2517215374 3775830040 2137656763  141376813 2439277719 3865271297 \
    1802195444  476864866 2238001368 4066508878 1812370925  453092731 2181625025 4111451223 \
    1706088902  314042704 2344532202 4240017532 1658658271  366619977 2362670323 4224994405 \
    1303535960  984961486 2747007092 3569037538 1256170817 1037604311 2765210733 3554079995 \
    1131014506  879679996 2909243462 3663771856 1141124467  855842277 2852801631 3708648649 \
    1342533948  654459306 3188396048 3373015174 1466479909  544179635 3110523913 3462522015 \
    1591671054  702138776 2966460450 3352799412 1504918807  783551873 3082640443 3233442989 \
    3988292384 2596254646   62317068 1957810842 3939845945 2647816111   81470997 1943803523 \
    3814918930 2489596804  225274430 2053790376 3826175755 2466906013  167816743 2097651377 \
    4027552580 2265490386  503444072 1762050814 4150417245 2154129355  426522225 1852507879 \
    4275313526 2312317920  282753626 1742555852 4189708143 2394877945  397917763 1622183637 \
    3604390888 2714866558  953729732 1340076626 3518719985 2797360999 1068828381 1219638859 \
    3624741850 2936675148  906185462 1090812512 3747672003 2825379669  829329135 1181335161 \
    3412177804 3160834842  628085408 1382605366 3423369109 3138078467  570562233 1426400815 \
    3317316542 2998733608  733239954 1555261956 3268935591 3050360625  752459403 1541320221 \
    2607071920 3965973030 1969922972   40735498 2617837225 3943577151 1913087877   83908371 \
    2512341634 3803740692 2075208622  213261112 2463272603 3855990285 2094854071  198958881 \
    2262029012 4057260610 1759359992  534414190 2176718541 4139329115 1873836001  414664567 \
    2282248934 4279200368 1711684554  285281116 2405801727 4167216745 1634467795  376229701 \
    2685067896 3608007406 1308918612  956543938 2808555105 3495958263 1231636301 1047427035 \
    2932959818 3654703836 1088359270  936918000 2847714899 3736837829 1202900863  817233897 \
    3183342108 3401237130 1404277552  615818150 3134207493 3453421203 1423857449  601450431 \
    3009837614 3294710456 1567103746  711928724 3020668471 3272380065 1510334235  755167117
}

crc32String() {
  [ $# -gt 0 ] || return
  crc=4294967295 # 0xFFFFFFFF
  i=1
  while [ "$i" -le "${#1}" ]; do
    # Get byte value of character at i
    byte=$(printf '%d' "'$(printf %s "$1" | cut -b $i )")
    bi=$(( bi = (crc ^ byte) & 255 ))
    crc=$(( ($(crc32Lookup $bi) ^ (crc >> 8)) & 4294967295 ))
    i=$((i+1))
  done
  printf %d\\n $((crc ^ 4294967295))
}

printf 'The CRC32 of: %s\nis: %08X\n' "$1" "$(crc32String "$1")"

# crc32String "The quick brown fox jumps over the lazy dog"
# yields 414FA339

Upvotes: 8

C Würtz

Reputation: 864

Or just use the process substitution:

crc32 <(echo -n "LongString")

(EDIT: thx @tor-klingberg)

Upvotes: 31

robert

Reputation: 4867

I ran into this problem myself and didn't want to go to the "hassle" of installing crc32. I came up with this; it's a little nasty, but it should work on most platforms, or at least on most modern Linux systems:

echo -n "LongString" | gzip -1 -c | tail -c8 | hexdump -n4 -e '"%u"'

Some technical details: gzip stores the CRC-32 of the uncompressed data in the 8-byte trailer at the end of its output, the -c option makes it write to standard output, and tail -c8 keeps only those last 8 bytes. (-1 was suggested by @MarkAdler so we waste as little time as possible on the compression itself.)

hexdump was a little trickier and I had to futz about with it for a while before I came up with something satisfactory, but the format here seems to correctly parse the gzip crc32 as a single 32-bit number:

  • -n4 takes only the relevant first 4 bytes of the gzip footer.
  • '"%u"' is your standard fprintf format string that formats the bytes as a single unsigned 32-bit integer. Notice that there are double quotes nested within single quotes here.

If you want a hexadecimal checksum you can change the format string to '"%08x"' (or '"%08X"' for upper case hex) which will format the checksum as 8 character (0 padded) hexadecimal.

Like I say, not the most elegant solution, and perhaps not an approach you'd want to use in a performance-sensitive scenario but an approach that might appeal given the near universality of the commands used.

The weak point here for cross-platform usability is probably the hexdump configuration, since I have seen variations on it from platform to platform and it's a bit fiddly. I'd suggest if you're using this you should try some test values and compare with the results of an online tool.

EDIT: As suggested by @PedroGimeno in the comments, you can pipe the output into od instead of hexdump for the same result without the fiddly options: ... | od -t x4 -N 4 -A n for hex, or ... | od -t u4 -N 4 -A n for unsigned decimal (-t d4 prints a signed value, which comes out negative for checksums of 2^31 or more).
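
The pipeline above can be wrapped in a small function. This is only a sketch, and crc32_gzip is a made-up name: it reads the trailer byte by byte with od -t x1 and reverses the four CRC bytes with awk, relying on the gzip format storing the CRC-32 least significant byte first, so the result should not depend on the host's byte order.

```shell
# Hypothetical helper (not from the original answer): CRC-32 of a string via the gzip trailer.
crc32_gzip() {
  # The last 8 bytes of a gzip stream are CRC-32 then input size, both little-endian.
  # Dump the first 4 of them as single bytes and print them in reverse order.
  printf '%s' "$1" | gzip -1 -c | tail -c 8 | od -A n -N 4 -t x1 |
    awk '{ print $4 $3 $2 $1 }'
}

crc32_gzip 123456789   # prints cbf43926
```

printf '%s' is used instead of echo -n so that strings starting with a dash or containing backslashes are passed through unchanged.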

Upvotes: 37

jimis

Reputation: 902

I use cksum and convert to hex using the shell builtin printf:

$ echo -n "LongString"  | cksum | cut -d\  -f1 | xargs echo printf '%0X\\n' | sh
5751BDB2

The cksum command first appeared on 4.4BSD UNIX and should be present in all modern systems.
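
A slightly shorter way to write the same conversion, as a sketch (cksum_hex is a made-up name), lets the shell's own printf do the formatting instead of going through xargs and sh. Note that plain cksum implements the POSIX CRC, which is a different algorithm from the zlib/gzip CRC-32 used by the crc32 tool, so the two will not agree on the same input.

```shell
# Hypothetical helper: POSIX cksum value of a string as 8 upper-case hex digits.
cksum_hex() {
  # cksum prints "<checksum> <byte count>"; keep only the first field.
  printf '%08X\n' "$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)"
}

cksum_hex "LongString"
```

The %08X format zero-pads the result so that it is always 8 characters wide.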

Upvotes: 10

Woosung

Reputation: 31

You can try to use rhash.

Testing:

## install 'rhash'...
$ sudo apt-get install rhash
## test CRC32...
$ echo -n 123456789 | rhash --simple -
cbf43926  (stdin)

Upvotes: 2

Mark Adler

Reputation: 112442

Your question already has most of the answer.

echo -n 123456789 | python -c 'import sys;import zlib;print(zlib.crc32(sys.stdin.read())%(1<<32))'

correctly gives 3421780262

I prefer hex:

echo -n 123456789 | python -c 'import sys;import zlib;print("%08x"%(zlib.crc32(sys.stdin.read())%(1<<32)))'
cbf43926
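
On systems where only Python 3 is available, the commands above fail because zlib.crc32 requires bytes rather than str. A possible adaptation, assuming a python3 executable on the PATH (crc32_py3 is a made-up name), reads standard input in binary mode; the modulo is no longer needed since Python 3 always returns an unsigned value.

```shell
# Hypothetical helper: CRC-32 of a string using Python 3's zlib module.
crc32_py3() {
  # sys.stdin.buffer yields raw bytes, which is what zlib.crc32 expects in Python 3.
  printf '%s' "$1" |
    python3 -c 'import sys, zlib; print("%08x" % zlib.crc32(sys.stdin.buffer.read()))'
}

crc32_py3 123456789   # prints cbf43926
```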

Be aware that there are several CRC-32 algorithms: http://reveng.sourceforge.net/crc-catalogue/all.htm#crc.cat-bits.32

Upvotes: 8

slim

Reputation: 41263

On Ubuntu, at least, /usr/bin/crc32 is a short Perl script, and you can see quite clearly from its source that all it can do is open files. It has no facility to read from stdin -- it doesn't have special handling for - as a filename, or a -c parameter or anything like that.

So your easiest approach is to live with it, and make a temporary file.

tmpfile=$(mktemp)
echo -n "LongString" > "$tmpfile"
crc32 "$tmpfile"
rm -f "$tmpfile"

If you really don't want to write a file (e.g. it's more data than your filesystem can take -- unlikely if it's really a "long string", but for the sake of argument...) you could use a named pipe. To a simple non-random-access reader this is indistinguishable from a file:

fifo=$(mktemp -u)
mkfifo "$fifo"
echo -n "LongString" > "$fifo" &
crc32 "$fifo"
rm -f "$fifo"

Note the & to background the process which writes to fifo, because it will block until the next command reads it.

To be more fastidious about temporary file creation, see: https://unix.stackexchange.com/questions/181937/how-create-a-temporary-file-in-shell-script


Alternatively, use what's in the script as an example from which to write your own Perl one-liner (the presence of crc32 on your system indicates that Perl and the necessary module are installed), or use the Python one-liner you've already found.

Upvotes: 8
