How to calculate crc32 checksum from a string on linux bash

oxidworks picture oxidworks · Jun 28, 2017 · Viewed 20.8k times · Source

I used crc32 to calculate checkcums from sting long time ago, but I cannot remember how I did it.

echo -n "LongString" | crc32    # no output

I found a solution [1] to calculate with Python, but is there not a direct way to calculate from string?

# signed
python -c 'import binascii; print binascii.crc32("LongString")'
python -c 'import zlib; print zlib.crc32("LongString")'
# unsigned
python -c 'import binascii; print binascii.crc32("LongString") % (1<<32)'
python -c 'import zlib; print zlib.crc32("LongString") % (1<<32)'

[1] How to calculate CRC32 with Python to match online results?

Answer

robert picture robert · Mar 23, 2018

I came up against this problem myself and I didn't want to go to the "hassle" of installing crc32. I came up with this, and although it's a little nasty it should work on most platforms, or most modern linux anyway ...

echo -n "LongString" | gzip -c | tail -c8 | hexdump -n4 -e '"%u"'

Just to provide some technical details, gzip uses crc32 in the last 8 bytes and the -c option causes it to output to standard output and tail strips out the last 8 bytes.

hexdump was a little trickier and I had to futz about with it for a while before I came up with something satisfactory, but the format here seems to correctly parse the gzip crc32 as a single 32-bit number:

  • -n4 takes only the relevant first 4 bytes of the gzip footer.
  • '"%u"' is your standard fprintf format string that formats the bytes as a single unsigned 32-bit integer. Notice that there are double quotes nested within single quotes here.

If you want a hexadecimal checksum you can change the format string to '"%08x"' (or '"%08X"' for upper case hex) which will format the checksum as 8 character (0 padded) hexadecimal.

Like I say, not the most elegant solution, and perhaps not an approach you'd want to use in a performance-sensitive scenario but an approach that might appeal given the near universality of the commands used.

The weak point here for cross-platform usability is probably the hexdump configuration, since I have seen variations on it from platform to platform and it's a bit fiddly. I'd suggest if you're using this you should try some test values and compare with the results of an online tool.

EDIT As suggested by @PedroGimeno in the comments, you can pipe the output into od instead of hexdump for identical results without the fiddly options. ... | od -t x4 -N 4 -A n for hex ... | od -t d4 -N 4 -A n for decimal.