Why do seemingly empty files and strings produce md5sums?

Daniel · Jun 6, 2012

Consider the following:

% md5sum /dev/null
d41d8cd98f00b204e9800998ecf8427e  /dev/null
% touch empty; md5sum empty
d41d8cd98f00b204e9800998ecf8427e  empty
% echo '' | md5sum
68b329da9893e34099c7d8ad5cb9c940  -
% perl -e 'print chr(0)' | md5sum
93b885adfe0da089cdf634904fd59f71  -
% md5sum ''
md5sum: : No such file or directory

First of all, I'm surprised by the output of all these commands. If anything, I would expect the sum to be the same for all of them.

Answer

Graeme · Jun 6, 2012

The md5sum of "nothing" (a zero-length stream of bytes) is d41d8cd98f00b204e9800998ecf8427e, which is what your first two examples show.

The third and fourth examples each process a single byte. In the "echo" case, it's a newline, i.e.

$ echo -ne '\n' | md5sum
68b329da9893e34099c7d8ad5cb9c940  -

In the perl example, it's a single byte with value 0x00, i.e.

$ echo -ne '\x00' | md5sum
93b885adfe0da089cdf634904fd59f71  -
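
The byte-level difference described above can be checked directly by counting or dumping the input before it is hashed. This is a minimal sketch: the helper names count_bytes and hex_bytes are made up for illustration, and printf '\000' stands in for the perl command from the question.

```shell
# Count the bytes arriving on stdin (tr strips the padding some wc versions add).
count_bytes() { wc -c | tr -d ' '; }

# Dump stdin as hex, without offsets or whitespace.
hex_bytes() { od -An -tx1 | tr -d ' \n'; echo; }

count_bytes < /dev/null          # prints 0: nothing to hash
echo '' | count_bytes            # prints 1: echo appends a newline
echo '' | hex_bytes              # prints 0a
printf '\000' | count_bytes      # prints 1: a single NUL byte
printf '\000' | hex_bytes        # prints 00
```

Three different byte streams go in, so three different checksums come out.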

You can reproduce the empty checksum using "echo" as follows:

$ echo -n '' | md5sum
d41d8cd98f00b204e9800998ecf8427e  -

...and using Perl as follows:

$ perl -e 'print ""' | md5sum
d41d8cd98f00b204e9800998ecf8427e  -
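
Since the behaviour of "echo -n" and "echo -e" varies between shells, printf may be a more portable way to produce the same inputs, because it writes only what its format string specifies. The sketch below assumes GNU md5sum, as used in the question; md5_hex is a hypothetical helper that keeps just the digest.

```shell
# Keep only the hex digest from md5sum's "digest  name" output.
md5_hex() { md5sum | cut -d ' ' -f 1; }

printf ''     | md5_hex   # d41d8cd98f00b204e9800998ecf8427e (zero bytes)
printf '\n'   | md5_hex   # 68b329da9893e34099c7d8ad5cb9c940 (one newline)
printf '\000' | md5_hex   # 93b885adfe0da089cdf634904fd59f71 (one NUL byte)
```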

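In the first four cases, checksumming the same data always gives the same output, while different data should produce a wildly different checksum, even if only a single byte differs. That's the whole point of a checksum. The fifth command fails for a different reason: md5sum treats its argument as a file name, not as data to hash, and no file has an empty name.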
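
To see how strongly a one-byte change affects the result, two nearly identical strings can be hashed side by side. This is only a sketch under the same assumption of GNU md5sum; the strings and the md5_hex helper are chosen for illustration.

```shell
# Keep only the hex digest from md5sum's "digest  name" output.
md5_hex() { md5sum | cut -d ' ' -f 1; }

# The two inputs differ only in their last byte.
sum_a=$(printf 'hello' | md5_hex)
sum_b=$(printf 'hellp' | md5_hex)

echo "hello: $sum_a"
echo "hellp: $sum_b"
```

The two digests should have little in common, even though the inputs share four of their five bytes.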