How to hash a variable in Python?

user3522371 · Jul 23, 2014

This example works fine:

import hashlib
m = hashlib.md5()
m.update(b"Nobody inspects")
r = m.digest()
print(r)

Now I want to do the same thing, but with a variable: var = "hash me this text, please". How can I do it following the same logic as the example?

Answer

Martijn Pieters · Jul 23, 2014

The hash.update() method requires bytes, always.

Encode Unicode text to bytes first. Which encoding you use is an application decision, but if all you want to do is fingerprint text, then UTF-8 is a great choice:

m.update(var.encode('utf8')) 

The exception you get when you don't encode is quite clear, however:

>>> import hashlib
>>> hashlib.md5().update('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing
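
Applied to the variable from the question, the whole thing might look like the sketch below. It assumes Python 3 and UTF-8 as the encoding; the names digest and hex_digest are only illustrative. The exact wording of the TypeError message can differ between Python versions, so the sketch prints whatever message it gets.

```python
import hashlib

var = "hash me this text, please"

# Encode the text to bytes before hashing (UTF-8 is an assumption here).
m = hashlib.md5()
m.update(var.encode("utf-8"))

digest = m.digest()         # raw bytes (16 bytes for MD5)
hex_digest = m.hexdigest()  # the same value as 32 hexadecimal characters

print(digest)
print(hex_digest)

# Passing the str itself, without encoding, raises TypeError.
try:
    hashlib.md5().update(var)
except TypeError as exc:
    print("TypeError:", exc)
```

Use digest() when you need the raw bytes and hexdigest() when you want something printable.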

If you are getting the hash of a file, open the file in binary mode instead:

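import hashlib
from functools import partial

# filename is the path of the file you want to hash
md5 = hashlib.md5()
with open(filename, 'rb') as binfile:
    # Call binfile.read(2048) repeatedly until it returns b'' at end of file
    for chunk in iter(partial(binfile.read, 2048), b''):
        md5.update(chunk)
print(md5.hexdigest())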
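
If you need this in more than one place, the chunked read can be wrapped in a small helper. The following is a minimal sketch, assuming Python 3; md5_of_file and its chunk_size parameter are hypothetical names, not part of hashlib. It relies on the two-argument form iter(callable, sentinel), which keeps calling read() until it returns the empty bytes object. The demo writes a temporary file so the example is self-contained.

```python
import hashlib
import os
import tempfile
from functools import partial


def md5_of_file(path, chunk_size=2048):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as binfile:
        # read(chunk_size) returns b"" at end of file, which stops the loop
        for chunk in iter(partial(binfile.read, chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


# Demo: hash a temporary file and compare with hashing the same bytes in memory.
data = b"Nobody inspects" * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

try:
    file_digest = md5_of_file(tmp_path)
finally:
    os.remove(tmp_path)

print(file_digest == hashlib.md5(data).hexdigest())  # prints True
```

Reading in chunks keeps memory use constant regardless of file size; the chunk size only affects speed, not the resulting digest.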