This example works fine:
import hashlib
m = hashlib.md5()
m.update(b"Nobody inspects")
r = m.digest()
print(r)
Now I want to do the same thing, but with a variable: var = "hash me this text, please". How could I do it following the same logic as the example?
The hash.update() method always requires bytes.
Encode Unicode text to bytes first. Which encoding you use is an application decision, but if all you want to do is fingerprint the text, then UTF-8 is a great choice:
m.update(var.encode('utf8'))
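Put together with the variable from the question, a minimal sketch could look like the following. Using hexdigest() instead of digest() is an assumption made here only to get a printable string; the question itself used digest().

```python
import hashlib

var = "hash me this text, please"

m = hashlib.md5()
# str must be encoded to bytes before it is fed to the hash object
m.update(var.encode("utf-8"))

digest_hex = m.hexdigest()  # 32 hexadecimal characters for MD5
print(digest_hex)
```

The same text encoded with a different codec can yield different bytes and therefore a different hash, so whichever encoding you pick has to be used consistently.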
The exception you get when you don't encode is quite clear, however:
>>> import hashlib
>>> hashlib.md5().update('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing
If you are getting the hash of a file, open the file in binary mode instead:
import hashlib
from functools import partial

md5 = hashlib.md5()
with open(filename, 'rb') as binfile:
    # iter(callable, sentinel) calls binfile.read(2048) until it returns b'' at end of file
    for chunk in iter(partial(binfile.read, 2048), b''):
        md5.update(chunk)
print(md5.hexdigest())
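Chunked file hashing can also be wrapped in a reusable function. This is only a sketch: the name md5_of_file, the chunk_size parameter, and the temporary file used to exercise it are hypothetical and not part of the original answer.

```python
import hashlib
import os
import tempfile
from functools import partial


def md5_of_file(path, chunk_size=2048):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as binfile:
        # Two-argument iter() calls read(chunk_size) until it returns b"" at end of file
        for chunk in iter(partial(binfile.read, chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Illustration only: write known bytes to a temporary file and hash it
data = b"Nobody inspects" * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

file_digest = md5_of_file(tmp_path)
expected = hashlib.md5(data).hexdigest()
os.remove(tmp_path)

print(file_digest == expected)
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, and the result matches hashing the whole content in one call.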