How to fix Unicode encode error using the hashlib module?

Nate · Jul 13, 2011 · Viewed 32.8k times

After multiple searches I have not been able to determine how to avoid an error stating: "Unicode-objects must be encoded before hashing" when using this code:

    pwdinput = input("Now enter a password:")
    pwd = hashlib.sha1()
    pwd.update(pwdinput)
    pwd = pwd.hexdigest()

How can I get past that error? How do you encode Unicode-objects?

Answer

JAB · Jul 13, 2011

    pwdinput = input("Now enter a password:").encode('utf-8')  # or whatever encoding you wish to use

Assuming you're using Python 3, this converts the Unicode string returned by input() into a bytes object encoded in UTF-8 (or whichever encoding you choose), which is what hashlib expects. Earlier versions of Python also have an encode() method, but their handling of Unicode vs. non-Unicode strings was a bit messy, whereas Python 3 draws an explicit distinction between Unicode strings (str) and immutable sequences of bytes (bytes), which may or may not represent ASCII characters.
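To put the pieces together, here is a minimal sketch of the whole flow, assuming Python 3. The helper name `sha1_hex` is made up for illustration and is not part of hashlib; the UTF-8 default is likewise just an assumption.

```python
import hashlib


def sha1_hex(text, encoding="utf-8"):
    """Return the SHA-1 hex digest of a str (hypothetical helper).

    The str is encoded to bytes first, because hashlib only
    accepts bytes-like objects in Python 3.
    """
    data = text.encode(encoding)  # str -> bytes
    return hashlib.sha1(data).hexdigest()


if __name__ == "__main__":
    # Passing a str straight to update() raises TypeError in Python 3.
    try:
        hashlib.sha1().update("secret")
    except TypeError as exc:
        print("TypeError:", exc)

    # Encoding first works; the digest is a 40-character hex string.
    print(sha1_hex("secret"))
```

Note that the encoding is effectively part of the hash input: the same password containing non-ASCII characters gives a different digest under UTF-8 than under, say, Latin-1, so whichever encoding you pick should be used consistently.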

http://docs.python.org/library/stdtypes.html#str.encode
http://docs.python.org/py3k/library/stdtypes.html#str.encode