I'm trying to download BVLC-trained model and I'm stuck with this error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 110: invalid start byte
I think it's because of the following function (complete code)
# Closure-d function for checking SHA1.
def model_checks_out(filename=model_filename, sha1=frontmatter['sha1']):
with open(filename, 'r') as f:
return hashlib.sha1(f.read()).hexdigest() == sha1
Any idea how to fix this?
You are opening a file that is not UTF-8 encoded, while the default encoding for your system is set to UTF-8.
Since you are calculating a SHA1 hash, you should read the data as binary instead. The hashlib
functions require you pass in bytes:
with open(filename, 'rb') as f:
return hashlib.sha1(f.read()).hexdigest() == sha1
Note the addition of b
in the file mode.
See the open()
documentation:
mode is an optional string that specifies the mode in which the file is opened. It defaults to
'r'
which means open for reading in text mode. [...] In text mode, if encoding is not specified the encoding used is platform dependent:locale.getpreferredencoding(False)
is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.)
and from the hashlib
module documentation:
You can now feed this object with bytes-like objects (normally bytes) using the update() method.