Encrypt File using AES and PyCrypto in Python 3

user1445421 · Jun 8, 2012

I'm using PyCrypto to encrypt a binary file with AES in CBC mode (Python 3.2.3 64-bit and PyCrypto 2.6), using the code from this post: http://eli.thegreenplace.net/2010/06/25/aes-encryption-of-files-in-python-with-pycrypto/

But I'm running into the following error: ValueError: IV must be 16 bytes long.

Here's the code:

import os
import random
import struct

from Crypto.Cipher import AES


def encryptFile(key, in_filename, out_filename=None, chunksize=64*1024):
    """ Encrypts a file using AES (CBC mode) with the
        given key.

        key:
            The encryption key - a string that must be
            either 16, 24 or 32 bytes long. Longer keys
            are more secure.

        in_filename:
            Name of the input file

        out_filename:
            If None, '<in_filename>.enc' will be used.

        chunksize:
            Sets the size of the chunk which the function
            uses to read and encrypt the file. Larger chunk
            sizes can be faster for some files and machines.
            chunksize must be divisible by 16.
    """
    if not out_filename:
        out_filename = in_filename + '.enc'

    iv = ''.join(chr(random.randint(0, 0xFF)) for i in range(16))
    encryptor = AES.new(key, AES.MODE_CBC, iv)
    filesize = os.path.getsize(in_filename)

    with open(in_filename, 'rb') as infile:
        with open(out_filename, 'wb') as outfile:
            outfile.write(struct.pack('<Q', filesize))
            outfile.write(iv)

            while True:
                chunk = infile.read(chunksize)
                if len(chunk) == 0:
                    break
                elif len(chunk) % 16 != 0:
                    chunk += ' ' * (16 - len(chunk) % 16)

                outfile.write(encryptor.encrypt(chunk))

I've tried searching and experimenting but can't seem to get it working. Python is pretty new to me and so is encryption. Any help would be greatly appreciated. Thanks in advance.

Answer

SquareRootOfTwentyThree · Jun 9, 2012

As the PyCrypto API says, the IV must be a byte string, not a text string.

Your code works fine in Python 2 because there the two are the same type (both are class str, unless you deal with Unicode text). In Python 3 they are two completely different types: bytes and str.
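To make the type difference concrete, here is a small check that should run on any Python 3 interpreter without PyCrypto installed. The variable names text_iv and bytes_iv are only for illustration, and the random module is used purely to mirror the question; it is not a secure source for a real IV.

```python
import random

# The question's expression: joining characters produces text (str) in Python 3.
text_iv = ''.join(chr(random.randint(0, 0xFF)) for i in range(16))

# Building from a list of integers in range(256) produces a byte string (bytes).
bytes_iv = bytes([random.randint(0, 0xFF) for i in range(16)])

print(type(text_iv).__name__, len(text_iv))    # str 16
print(type(bytes_iv).__name__, len(bytes_iv))  # bytes 16
```

Both values have length 16, but only the second has the type PyCrypto expects for an IV.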

The code should therefore be:

iv = bytes([random.randint(0, 0xFF) for i in range(16)])

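That code (besides not being cryptographically secure, as Federico points out) will not work properly in Python 2, though: there bytes is just an alias for str, so bytes([...]) returns the printed form of the list rather than 16 bytes. The following alternative works in both versions, is secure, and is more efficient: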

from Crypto import Random
iv = Random.new().read(16)
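The same bytes-versus-str distinction affects the rest of the question's function: on Python 3 the padding line chunk += ' ' * ... would raise a TypeError, because the chunk read from a file opened in 'rb' mode is bytes. Below is a minimal sketch of the byte-handling parts only, assuming the file layout from the question (8-byte size, then IV, then ciphertext). The helper names make_iv, pad_chunk and build_header are hypothetical, os.urandom is used as a standard-library stand-in for Random.new().read(16) so the snippet runs without PyCrypto, and the AES call itself is left out.

```python
import os
import struct

BLOCK_SIZE = 16  # AES block size in bytes


def make_iv():
    """Return 16 random bytes from the OS random source (stand-in for PyCrypto's Random)."""
    return os.urandom(BLOCK_SIZE)


def pad_chunk(chunk):
    """Pad a bytes chunk with spaces up to a multiple of 16, as the question's loop intends."""
    remainder = len(chunk) % BLOCK_SIZE
    if remainder:
        # b' ' is a bytes literal; a plain ' ' would raise TypeError on Python 3.
        chunk += b' ' * (BLOCK_SIZE - remainder)
    return chunk


def build_header(filesize, iv):
    """Pack the original file size (8 bytes, little-endian) followed by the IV."""
    return struct.pack('<Q', filesize) + iv


iv = make_iv()
header = build_header(1000, iv)
print(type(iv).__name__, len(iv))  # bytes 16
print(len(header))                 # 24
print(len(pad_chunk(b'x' * 20)))   # 32
```

In the question's function, iv would be passed to AES.new(key, AES.MODE_CBC, iv) and each padded chunk to encryptor.encrypt(...). Space padding is kept here only because the original code uses it.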