PyCrypto - How does the Initialization Vector work?

Tim Tisdall · Feb 5, 2013 · Viewed 17.2k times

I'm trying to understand how PyCrypto works so I can use it in a project, but I don't fully understand the significance of the Initialization Vector (IV). I've found that I can use the wrong IV when decrypting a string and I still seem to get the message back, except for the first 16 bytes (the block size). Am I simply using it wrong, or am I not understanding something?

Here's some sample code to demonstrate:

import Crypto
import Crypto.Random
from Crypto.Cipher import AES

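def pad_data(data):
    # Always pad (a 0x80 marker followed by zeros), even when the input is
    # already block-aligned; otherwise unpad_data cannot tell genuine
    # trailing bytes from padding.
    databytes = bytearray(data)
    padding_required = 15 - (len(databytes) % 16)
    databytes.extend(b'\x80')
    databytes.extend(b'\x00' * padding_required)
    return bytes(databytes)

def unpad_data(data):
    if not data:
        return data

    # Strip the zero fill, then the 0x80 marker that precedes it.
    stripped = data.rstrip(b'\x00')
    if stripped.endswith(b'\x80'):
        return stripped[:-1]
    else:
        return data  # no padding marker found; return the input unchanged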


def generate_aes_key():
    # get_random_bytes is portable; OSRNG.posix only exists on POSIX systems
    return Crypto.Random.get_random_bytes(AES.block_size)

def encrypt(key, iv, data):
    aes = AES.new(key, AES.MODE_CBC, iv)
    data = pad_data(data)
    return aes.encrypt(data)

def decrypt(key, iv, data):
    aes = AES.new(key, AES.MODE_CBC, iv)
    data = aes.decrypt(data)
    return unpad_data(data)

def test_crypto():
    key = generate_aes_key()
    iv = generate_aes_key() # get some random value for IV
    msg = b"This is some super secret message.  Please don't tell anyone about it or I'll have to shoot you."
    code = encrypt(key, iv, msg)

    iv = generate_aes_key() # change the IV to something random

    decoded = decrypt(key, iv, code)

    print(decoded)

if __name__ == '__main__':
    test_crypto()

I'm using Python 3.3.

Output will vary on execution, but I get something like this: b"1^,Kp}Vl\x85\x8426M\xd2b\x1aer secret message. Please don't tell anyone about it or I'll have to shoot you."

Answer

SquareRootOfTwentyThree · Feb 6, 2013

The behavior you see is specific to CBC mode. With CBC, decryption can be visualized in the following way (from Wikipedia):

[Figure: CBC decryption]

You can see that the IV only contributes to the first 16 bytes of plaintext: each plaintext block is the block-cipher decryption of its ciphertext block XORed with the previous ciphertext block, and only the first block uses the IV in that role. If the IV is corrupted in transit to the receiver, CBC will still correctly decrypt every block but the first. In CBC, the purpose of the IV is to let you encrypt the same message with the same key and still get a totally different ciphertext each time (although the message length may still give something away).
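To make the chaining concrete, here is a minimal sketch of CBC written with only the standard library. It is an illustration under stated assumptions, not PyCrypto's implementation: toy_encrypt_block and toy_decrypt_block are hypothetical helpers (a 4-round Feistel network built on SHA-256) standing in for AES, and they are not secure. The point is only to show that a wrong IV garbles the first block and nothing else, and that a different IV changes the ciphertext.

```python
import hashlib
import os

BLOCK = 16  # block size in bytes, matching AES

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key, i, half):
    # Round function: first 8 bytes of SHA-256(key || round index || half).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key, block):
    # Hypothetical stand-in for AES: a 4-round Feistel network. NOT secure.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key, block):
    # Undo the Feistel rounds in reverse order.
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, plaintext):
    # C_i = E_K(P_i XOR C_{i-1}), with C_0 = IV
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key, iv, ciphertext):
    # P_i = D_K(C_i) XOR C_{i-1}, with C_0 = IV
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

key = os.urandom(BLOCK)
iv = os.urandom(BLOCK)
msg = b"0123456789abcdef" * 3  # exactly three blocks, so no padding is needed

code = cbc_encrypt(key, iv, msg)
wrong_iv = bytes(b ^ 0xFF for b in iv)  # guaranteed to differ from iv

good = cbc_decrypt(key, iv, code)
bad = cbc_decrypt(key, wrong_iv, code)

print(good == msg)                  # True
print(bad[:BLOCK] == msg[:BLOCK])   # False: only the first block is garbled
print(bad[BLOCK:] == msg[BLOCK:])   # True: later blocks never use the IV
print(cbc_encrypt(key, wrong_iv, msg) == code)  # False: new IV, new ciphertext
```

Only the first block is XORed with the IV during decryption; every later block is XORed with the preceding ciphertext block, which the receiver already has.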

Other modes are less forgiving: if you get the IV wrong, the whole message is garbled at decryption. Take CTR mode, for instance, where the nonce plays almost the same role as the IV:

[Figure: CTR mode]
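For contrast, here is a minimal standard-library sketch of the CTR idea. It rests on an assumption that should be stated plainly: keystream_block is a hypothetical helper that hashes the key, nonce, and counter with SHA-256 as a stand-in for encrypting the counter block with AES, so it is not secure and is not how PyCrypto implements CTR. It only illustrates that the nonce feeds into every keystream block, so a wrong nonce affects the whole message.

```python
import hashlib
import os

BLOCK = 16  # keystream block size in bytes

def keystream_block(key, nonce, counter):
    # Hypothetical stand-in for E_K(nonce || counter). NOT secure.
    data = key + nonce + counter.to_bytes(8, "big")
    return hashlib.sha256(data).digest()[:BLOCK]

def ctr_crypt(key, nonce, data):
    # CTR encrypts and decrypts the same way: XOR the data with the keystream.
    out = bytearray()
    for n, i in enumerate(range(0, len(data), BLOCK)):
        ks = keystream_block(key, nonce, n)
        out.extend(b ^ k for b, k in zip(data[i:i + BLOCK], ks))
    return bytes(out)

key = os.urandom(16)
nonce = os.urandom(8)
msg = b"0123456789abcdef" * 3  # three full blocks

code = ctr_crypt(key, nonce, msg)
wrong_nonce = bytes(b ^ 0xFF for b in nonce)  # guaranteed to differ from nonce

good = ctr_crypt(key, nonce, code)
bad = ctr_crypt(key, wrong_nonce, code)

print(good == msg)
# Compare block by block: with the wrong nonce, no block is expected to match.
for i in range(0, len(msg), BLOCK):
    print(i // BLOCK, bad[i:i + BLOCK] == msg[i:i + BLOCK])
```

Unlike CBC, there is no chaining on ciphertext here: each block depends on the nonce through the keystream, so nothing can be recovered without the right nonce.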