How to add/remove PKCS7 padding from an AES encrypted string?

Click Upvote · Sep 6, 2011 · Viewed 39.2k times

I'm trying to encrypt/decrypt a string using 128 bit AES encryption (ECB). What I want to know is how I can add/remove the PKCS7 padding to it. It seems that the Mcrypt extension can take care of the encryption/decryption, but the padding has to be added/removed manually.

Any ideas?

Answer

Paŭlo Ebermann · Sep 6, 2011

Let's see. PKCS #7 is described in RFC 5652 (Cryptographic Message Syntax).

The padding scheme itself is given in section 6.3, "Content-encryption Process". It essentially says: append as many bytes as are needed to fill the given block size (but at least one), and give each of them the padding length as its value.

Thus, looking at the last decrypted byte we know how many bytes to strip off. (One could also check that they all have the same value.)
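To make the rule concrete, here is a minimal sketch of both directions in plain PHP. The function names pkcs7_pad and pkcs7_strip are made up for this illustration, and the default block size of 16 assumes AES. The strip function trusts the last byte blindly and does no validation.

```php
<?php
// Hypothetical helper names; the default block size of 16 assumes AES.
function pkcs7_pad(string $data, int $blockSize = 16): string
{
    // Always between 1 and $blockSize bytes of padding.
    $pad = $blockSize - (strlen($data) % $blockSize);
    return $data . str_repeat(chr($pad), $pad);
}

function pkcs7_strip(string $padded): string
{
    // The last byte says how many bytes to drop (no validation in this sketch).
    $pad = ord($padded[strlen($padded) - 1]);
    return substr($padded, 0, strlen($padded) - $pad);
}

$padded = pkcs7_pad("hello");
echo bin2hex($padded), "\n";      // 68656c6c6f followed by eleven 0b bytes
echo pkcs7_strip($padded), "\n";  // hello
```

"hello" is 5 bytes long, so 11 bytes are missing to fill a 16-byte block, and each padding byte has the value 11 (0x0b).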

I could now give you a pair of PHP functions to do this, but my PHP is a bit rusty. So either do this yourself (then feel free to edit my answer to add it in), or have a look at the user-contributed notes to the mcrypt documentation: quite a few of them are about padding and provide an implementation of PKCS #7 padding.


So, let's look at the first note there in detail:

<?php

function encrypt($str, $key)
 {
     $block = mcrypt_get_block_size('des', 'ecb');

This gets the block size of the algorithm in use. In your case, you would use rijndael-128 (the value of the MCRYPT_RIJNDAEL_128 constant) instead of des, I suppose (I didn't test it). (Alternatively, you could simply use 16 here for AES instead of invoking the function.)

     $pad = $block - (strlen($str) % $block);

This calculates the padding size. strlen($str) is the length of your data (in bytes), % $block gives the remainder modulo $block, i.e. the number of data bytes in the last block. $block - ... thus gives the number of bytes needed to fill this last block (this is now a number between 1 and $block, inclusive).
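As a quick check of this arithmetic, the following sketch prints the padding size for a few data lengths. The helper name pad_length and the block size of 16 (AES) are assumptions for the illustration.

```php
<?php
// Hypothetical helper: number of padding bytes for a given data length.
function pad_length(int $len, int $block = 16): int
{
    return $block - ($len % $block);
}

foreach ([0, 1, 15, 16, 17, 32] as $len) {
    $pad = pad_length($len);
    printf("data length %2d -> pad %2d -> padded length %2d\n", $len, $pad, $len + $pad);
}
// data length  0 -> pad 16 -> padded length 16
// data length  1 -> pad 15 -> padded length 16
// data length 15 -> pad  1 -> padded length 16
// data length 16 -> pad 16 -> padded length 32
// data length 17 -> pad 15 -> padded length 32
// data length 32 -> pad 16 -> padded length 48
```

Note that data which already fills its last block still gets a whole extra block of padding, so the result is never 0. This is what makes the padding unambiguous to remove.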

     $str .= str_repeat(chr($pad), $pad);

str_repeat produces a string consisting of a repetition of the same string, here a repetition of the character given by $pad, $pad times, i.e. a string of length $pad, filled with $pad. $str .= ... appends this padding string to the original data.

     return mcrypt_encrypt(MCRYPT_DES, $key, $str, MCRYPT_MODE_ECB);

Here is the encryption itself. Use MCRYPT_RIJNDAEL_128 instead of MCRYPT_DES.

 }

Now the other direction:

 function decrypt($str, $key)
 {   
     $str = mcrypt_decrypt(MCRYPT_DES, $key, $str, MCRYPT_MODE_ECB);

The decryption. (You would of course change the algorithm, as above). $str is now the decrypted string, including the padding.

     $block = mcrypt_get_block_size('des', 'ecb');

This is again the block size (see above). Note that the rest of this function does not actually use it.

     $pad = ord($str[($len = strlen($str)) - 1]);

This looks a bit strange. It is clearer when written in multiple steps:

    $len = strlen($str);
    $pad = ord($str[$len-1]);

$len is now the length of the padded string, and $str[$len - 1] is the last character of this string. ord converts this to a number. Thus $pad is the number which we previously used as the fill value for the padding, and this is the padding length.

     return substr($str, 0, strlen($str) - $pad);

So now we cut off the last $pad bytes from the string. (Instead of strlen($str) we could also write $len here: substr($str, 0, $len - $pad).)

 }

?>
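For reference, here is a sketch of the same two functions put together for AES-128 in ECB mode. It swaps in the OpenSSL extension for mcrypt, since mcrypt was removed from core PHP in 7.2; that swap, the function names and the demo key are my assumptions, not part of the note discussed above. OpenSSL applies PKCS #7 padding by default, so OPENSSL_ZERO_PADDING is passed to switch that off and let the manual padding do the work.

```php
<?php
// Assumption: the OpenSSL extension is loaded. Function names are hypothetical.
function aes_ecb_encrypt(string $plain, string $key): string
{
    $block = 16; // AES block size in bytes
    $pad = $block - (strlen($plain) % $block);
    $plain .= str_repeat(chr($pad), $pad);
    // OPENSSL_ZERO_PADDING disables OpenSSL's own padding, so ours is used.
    return openssl_encrypt($plain, 'aes-128-ecb', $key,
        OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING);
}

function aes_ecb_decrypt(string $cipher, string $key): string
{
    // Returns the plaintext still carrying its padding.
    $plain = openssl_decrypt($cipher, 'aes-128-ecb', $key,
        OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING);
    $len = strlen($plain);
    $pad = ord($plain[$len - 1]);
    return substr($plain, 0, $len - $pad);
}

$key = '0123456789abcdef'; // 16-byte demo key, for illustration only
$cipher = aes_ecb_encrypt('attack at dawn', $key);
var_dump(aes_ecb_decrypt($cipher, $key) === 'attack at dawn'); // bool(true)
```

If you do not need to control the padding yourself, leaving out OPENSSL_ZERO_PADDING and the manual steps should give the same ciphertext, because the padding OpenSSL adds by default is the same scheme.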

Note that to get the padding itself, instead of substr($str, $len - $pad) one can also write substr($str, -$pad), as PHP's substr function handles negative arguments specially, counting from the end of the string. (I don't know whether this is more or less efficient than getting the length first and calculating the index manually.)
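A small sketch of the negative-offset forms, using a made-up 8-byte string of 4 data bytes plus 4 padding bytes:

```php
<?php
$str = "data" . str_repeat(chr(4), 4); // 4 data bytes plus 4 padding bytes
$len = strlen($str);
$pad = ord($str[$len - 1]);            // 4

// The padding itself: counted from the front or from the end.
var_dump(substr($str, $len - $pad) === substr($str, -$pad));       // bool(true)

// The data without the padding: a negative length drops bytes from the end.
var_dump(substr($str, 0, $len - $pad) === substr($str, 0, -$pad)); // bool(true)
```

The negative forms only behave like this for $pad >= 1, which valid PKCS #7 padding guarantees; substr($str, -0) would return the whole string.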

As said before and noted in the comment by rossum, instead of simply stripping off the padding as done here, you should check that it is correct, i.e. look at substr($str, $len - $pad), and check that all its bytes are chr($pad). This serves as a slight check against corruption (although this check is more effective if you use a chaining mode instead of ECB, and is not a replacement for a real MAC).
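Such a check could be sketched as follows. The function name pkcs7_unpad_checked and the choice to return false on bad padding are assumptions made for this example; the default block size of 16 again assumes AES.

```php
<?php
// Hypothetical helper: returns the unpadded data, or false if the padding is malformed.
function pkcs7_unpad_checked(string $padded, int $blockSize = 16)
{
    $len = strlen($padded);
    // Padded data is never empty and always a whole number of blocks.
    if ($len === 0 || $len % $blockSize !== 0) {
        return false;
    }
    $pad = ord($padded[$len - 1]);
    // The padding length must be between 1 and the block size.
    if ($pad < 1 || $pad > $blockSize) {
        return false;
    }
    // Every padding byte must have the value chr($pad).
    if (substr($padded, $len - $pad) !== str_repeat(chr($pad), $pad)) {
        return false;
    }
    return substr($padded, 0, $len - $pad);
}

$good = "hello" . str_repeat("\x0b", 11);
$bad  = "hello" . str_repeat("\x0b", 10) . "\x0c";
var_dump(pkcs7_unpad_checked($good)); // string(5) "hello"
var_dump(pkcs7_unpad_checked($bad));  // bool(false)
```

Compare the result with === false, since a valid message may unpad to the empty string.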


(And still, tell your client they should think about changing to a more secure mode than ECB.)