AES256 CBC + HMAC SHA256 ensuring confidentiality *and* authentication?

Nuoji picture Nuoji · Mar 8, 2011 · Viewed 13.4k times · Source

I'm thinking of using AES256 CBC + HMAC SHA-256 as a building block for messages that ensures both confidentiality and authentication.

In particular, consider this scenario:

  • Alice is possession a public key belonging to Bob (the key exchange and algorithm is outside the scope of this question). Alice has an identifying key K, also shared with Bob, that she can use to identify herself with. Only Alice and Bob knows the key K.
  • Alice encrypts (nonce || K) using Bob's public key.
  • Bob decrypts the packet and has now has K and nonce.
  • Bob uses SHA-256 with SHA256(K || nonce) to yield a K(e) of 256 bits.
  • Bob uses SHA-256 with SHA256(K || nonce + 1) to yield a K(s) of 256 bits.

Now for every packet Bob wishes to send Alice he performs the following:

  • Create a new random 128 bit IV
  • Encrypts the message using the IV and K(e) as the key.
  • Creates a SHA-256 HMAC with K(s) as key and (IV || Encrypted message) as data.
  • Finally sends (IV || HMAC || Ciphertext) to Alice

Alice has also calculated K(e) and K(s), and follows the following procedure when receiving data from Bob:

  • Split the message into IV, ciphertext and HMAC.
  • Calculate the HMAC using K(s), IV and ciphertext.
  • Compare HMAC with the HMAC sent. If this matches, Alice considers this message authenticated as a message sent by Bob, otherwise it is discarded.
  • Alice decrypts the message using K(e)

Does this protocol ensure that Alice only decrypts messages from Bob, assuming that no one other than Bob can read the encrypted message that Alice sends him encrypted using his public key?

I.e. does messages constructed in this manner ensure both confidentiality and authentication?

Note: If the protocol requires Bob to send multiple messages, this scheme needs a slight modification to avoid replay attacks.

P.S. I am aware of AES-GCM/CCM, but this scheme would work with the basic AES, SHA and HMAC algorithms that are found in most crypto packages. This solution might also be slower, but that too is out of the scope for the question.

Answer

Thomas Pornin picture Thomas Pornin · Mar 8, 2011

Basically you are recreating SSL/TLS. This implies the usual caveats about building your own protocol, and you are warmly encouraged to use TLS with an existing library instead of rewriting your own.

That being said, using AES with CBC for encryption, and HMAC for integrity, is sound. There are combined encryption+integrity modes (that you are aware of), and CBC+HMAC is kind of "old school", but it cannot hurt. You are doing things in the "science-approved" way: encrypt, then MAC the encrypted string (and you do not forget the IV: forgetting the IV is the classical mistake).

Your key derivation may be somewhat weak. It is perfect if SHA-256 behaves like a perfect random oracle, but it is known that SHA-256 does not behave like a random oracle (because of the so-called length-extension attack). It is similar to the reason why HMAC is HMAC, with two nested hash function invocations, instead of simple hashing (once) the concatenation of the MAC key and the data. TLS uses a specific key derivation function (which is called "the PRF" in the TLS specification) which should avoid any trouble. That function is built over SHA-256 (actually, over HMAC/SHA-256) and can be implemented around any typical SHA-256 implementation.

(I am not saying that I know how to attack your key derivation process; only that this is a tricky thing to make properly, and that its security may be assessed only after years of scrutiny from hundreds of cryptographers. Which is why reusing functions and protocols which have already been thoroughly examined is basically a good idea.)

In TLS there are two nonces, called the "client random" and the "server random". In your proposal you only have the "client random". What you lose here, security-wise, is kind of unclear. A cautious strategy would be to include a server random (i.e. another nonce chosen by Bob). The kind of things we want to avoid is when Alice and Bob run the protocol in both directions, and an attacker feeds messages from Alice to Alice herself. Complete analysis of what an attacker could do is complex (it is a whole branch of cryptography); generally speaking, nonces in both directions tend to avoid some issues.

If you send several packets, then you may have some issues about lost packets, duplicated packets ("replay attacks"), and packets arriving out of order. In the context of TLS, this should not "normally" happen because TLS is used over a medium which already ensures (under normal conditions, not counting active attacks) that data is transferred in strict order. Thus, TLS includes a sequence number into the data which goes in the MAC. This would detect any alteration from an attacker, include replay, lost records and record reordering. If possible, you should also use a sequence number.