Wednesday, March 4, 2015

Master Secret Generation in TLS/SSL


After reading about the recent FREAK vulnerability and how an attacker could cause a client to happily accept a session using an export cipher suite, even though the client never offered one in its Client Hello, I was curious how the master secret is generated for these sessions, and whether it differs from the more common cipher suites used today. Even more troubling, several articles researching the vulnerability reported that a number of cryptographic systems re-use the generated 512-bit keys across multiple sessions with different peers when using one of the many export cipher suites. In the context of the vulnerability, these generated keys are used for key exchange, so an attacker who manages to factor the key can recover (and subsequently tamper with) the pre-master secret, and thus ultimately the master secret.

In one of my previous articles outlining what is necessary to decrypt a TLS/SSL session that has already completed, I revealed that the master secret is generated by passing the pre-master secret (generated by the client), a non-null terminated static label ("master secret"), and the random values from the Client Hello and Server Hello into a pseudorandom function defined by the protocol:
   master_secret = PRF(pre_master_secret, "master secret",
                       ClientHello.random + ServerHello.random)
                       [0..47];

Much like the previous post, there's a question I wanted answered that started this article:
  • How do the client and server know which pseudorandom function ("PRF") to use? Is it dependent on the cipher suite chosen by the server during session negotiation?
We know how to interpret the structure of a cipher suite, such as TLS_RSA_WITH_AES_128_CBC_SHA:
  • This cipher suite was introduced in TLS 1.0 or later.
  • It specifies to use RSA for key exchange and authentication.
  • It specifies AES in cipher block chaining (CBC) mode for bulk symmetric encryption, with a 128-bit key.
  • Lastly, it cites SHA-1 for MAC calculation.
But nowhere does it state which HMAC is used for the PRF when the client and server calculate the master secret.
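To make the gap concrete, here is a minimal sketch that splits a classic cipher suite name into the parts listed above. The helper parse_cipher_suite is hypothetical and only understands the older TLS_<key exchange>_WITH_<cipher>_<MAC> naming convention; it is not how any TLS library actually looks up suites.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a classic TLS_<kx>_WITH_<cipher>_<mac> name into its parts.

    Hypothetical helper for illustration; it only understands the older
    naming convention used by suites such as TLS_RSA_WITH_AES_128_CBC_SHA.
    """
    protocol, rest = name.split("_", 1)
    key_exchange, cipher_and_mac = rest.split("_WITH_", 1)
    cipher, mac = cipher_and_mac.rsplit("_", 1)
    return {
        "protocol": protocol,
        "key_exchange": key_exchange,
        "cipher": cipher,
        "mac": mac,
    }


suite = parse_cipher_suite("TLS_RSA_WITH_AES_128_CBC_SHA")
print(suite)
```

Every component of the name maps to key exchange, bulk encryption, or the record MAC; none of them names a PRF.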

What I discovered is that, in most cases, it has more to do with the protocol version in use than with the cipher suite chosen. However, TLS 1.2 adds the ability for the cipher suite to define it.

Note that we're looking at the PRF in the context of generating the master secret, but the PRF is used for more than that. It's worth mentioning that the master secret is not the final keying material used for the symmetric cipher. TLS takes the additional step of passing the master secret through the defined PRF to calculate the respective keys. SSL 2.0 simply slices what would be our master secret. SSL 2.0 and 3.0 do not have the concept of a defined PRF, so I cite the effective equivalent in the context of creating the master secret.

Let's discuss how the PRF evolved through the TLS versions, starting with the most recent.

TLS 1.2


The PRF in TLS 1.2 uses HMAC-SHA256 over as many iterations as needed to generate the required number of bytes, as described in the RFC:
  This PRF with the SHA-256 hash function is used for all cipher
  suites defined in this document and in TLS documents published
  prior to this document when TLS 1.2 is negotiated.  New cipher
  suites MUST explicitly specify a PRF and, in general, SHOULD use the
  TLS PRF with SHA-256 or a stronger standard hash function.

  First, we define a data expansion function, P_hash(secret, data),
  that uses a single hash function to expand a secret and seed into an
  arbitrary quantity of output:

     P_hash(secret, seed) = HMAC_hash(secret, A(1) + seed) +
                         HMAC_hash(secret, A(2) + seed) +
                         HMAC_hash(secret, A(3) + seed) + ...

  where + indicates concatenation.

  A() is defined as:

     A(0) = seed
     A(i) = HMAC_hash(secret, A(i-1))

  P_hash can be iterated as many times as necessary to produce the
  required quantity of data. 

With the expansion function defined (you'll see the same construction in earlier versions of TLS), the PRF itself can be defined as follows, in a form unique to TLS 1.2:
  TLS [1.2]'s PRF is created by applying P_hash to the secret as:

     PRF(secret, label, seed) = P_<hash>(secret, label + seed)
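The two definitions above translate fairly directly into code. The following is a minimal sketch using Python's standard hmac module, assuming SHA-256 as the PRF hash; the function names and the all-zero and repeated-byte input values are placeholders of my own, not real handshake data.

```python
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Expand secret and seed into `length` bytes using the P_hash construction."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hash_name).digest()  # HMAC(secret, A(i) + seed)
    return output[:length]  # discard any surplus bytes from the last iteration


def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_SHA256(secret, label + seed)."""
    return p_hash(secret, label + seed, length, "sha256")


def tls12_master_secret(pre_master_secret: bytes, client_random: bytes,
                        server_random: bytes) -> bytes:
    """Take the first 48 bytes of the PRF output, i.e. PRF(...)[0..47]."""
    return tls12_prf(pre_master_secret, b"master secret",
                     client_random + server_random, 48)


# Placeholder inputs for illustration only.
pre_master_secret = bytes(48)
client_random = b"\x01" * 32
server_random = b"\x02" * 32
master_secret = tls12_master_secret(pre_master_secret, client_random, server_random)
print(master_secret.hex())
```

Since SHA-256 produces 32 bytes per HMAC, two iterations are enough to cover the 48-byte master secret, and the last 16 bytes of the second block are dropped.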

TLS 1.0 and TLS 1.1


Stepping back a version, it becomes a little more complicated. Instead of using only HMAC-SHA256, both TLS 1.0 and 1.1 use HMAC-MD5 and HMAC-SHA1 and XOR their results, on the premise that if one of the two hash functions becomes compromised, the PRF remains secure. They take the additional step of splitting the provided secret (the pre-master secret in our case), using one half in one HMAC and the other half in the other HMAC. As outlined in the RFC:
  TLS [1.1]'s PRF is created by splitting the secret into two halves and
  using one half to generate data with P_MD5 and the other half to
  generate data with P_SHA-1, then exclusive-ORing the outputs of these
  two expansion functions together.

  S1 and S2 are the two halves of the secret, and each is the same
  length.  S1 is taken from the first half of the secret, S2 from the
  second half.

Ultimately, using the same P_hash expansion and iteration logic that carried over into TLS 1.2:
  The PRF is then defined as the result of mixing the two pseudorandom
  streams by exclusive-ORing them together.

      PRF(secret, label, seed) = P_MD5(S1, label + seed) XOR
                                 P_SHA-1(S2, label + seed);

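For longer outputs, P_MD5 needs more iterations than P_SHA-1 because MD5's output is shorter (16 bytes versus 20 bytes): producing 80 bytes takes five iterations of P_MD5 but only four of P_SHA-1. For the 48-byte master secret, both happen to need three.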
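As a rough sketch of the split-and-XOR construction, again using Python's hmac module: the function names are my own, and the handling of odd-length secrets (the two halves share the middle byte) follows my reading of the TLS 1.0 and 1.1 RFCs rather than anything quoted above. The inputs are placeholders.

```python
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str) -> bytes:
    """Expand secret and seed into `length` bytes using the P_hash construction."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hash_name).digest()
    return output[:length]


def split_secret(secret: bytes) -> tuple:
    """Return (S1, S2); for an odd-length secret the middle byte appears in both."""
    half = (len(secret) + 1) // 2
    return secret[:half], secret[len(secret) - half:]


def tls10_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF = P_MD5(S1, label + seed) XOR P_SHA-1(S2, label + seed)."""
    s1, s2 = split_secret(secret)
    md5_stream = p_hash(s1, label + seed, length, "md5")
    sha1_stream = p_hash(s2, label + seed, length, "sha1")
    return bytes(x ^ y for x, y in zip(md5_stream, sha1_stream))


# Placeholder inputs for illustration only.
pre_master_secret = bytes(48)
client_random = b"\x01" * 32
server_random = b"\x02" * 32
master_secret = tls10_prf(pre_master_secret, b"master secret",
                          client_random + server_random, 48)
print(master_secret.hex())
```

Both streams are truncated to the requested length before the XOR, so the differing digest sizes only affect how many HMAC calls each side makes.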

SSL 3.0


In SSL 3.0, things were pretty simple. Unlike TLS 1.0 and later, SSL 3.0 did not chain iterations by feeding the output of one hash into the next. Instead of running two HMAC streams and XORing them, it nests the two hashes without any HMAC: each 16-byte block is an MD5 hash over the pre-master secret and an inner SHA-1 hash:
  For Diffie-Hellman, RSA, and FORTEZZA, the same algorithm is used to
  convert the pre_master_secret into the master_secret.  The
  pre_master_secret should be deleted from memory once the
  master_secret has been computed.

      master_secret =
        MD5(pre_master_secret + SHA('A' + pre_master_secret +
            ClientHello.random + ServerHello.random)) +
        MD5(pre_master_secret + SHA('BB' + pre_master_secret +
            ClientHello.random + ServerHello.random)) +
        MD5(pre_master_secret + SHA('CCC' + pre_master_secret +
            ClientHello.random + ServerHello.random));

That's it.
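The formula above is short enough to sketch in a few lines with Python's hashlib. The function name and the input values are placeholders of my own choosing.

```python
import hashlib


def ssl3_master_secret(pre_master_secret: bytes, client_random: bytes,
                       server_random: bytes) -> bytes:
    """Concatenate three MD5(pre_master_secret + SHA-1(label + ...)) blocks."""
    master = b""
    for label in (b"A", b"BB", b"CCC"):
        inner = hashlib.sha1(
            label + pre_master_secret + client_random + server_random
        ).digest()
        master += hashlib.md5(pre_master_secret + inner).digest()
    return master  # 3 blocks of 16 bytes = 48 bytes


# Placeholder inputs for illustration only.
pre_master_secret = bytes(48)
client_random = b"\x01" * 32
server_random = b"\x02" * 32
master_secret = ssl3_master_secret(pre_master_secret, client_random, server_random)
print(master_secret.hex())
```

Each block depends only on its label and the shared inputs, so the three blocks could be computed in any order.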

SSL 2.0


With SSL 2.0, there are several additional steps before the client and server end up with the proper keying material. The "master-key" (as it's referred to in the draft standard) acts more like the "pre-master secret" of subsequent versions, in that it's a value the client generates and sends to the server:
  The client sends this message when it has determined a master key for the
  server to use. Note that when a session-identifier has been agreed upon,
  this message is not sent.

  The CLEAR-KEY-DATA contains the clear portion of the MASTER-
  KEY. The CLEAR-KEY-DATA is combined with the SECRET-KEY-
  DATA (described shortly) to form the MASTER-KEY, with the
  SECRET-KEY-DATA being the least significant bytes of the final
  MASTER-KEY. The ENCRYPTED-KEY-DATA contains the secret
  portions of the MASTER-KEY, encrypted using the server's public key.

What later versions call the "master secret" is known as the "key-material" within the standard, and each block of it is generated by performing a single MD5 hash over the effective pre-master secret. Note that the hash is performed slightly differently depending on the cipher and how much keying material it needs, but ciphers mostly vary in which bytes of the keying material they use:
  SSL_CK_RC4_128_WITH_MD5
  SSL_CK_RC4_128_EXPORT40_WITH_MD5
  SSL_CK_RC2_128_CBC_WITH_MD5
  SSL_CK_RC2_128_CBC_EXPORT40_WITH_MD5
  SSL_CK_IDEA_128_CBC_WITH_MD5

        KEY-MATERIAL-0 = MD5[ MASTER-KEY, "0", CHALLENGE,
        CONNECTION-ID ]
        KEY-MATERIAL-1 = MD5[ MASTER-KEY, "1", CHALLENGE,
        CONNECTION-ID ]

  SSL_CK_DES_64_CBC_WITH_MD5

        KEY-MATERIAL-0 = MD5[ MASTER-KEY, CHALLENGE,
        CONNECTION-ID ]

  SSL_CK_DES_192_EDE3_CBC_WITH_MD5

        KEY-MATERIAL-0 = MD5[ MASTER-KEY, "0", CHALLENGE,
        CONNECTION-ID ]
        KEY-MATERIAL-1 = MD5[ MASTER-KEY, "1", CHALLENGE,
        CONNECTION-ID ]
        KEY-MATERIAL-2 = MD5[ MASTER-KEY, "2", CHALLENGE,
        CONNECTION-ID ]

The CHALLENGE is sent by the client to the server in the Client Hello, and the CONNECTION-ID is sent by the server to the client in the Server Hello; these are the precursors to what would evolve into the Client Hello and Server Hello randoms.
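A minimal sketch of the KEY-MATERIAL derivation with Python's hashlib follows. The function name and its blocks parameter are hypothetical, the inputs are placeholders, and I am assuming the "0", "1", and "2" in the formulas are the ASCII digit characters rather than zero-valued bytes.

```python
import hashlib


def ssl2_key_material(master_key: bytes, challenge: bytes,
                      connection_id: bytes, blocks: int) -> bytes:
    """Build KEY-MATERIAL from one MD5 hash per 16-byte block."""
    if blocks == 1:
        # SSL_CK_DES_64_CBC_WITH_MD5: a single hash with no counter character.
        return hashlib.md5(master_key + challenge + connection_id).digest()
    material = b""
    for i in range(blocks):
        counter = str(i).encode("ascii")  # assumed to be ASCII "0", "1", "2"
        material += hashlib.md5(
            master_key + counter + challenge + connection_id
        ).digest()
    return material


# Placeholder inputs for illustration only.
master_key = bytes(16)
challenge = b"\x01" * 16
connection_id = b"\x02" * 16
rc4_material = ssl2_key_material(master_key, challenge, connection_id, 2)
des3_material = ssl2_key_material(master_key, challenge, connection_id, 3)
print(rc4_material.hex())
```

Two blocks cover the RC4, RC2, and IDEA suites listed above, while the triple-DES suite needs three.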
To summarize:
  • TLS 1.2 uses HMAC-SHA256 in the PRF.
  • TLS 1.1 and TLS 1.0 use a combination of MD5 and SHA-1 HMACs in the PRF by splitting the secret into halves, one for each HMAC, and XORing the outputs at the end.
  • SSL 3.0 uses MD5 wrapped around an inner SHA-1 hash, with 3 non-recursive concatenated iterations.
  • SSL 2.0 uses only MD5, with a single, non-iterated hash per block of key material.
So we learned that the protocol version specifies the PRF, regardless of the cipher suite chosen. Of course, this brought up a subsequent question:
  • Could a cipher suite added in TLS 1.0 generate a master secret using anything other than the TLS 1.0 defined PRF?
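An initial investigation suggests this is possible. Cipher suites introduced in earlier TLS versions and still supported in later ones can be negotiated under those later versions, in which case the negotiated version's PRF applies; as the TLS 1.2 RFC quoted above states, its SHA-256 PRF is used for cipher suites defined in earlier documents when TLS 1.2 is negotiated.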
  • Which cipher suites actually change the pseudorandom function / HMAC used to generate the master secret?
That currently remains unclear. I'm sure that there are some cipher suites that do, but an initial investigation did not reveal any popular standardized cipher suites that deviate from the protocol defaults.

More information about the above topics can be found in the links throughout the article. In addition, the following documents were referenced: