Monday, December 31, 2018

Missing console warning with UEFI and Xen

I recently began building out two servers with the objective of running various Docker-wrapped applications with CoreOS and VMs within Xen on top of debian-stable. The motherboards are from ~2013 and have full support for UEFI. Surprisingly, I hadn't done a UEFI-enabled installation previously, or at least not one that I can explicitly recall, so I didn't think much about whether to use UEFI or do a legacy boot installation. I threw the debian-stable ISO on a USB stick, plugged it in, and booted it up.

All went well in the initial installation, until I installed the Xen kernel and attempted to boot into it, and was left with the curious warning on my VGA console (versions not accurate as I retroactively grabbed this from someone else's report, but the rest is representative):

Loading Xen 4.9-amd_64 ...
WARNING: no console will be available to OS
Loading Linux 4.15.0-20-generic ...
Loading initial ramdisk ...

After that, the system just hung. The system didn't appear to have completed booting, the network stack never came online, nor could I fish for the device on the local network and attempt to remotely connect to it. It was just frozen. If I rebooted it into the default debian-stable kernel, it had no issue.

I poked around the Internet and quickly learned that this was an issue between UEFI, Xen, and the seeming inability to set proper default grub2 settings. I found numerous questions along similar lines going back to 2013, although curiously all of them had different resolutions, and there wasn't a consistent solution or patch to address what amounted to a rather vague fatal warning being thrown into the console. Some of them revolved around improper memory settings for the framebuffer, while others seemed to be as simple as specifying the proper console option within the multiboot command in grub2.

I tried some of the more obvious solutions but had little luck getting the system to boot. I did not want to spend too much time investigating this, considering I knew I could just switch to legacy BIOS instead of using UEFI, as I did not need to support booting multiple modern operating systems, let alone any other operating systems. There is also a report currently open for this issue against xen-system-amd64, although it doesn't appear to be getting much attention, considering both individuals were able to work around the issue.

Without much additional thought, I decided to just wipe the system and reinstall with the intention of using legacy BIOS. For some reason I assumed that the installer would offer a choice of boot installation method, but it just proceeded with a grub-install without prompting for a target. So while I had the goal in mind to do a legacy BIOS installation, the installer simply assumed the boot method preference from the method by which I had booted into it, and thus also used the UEFI-specific grub packages corresponding to that method.

I had found it curious that it threw a warning about a missing EFI partition, and then wasn't surprised when the grub-install ultimately failed. I let it proceed and immediately rebooted into rescue mode to repair the boot partition and saw the EFI grub packages installed but using a target of i386-pc. It was then that I came to a realization that it installed the respective packages based on the installer boot method. Instead of messing with removing and reinstalling the appropriate grub packages, toying with fdisk, and given the base install is quite snappy, I booted into the installer via legacy BIOS this time and installation was seamless. Installed the Xen kernel and had no

I am still a little perplexed about the cause of the original incompatibility between my system, UEFI, and Xen, since I'd prefer not to fall back on an old technology to remediate an unknown issue involving a current one. Perhaps I'll explore more when I set up the second system.

Wednesday, March 4, 2015

Master Secret Generation in TLS/SSL


After reading about the recent FREAK vulnerability and how an attacker could cause a client to happily accept a session using an export cipher suite, even though it did not even offer it in the Client Hello, I was curious how the master secret was generated for these sessions, and whether it was any different from the more common cipher suites used today. Even more troubling, a number of articles researching the vulnerability revealed that a number of cryptographic systems would re-use the generated 512-bit keys for multiple sessions with different individuals when using one of the many export cipher suites. In the context of the vulnerability, these generated keys are used in key exchange, and thus if an attacker were able to factor the key, it would reveal (and allow the attacker to tamper with) the pre-master secret, and thus ultimately the master secret.

In one of my previous articles outlining what is necessary to decrypt a TLS/SSL session that has already completed, I revealed that the master secret is generated by passing the pre-master secret (generated by the client), a non-null terminated static label ("master secret"), and the random values from the Client Hello and Server Hello into a pseudorandom function defined by the protocol:
   master_secret = PRF(pre_master_secret, "master secret",
                       ClientHello.random + ServerHello.random)
                       [0..47];

Much like the previous post, there's a question I wanted answered that started this article:
  • How do the client and server know which pseudorandom function ("PRF") to use? Is it dependent on the cipher suite chosen by the server during session negotiation?
We know how to interpret the structure of a cipher suite, such as TLS_RSA_WITH_AES_128_CBC_SHA:
  • This cipher suite was introduced in TLS 1.0 or later.
  • It specifies to use RSA for key exchange and authentication.
  • It specifies AES in cipher block chaining (CBC) mode for bulk symmetric encryption, with a 128-bit key.
  • Lastly, it cites SHA-1 for MAC calculation.
But nowhere does it state what HMAC is used for the PRF when the Client and Server calculate the master secret.

What I discovered is that it actually has to do more with the TLS protocol in use than the cipher suite chosen in most cases. However, TLS 1.2 adds the capability for it to be defined in the cipher suite.

Note that we're looking at the PRF in the context of the generation of the master secret, but the PRF is used beyond just the calculating of the master secret. It's worthwhile to mention that the master secret is not the final keying material used for the symmetric cipher. TLS takes the additional step of passing the master secret through the defined PRF to calculate the respective keys. SSL 2.0 simply slices what would be our master secret. SSL 2.0 and 3.0 do not have the concept of a defined PRF, but I cited what is the effective equivalent in context of the creation of the master secret.

Let's discuss how the PRF evolved through the TLS versions, starting with the most recent.

TLS 1.2


The PRF in TLS 1.2 uses the SHA256 HMAC over a number of iterations to ultimately generate a necessary number of desired bytes, as described in the RFC:
  This PRF with the SHA-256 hash function is used for all cipher
  suites defined in this document and in TLS documents published
  prior to this document when TLS 1.2 is negotiated.  New cipher
  suites MUST explicitly specify a PRF and, in general, SHOULD use the
  TLS PRF with SHA-256 or a stronger standard hash function.

  First, we define a data expansion function, P_hash(secret, data),
  that uses a single hash function to expand a secret and seed into an
  arbitrary quantity of output:

     P_hash(secret, seed) = HMAC_hash(secret, A(1) + seed) +
                         HMAC_hash(secret, A(2) + seed) +
                         HMAC_hash(secret, A(3) + seed) + ...

  where + indicates concatenation.

  A() is defined as:

     A(0) = seed
     A(i) = HMAC_hash(secret, A(i-1))

  P_hash can be iterated as many times as necessary to produce the
  required quantity of data. 

Now that the expansion function is defined, which you'll see is also common to earlier versions of TLS, the PRF can be defined as follows, which is unique to TLS 1.2:
  TLS [1.2]'s PRF is created by applying P_hash to the secret as:

     PRF(secret, label, seed) = P_<hash>(secret, label + seed)
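
As a rough sketch of the P_hash expansion and the TLS 1.2 PRF quoted above, here is how it might look using only Python's standard hmac module. The function names (p_hash, tls12_prf, tls12_master_secret) are my own for illustration and are not taken from any TLS library; this is meant for reading alongside the RFC text, not for use in a real implementation.

```python
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Expand secret and seed into `length` bytes, following P_hash above."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        # A(i) = HMAC_hash(secret, A(i-1))
        a = hmac.new(secret, a, hash_name).digest()
        # Append HMAC_hash(secret, A(i) + seed) to the output stream
        output += hmac.new(secret, a + seed, hash_name).digest()
    # The last block is truncated to the requested length
    return output[:length]


def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_SHA256(secret, label + seed)."""
    return p_hash(secret, label + seed, length, "sha256")


def tls12_master_secret(pre_master_secret: bytes,
                        client_random: bytes,
                        server_random: bytes) -> bytes:
    """First 48 bytes of PRF(pre_master_secret, "master secret", randoms)."""
    seed = client_random + server_random
    return tls12_prf(pre_master_secret, b"master secret", seed, 48)
```

With SHA-256 producing 32 bytes per HMAC, the 48-byte master secret takes two iterations, with the second block truncated.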

TLS 1.0 and TLS 1.1


Stepping back a version, it becomes a little more complicated. Instead of just using SHA256 as the HMAC, both TLS 1.0 and 1.1 use MD5 and SHA-1 and XOR their results, on the premise that if one of the two hash functions becomes compromised, the security of the PRF remains intact. It takes the additional step of splitting the provided secret (the pre-master secret in our case), using one half in one HMAC and the other half in the other HMAC. As outlined in the RFC:
  TLS [1.1]'s PRF is created by splitting the secret into two halves and
  using one half to generate data with P_MD5 and the other half to
  generate data with P_SHA-1, then exclusive-ORing the outputs of these
  two expansion functions together.

  S1 and S2 are the two halves of the secret, and each is the same
  length.  S1 is taken from the first half of the secret, S2 from the
  second half.

The PRF then uses the same P_hash expansion and iteration logic that was carried forward into TLS 1.2:
  The PRF is then defined as the result of mixing the two pseudorandom
  streams by exclusive-ORing them together.

      PRF(secret, label, seed) = P_MD5(S1, label + seed) XOR
                                 P_SHA-1(S2, label + seed);

There will need to be more iterations of P_MD5 due to MD5's shorter output length relative to SHA-1 (16 bytes versus 20 bytes).
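
The split-and-XOR construction can be sketched the same way. The names below are again my own, and the handling of odd-length secrets (the two halves share the middle byte) follows my reading of the TLS 1.0 RFC rather than anything quoted above, so treat it as an assumption.

```python
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str) -> bytes:
    """P_hash expansion: chain A(i) values and concatenate HMAC outputs."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hash_name).digest()
        output += hmac.new(secret, a + seed, hash_name).digest()
    return output[:length]


def tls10_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF = P_MD5(S1, label + seed) XOR P_SHA-1(S2, label + seed)."""
    # Each half is ceil(len / 2) bytes; for an odd-length secret the
    # middle byte ends up in both halves (assumption, see lead-in).
    half = (len(secret) + 1) // 2
    s1 = secret[:half]
    s2 = secret[len(secret) - half:]
    md5_stream = p_hash(s1, label + seed, length, "md5")
    sha1_stream = p_hash(s2, label + seed, length, "sha1")
    return bytes(x ^ y for x, y in zip(md5_stream, sha1_stream))
```

For the 48-byte master secret specifically, both streams happen to need three iterations (3 x 16 bytes for MD5, 3 x 20 bytes truncated for SHA-1); the iteration counts only diverge for longer outputs such as the key block.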

SSL 3.0


In SSL 3.0, things were pretty simple. Unlike TLS 1.0 and later, SSL 3.0 did not perform recursive iterations by using the output of the previous hash as input into the next iteration. Also, instead of XORing two independent HMAC streams, it simply nests a SHA-1 hash inside an MD5 hash:
  For Diffie-Hellman, RSA, and FORTEZZA, the same algorithm is used to
  convert the pre_master_secret into the master_secret.  The
  pre_master_secret should be deleted from memory once the
  master_secret has been computed.

      master_secret =
        MD5(pre_master_secret + SHA('A' + pre_master_secret +
            ClientHello.random + ServerHello.random)) +
        MD5(pre_master_secret + SHA('BB' + pre_master_secret +
            ClientHello.random + ServerHello.random)) +
        MD5(pre_master_secret + SHA('CCC' + pre_master_secret +
            ClientHello.random + ServerHello.random));

That's it.
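
The formula above translates almost line for line into Python's hashlib. This is only an illustrative sketch; the function name is mine, and the inputs are assumed to be raw byte strings.

```python
import hashlib


def ssl3_master_secret(pre_master_secret: bytes,
                       client_random: bytes,
                       server_random: bytes) -> bytes:
    """Concatenate three MD5(pre_master + SHA(salt + ...)) blocks."""
    master_secret = b""
    for salt in (b"A", b"BB", b"CCC"):
        # Inner SHA-1 over the salt, pre-master secret, and both randoms
        inner = hashlib.sha1(
            salt + pre_master_secret + client_random + server_random
        ).digest()
        # Outer MD5 over the pre-master secret and the inner digest
        master_secret += hashlib.md5(pre_master_secret + inner).digest()
    return master_secret  # 3 x 16 bytes = 48 bytes
```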

SSL 2.0


With SSL 2.0, there are many more additional steps before the client and server end up with the proper keying material. The "master-key" (as it's referred to in the draft standard) acts more like subsequent versions' "pre-master secret" in that it's a value the client generates and sends to the server:
  The client sends this message when it has determined a master key for the
  server to use. Note that when a session-identifier has been agreed upon,
  this message is not sent.

  The CLEAR-KEY-DATA contains the clear portion of the MASTER-
  KEY. The CLEAR-KEY-DATA is combined with the SECRET-KEY-
  DATA (described shortly) to form the MASTER-KEY, with the
  SECRET-KEY-DATA being the least significant bytes of the final
  MASTER-KEY. The ENCRYPTED-KEY-DATA contains the secret
  portions of the MASTER-KEY, encrypted using the server's public key.

The traditional "master secret" is known as the "key-material" within the standard, and is generated by performing an MD5 hash on the effective pre-master secret. Note that the hash is performed slightly differently depending on the cipher; it depends on whether the cipher uses multiple keys, but mostly varies in which set of bytes from the keying material is used:
  SSL_CK_RC4_128_WITH_MD5
  SSL_CK_RC4_128_EXPORT40_WITH_MD5
  SSL_CK_RC2_128_CBC_WITH_MD5
  SSL_CK_RC2_128_CBC_EXPORT40_WITH_MD5
  SSL_CK_IDEA_128_CBC_WITH_MD5

        KEY-MATERIAL-0 = MD5[ MASTER-KEY, "0", CHALLENGE,
        CONNECTION-ID ]
        KEY-MATERIAL-1 = MD5[ MASTER-KEY, "1", CHALLENGE,
        CONNECTION-ID ]

  SSL_CK_DES_64_CBC_WITH_MD5

        KEY-MATERIAL-0 = MD5[ MASTER-KEY, CHALLENGE,
        CONNECTION-ID ]

  SSL_CK_DES_192_EDE3_CBC_WITH_MD5

        KEY-MATERIAL-0 = MD5[ MASTER-KEY, "0", CHALLENGE,
        CONNECTION-ID ]
        KEY-MATERIAL-1 = MD5[ MASTER-KEY, "1", CHALLENGE,
        CONNECTION-ID ]
        KEY-MATERIAL-2 = MD5[ MASTER-KEY, "2", CHALLENGE,
        CONNECTION-ID ]
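
A small sketch of the KEY-MATERIAL derivation listed above. I am assuming that the comma-separated fields are simply concatenated before hashing and that "0", "1", "2" are the ASCII digit characters; the function and parameter names are hypothetical.

```python
import hashlib


def ssl2_key_material(master_key: bytes,
                      challenge: bytes,
                      connection_id: bytes,
                      block_count: int,
                      use_counter: bool = True) -> bytes:
    """Concatenate KEY-MATERIAL-0 .. KEY-MATERIAL-(block_count - 1)."""
    material = b""
    for i in range(block_count):
        # Assumption: the counter is the ASCII digit, not a zero byte
        counter = str(i).encode("ascii") if use_counter else b""
        material += hashlib.md5(
            master_key + counter + challenge + connection_id
        ).digest()
    return material
```

Under these assumptions, the RC4, RC2, and IDEA suites would call it with block_count=2, SSL_CK_DES_192_EDE3_CBC_WITH_MD5 with block_count=3, and SSL_CK_DES_64_CBC_WITH_MD5 with block_count=1 and use_counter=False.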

The CHALLENGE is sent by the Client to the Server in the Client Hello, and the CONNECTION-ID is sent by the Server to the Client in the Server Hello; these are the precursors to what would evolve into the Client Hello and Server Hello randoms.
  • TLS 1.2 uses SHA256 as the HMAC in the PRF.
  • TLS 1.1 and TLS 1.0 use a combination of MD5 and SHA-1 HMACs in the PRF by splitting the secret into halves, one for each HMAC, and ultimately XORing the outputs at the end.
  • SSL 3.0 uses MD5 wrapped around SHA-1 (without HMAC), with 3 non-recursive concatenated iterations.
  • SSL 2.0 uses only MD5, with 1 hash iteration.
So we learned that the protocol version specifies the PRF used, regardless of the cipher suite chosen. Of course, this brought up a subsequent question:
  • Could a cipher suite added in TLS 1.0 generate a master secret using anything other than the TLS 1.0 defined PRF?
Initial investigation suggests this is possible. Ciphers introduced in early TLS protocols and still supported in later protocols can be negotiated in those later protocols, in which case the later protocol's PRF applies.
  • Which cipher suites actually change the pseudorandom function / HMAC used to generate the master secret?
That currently remains unclear. I'm sure there are some cipher suites that do, but an initial investigation did not reveal any popular standardized ciphers that deviate from the protocol defaults.

More information about the above topics can be found from their associated links contained within the article, in addition the following documents were referenced:

Saturday, May 24, 2014

Exploring DNS - Part 1: The Basics


dig is a utility commonly found on Linux and OS X platforms and is part of the BIND software suite. You'll have to download it separately if you're using Windows. It has been a staple Domain Name System (DNS) client utility since its introduction. This will be a three-part (at minimum) series exploring the DNS infrastructure that is crucial to the functioning of nearly every Internet service available, while learning the capabilities of dig at the same time. DNS' criticality is obvious when the root cause of someone's Internet not working is that some node handling DNS transactions is unresponsive or misconfigured, which mimics having no Internet access at all.

Most people familiar with DNS know that it can find the IP address of a specific domain, which allows the computer you're using and the network it's on to find and talk to that resource, but there's much more to it than that. These transactions happen automatically, and there's rarely a need to do a manual query, except when things do not work the way you want. This article discusses the variety of things that can be discovered about a network through DNS queries.

The article attempts to tailor to both a non-technical and technical crowd, hopefully allowing those unfamiliar with certain terminology to gloss over them and still understand the message of a particular section.


A and PTR Records


A domain can often be thought of for many users as simply a friendly name for an IP address. If you wanted to browse to "www.google.com", it'd be a lot easier typing that than memorizing and typing "74.125.226.208" into your browser each time. Administrators define what these friendly names translate to as "A" records in their domain's zone configuration on the authoritative nameservers for that domain (we'll talk more about those later).

To perform a simple query for a domain's IP address (also known as an A record lookup), you simply specify the domain and type as follows:
 dig example.com A

Executing this command will cause your computer to ask the configured DNS server on your local system to find what the A record is for the domain "example.com". Depending on your network configuration, this is something that's statically configured by you, an Administrator, or through an automated deployment process. Otherwise, it's often provided automatically as part of DHCP via a configuration option. Most often your local DNS server will not have the record you're looking for cached in its system, so it will have to go through a series of steps querying authoritative nameservers on your behalf to ultimately find the answer to your initial query.

In our example, let's pretend "example.com" resolves to "93.184.216.119". You'll receive this answer with what appears to be a lot of other useless information, but is a parsed representation of the received answer:
 ; <<>> DiG 9.8.3-P1 <<>> example.com A
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26804
 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

 ;; QUESTION SECTION:
 ;example.com.                   IN      A

 ;; ANSWER SECTION:
 example.com.            1734    IN      A       93.184.216.119

 ;; Query time: 18 msec
 ;; SERVER: 75.75.75.75#53(75.75.75.75)
 ;; WHEN: Tue Jan  7 23:16:14 2014
 ;; MSG SIZE  rcvd: 45

If you only want the answer to the query (the IP address) and do not care about the rest of it, you can use some of dig's filtering options to get the exact information you want.
 dig example.com A +short
In this case, you'll simply get the list of IPs (or IP) back, each on its own line.

What if you had the IP address but wanted to know what the IP address resolves to, or what's commonly known as a reverse lookup? This information is defined as a different record type, or query type. Instead of querying for the "A" record of that domain, we will be querying for the "PTR" record of the IP address. There are two ways you can do this, either through a shortcut or a verbose query mimicking how it actually is sent out on the wire.
 dig -x 93.184.216.119

The '-x' is a shortcut to do a reverse lookup. To do this explicitly:
 dig 119.216.184.93.in-addr.arpa PTR

You might've noticed that what we queried for does not look like an IP address, and in fact it's also reversed. The way nameservers discover who is responsible, or authoritative, for a particular domain is a fundamental part of what makes DNS work. For a standard domain lookup, your caching nameserver will take the domain, "example.com.", and find out who is authoritative for each successive word from right to left, delimited by the 'dot'. In this case, it'll ask the DNS root-servers who is authoritative for the "com" (top-level) domain. Once it knows who is (and also caches that answer as defined by the TTL so it does not have to query the root-servers again), it sends a request to those servers asking who is authoritative for the "example" domain. Once it learns those nameservers, it directs the original query to one of them, which ultimately responds with our desired answer. PTR record lookups work the same way for the same reason.

As certain individuals or organizations are responsible or authoritative for domains, they can also be authoritative for IP addresses, and there must be a way to represent that. One organization is responsible for "google.com" and another for "apple.com", and the same can be said about the 4.0.0.0/8 IP CIDR and 24.86.1.0/24 CIDR. Thus, reversing the IP address and suffixing it with a special top-level domain (.in-addr.arpa for IPv4 space) causes the IP address to be looked up in the same way as a domain, allowing proper delegation to lower level organizations. For example, Verizon might be the owner of 24.97.0.0/16, but Verizon might route and give authority for 24.97.2.0/24 to one of their business customers, and thus allow that customer to define how that IP space resolves. There are certain services that require circular records to be in place for a set of domains and IP addresses, meaning a domain that resolves to an IP address as defined by its "A" record must have a "PTR" record on that IP address pointing back to the domain.
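
To make the reversal concrete, here is a short Python sketch that builds the name dig queries for a reverse lookup. The standard library's ipaddress module exposes this as reverse_pointer; the manual version (a helper name of my own) is included only to show what is happening for IPv4.

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Return the PTR query name using the standard library."""
    return ipaddress.ip_address(ip).reverse_pointer


def manual_reverse_name(ipv4: str) -> str:
    """Reverse the octets by hand and append the IPv4 reverse zone."""
    octets = ipv4.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


print(reverse_lookup_name("93.184.216.119"))  # 119.216.184.93.in-addr.arpa
print(manual_reverse_name("93.184.216.119"))  # 119.216.184.93.in-addr.arpa
```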

CNAME


Not all domains have to resolve directly to IP addresses. If I hosted a webserver which contained many different websites on many different domains at a specific IP address, it would be an unmanageable process for me (or the people who own domains but choose to host their site on my server) to update their A records if the IP address of my server ever changed, or if I wanted to introduce DNS-based load-balancing. To ease this process, authoritative nameservers can point the requesting client to another domain as an answer to its query. As a specific example, let's query for www.apple.com's A record:
 dig www.apple.com A +noall +answer

This leads us to the following chain of answers:
 www.apple.com.          1306    IN      CNAME   www.isg-apple.com.akadns.net.
 www.isg-apple.com.akadns.net. 39 IN     CNAME   www.apple.com.edgekey.net.
 www.apple.com.edgekey.net. 550  IN      CNAME   e3191.dscc.akamaiedge.net.
 e3191.dscc.akamaiedge.net. 19   IN      A       23.48.61.15

Apple's authoritative nameserver redirected the client via a CNAME answer to an Apple-specific subdomain on 'akadns.net' (Akamai), which is a third-party content delivery organization. Akamai's authoritative nameserver then leads through two more CNAME records before ultimately resolving to an IP address that the client can request the web page from. You'll notice a number after each of the domains. This is called the TTL value, or Time To Live, and specifies how long, in seconds, other nameservers and your client should cache that record. The authoritative nameserver does not expect the record to change during that period, and if it does, the TTL is short enough that your computer will find the right answer after a reasonable amount of time. This is a value configured and chosen by Administrators.
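
The chain in the answer above can also be walked programmatically. The sketch below parses the answer lines exactly as dig printed them and follows each CNAME until it reaches an A record; the helper names are hypothetical and it does no DNS queries of its own.

```python
ANSWER = """\
www.apple.com.          1306    IN      CNAME   www.isg-apple.com.akadns.net.
www.isg-apple.com.akadns.net. 39 IN     CNAME   www.apple.com.edgekey.net.
www.apple.com.edgekey.net. 550  IN      CNAME   e3191.dscc.akamaiedge.net.
e3191.dscc.akamaiedge.net. 19   IN      A       23.48.61.15
"""


def parse_answer(text):
    """Map each owner name to its (type, ttl, value) tuple."""
    records = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip anything that is not name/ttl/class/type/value
        name, ttl, _rclass, rtype, value = fields
        records[name] = (rtype, int(ttl), value)
    return records


def follow_chain(records, name):
    """Follow CNAME records from `name` until a non-CNAME record is hit."""
    hops, seen = [], set()
    while name in records and name not in seen:
        seen.add(name)  # guard against CNAME loops
        rtype, ttl, value = records[name]
        hops.append((name, rtype, ttl, value))
        if rtype != "CNAME":
            break
        name = value
    return hops


for hop in follow_chain(parse_answer(ANSWER), "www.apple.com."):
    print(hop)
```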

DNS is not limited to simply resolving domains to IP address or vice versa, although clearly that's its predominant use for end-users. There have been many other uses for DNS implemented since its inception, and that's what we'll start taking a look at now.

MX Records


At this point, everyone using the Internet has likely sent an e-mail. Each e-mail address is suffixed with the domain of the hosting company or organization that represents the "destination" the mail should be delivered to. If Alice sends an e-mail to Bob, once Alice writes an e-mail addressed to bob@gmail.com and hits the send button, the e-mail is handed off to Alice's e-mail server. The e-mail server doesn't know what IP address to send the e-mail to at this point. Through our previous example, we demonstrated we could find the IP address that a domain resolves to by using an "A" record query, but that may not necessarily be the resource that also handles receiving mail for that domain. Thus, a special query type was specified to find the "Mail Exchange" servers for a particular domain, the MX record.

It ultimately looks very much like a CNAME answer in that its response contains other domains, with the exception that multiple answers are to be expected and the priority of each mail server is specified.

If we wanted to figure out gmail.com's receiving mail servers that Alice's e-mail server should be sending the mail to, we can issue the following command:
 dig gmail.com MX +short

That responds with:
 30 alt3.gmail-smtp-in.l.google.com.
 20 alt2.gmail-smtp-in.l.google.com.
 40 alt4.gmail-smtp-in.l.google.com.
 10 alt1.gmail-smtp-in.l.google.com.
 5 gmail-smtp-in.l.google.com.

Since the highest priority is specified by the lowest number, Alice's mail server would issue another DNS request to find the IP address of gmail-smtp-in.l.google.com via an "A" record lookup so it can initiate the SMTP connection and send the mail on its way.
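
As a small illustration of that selection step, the following sketch parses the +short output shown above and orders the mail servers by preference, lowest number first. The function name is my own, and a real mail server would also handle ties and fall back to the next entry on failure.

```python
MX_SHORT_OUTPUT = """\
30 alt3.gmail-smtp-in.l.google.com.
20 alt2.gmail-smtp-in.l.google.com.
40 alt4.gmail-smtp-in.l.google.com.
10 alt1.gmail-smtp-in.l.google.com.
5 gmail-smtp-in.l.google.com.
"""


def parse_mx(short_output):
    """Return (preference, host) tuples sorted by ascending preference."""
    records = []
    for line in short_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        records.append((int(parts[0]), parts[1]))
    return sorted(records)


preferred = parse_mx(MX_SHORT_OUTPUT)[0]
print(preferred)  # (5, 'gmail-smtp-in.l.google.com.')
```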

These four record types, A, PTR, CNAME, and MX, are the most common record types, and perhaps the most frequently queried. However, there are other records that play a prominent role in many services, for example TXT records and their common use for SPF and DKIM. In the next part, we'll dive into TXT records and their common uses. Subsequent posts will take a look into the SOA record and what it can reveal, response codes, proper and improper DNSSEC configuration, AXFR and IXFRs, and some of the more extravagant record types.

Sunday, May 18, 2014

Diffie-Hellman Key Exchange

During my investigation in decrypting TLS/SSL streams, I was slightly irritated by the lack of clear distinction between the various Diffie-Hellman key exchange types. I recognized one type provided perfect forward secrecy, but it was unclear why the others did not. The way the Diffie-Hellman Wikipedia article was written, it seemed like these values would be generated on the fly in all cases. Questions were coming up in my head that I could not get answered, such as:
  • What is the source material for each Diffie-Hellman type generated from?
  • What pseudorandom function is used to generate those random values? Are they permanently stored by the endpoint in some DH types?
  • What were the actual distinctions between the TLS_DH_* and TLS_DHE_* ciphers in OpenSSL?
  • When is RSA public key cryptography used with Diffie-Hellman, with or without authentication?
  • Is public-key cryptography with Diffie-Hellman directly used in the transaction for both authentication and determination of the symmetric key in some types?
If you came to this article after reading the abstracted Diffie-Hellman Wikipedia article on the exchange mechanics, you might be confused a bit by where some of the information originates from. In the most practical case of using Diffie-Hellman, within the SSL/TLS negotiation phase, the server is responsible for determining the public parameters for use in that session (for the case where they are not static within a certificate).

Let's take a look at the three types of Diffie-Hellman:

Fixed Diffie-Hellman

Represented as the TLS_(EC)DH_* ciphers in OpenSSL.

The server's cryptographic material intended for the client is explicitly contained within the server certificate. The client parameters will be contained within the client certificate, or the client will send them in a subsequent Client Key Exchange if it does not have a certificate. Because of this, the cipher is not considered anonymous, as the cryptographic material needed to establish a master secret is derived from the material contained in at least the server certificate, or in both the client and server certificates. The Diffie-Hellman parameters (group and generator) must match between the client and server certificates. The keyAgreement bit on the client and server certificates must be set. If a client certificate and server certificate are used in the key exchange, they should consistently result in the same pre-master secret being calculated.

Ephemeral Diffie-Hellman

Represented as the TLS_(EC)DHE_* ciphers in OpenSSL.

This cipher is not an anonymous type, and the ability to verify the authenticity of the server must be provided through the use of an RSA or DSA certificate. A Server Key Exchange message is necessary to give the client material to complete the exchange, as neither the parameters nor the public value are contained within the RSA or DSA certificate. The Server Key Exchange message includes the prime modulus and generator (p, g) that will be used for the key exchange by the server and client, in addition to the server's calculated public value. The server also includes a signed hash composed of these parameters and the Client and Server Hello random numbers in order to authenticate itself to the client. The client will also provide a Client Key Exchange message, after sending its own RSA or DSA certificate if requested, containing its own calculated public value. The client uses the Certificate Verify message to authenticate itself, as it does in straight RSA key exchange.

Anonymous Diffie-Hellman

Represented as the TLS_(EC)DH_anon_* ciphers in OpenSSL.

This is the only case where the server must not send a certificate to the client, as the protocol is designed to be a vanilla Diffie-Hellman exchange. A Server Key Exchange message is necessary to give the client material to complete the exchange: the prime modulus, generator, and the server's public value. The client will also provide a Client Key Exchange message containing its calculated public value. As stated in many explanations, this is easily susceptible to Man-in-the-Middle attacks, so its use is often not recommended.
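
All three types share the same underlying arithmetic; they differ in where the values come from and whether anything is signed. The toy sketch below shows that arithmetic with a deliberately tiny prime (p = 23, g = 5, chosen only for readability and far too small for real use). The function names are mine, and no certificates or signatures are modeled.

```python
import secrets


def dh_keypair(p: int, g: int):
    """Pick a private exponent in [2, p - 2] and compute g^x mod p."""
    private = secrets.randbelow(p - 3) + 2
    public = pow(g, private, p)
    return private, public


def dh_shared(their_public: int, my_private: int, p: int) -> int:
    """Compute the shared value (the pre-master secret in TLS terms)."""
    return pow(their_public, my_private, p)


P, G = 23, 5  # toy parameters, an assumption for illustration only

# Fixed DH: the server's key pair is long-lived (it lives in the certificate)
server_private, server_public = dh_keypair(P, G)
# Ephemeral and anonymous DH: a fresh pair would be generated per session
client_private, client_public = dh_keypair(P, G)

server_view = dh_shared(client_public, server_private, P)
client_view = dh_shared(server_public, client_private, P)
print(server_view == client_view)  # True
```

With fixed Diffie-Hellman, reusing the same two key pairs yields the same shared value every time, which is why that type does not provide forward secrecy.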



More information about the above topics can be found from their associated links contained within the article, in addition the following articles were referenced:

Saturday, May 3, 2014

TLS/SSL Client Certificates and Stream Decryption

I recently ran into a situation where I wanted to decrypt a TLS/SSL stream, but I only had the client certificate private key available and not the server certificate private key -- quite a common situation when client certificate authentication happens to be required! I was under the incorrect assumption that a key exchange protocol had been used during the transaction where the client used its private key as a source for the resulting master secret. I have come to learn that this was not the case, and it is here that I will describe why.

For those unfamiliar, TLS/SSL is an application layer protocol that uses both asymmetric and symmetric encryption to secure the data being transmitted. For the majority of ciphers, asymmetric encryption is briefly used to generate a shared master secret (and/or authenticate the endpoints if appropriate), which is then used in the symmetric encryption portion of the client/server exchanges to pass application data back and forth. There are many libraries available for application developers to use, but more often they rely on one simply provided by the operating system. To be able to decrypt a stream and view the contents of the exchange, one must have the shared master secret used in the symmetric encryption of that data. To get that shared master secret after an exchange has completed, when all you're left with is a pcap file of the transactions after the session has terminated, you need to recalculate it based on the information exchanged during the key agreement of the handshake, which includes the pre-master secret that the master secret is derived from. In the case of simple RSA key agreement, since the server's public and private key pair is used as part of this process to generate the master secret, the private key is necessary in order to recalculate it.

I only had the client certificate private key, and at first my investigation started with exploring why Wireshark was attempting to associate the client certificate private key I provided within the SSL protocol preferences with only the server certificate in the handshake, and did not attempt to apply it to the client certificate, even though I used the IP address of the client when making the private key available to Wireshark. Further investigation led me to realize that the client certificate's public and private keys are not used in the generation of the master secret used for the transaction.

So this led me to the following two questions:
  • Are there any key exchange protocols that use the client certificate as part of the key generation material? 
  • For plain RSA, what's the process used to generate the pre-master secret?
It's best to step back and walk through the negotiation phase of the SSL handshake to understand how the client and server determine the final master secret. The full handshake is depicted as such in the TLS RFC:

      Client                                             Server
      ------                                             ------

      ClientHello             --------> 
                                                    ServerHello
                                                   Certificate*
                                             ServerKeyExchange*
                                            CertificateRequest*
                              <--------         ServerHelloDone
      Certificate*
      ClientKeyExchange
      CertificateVerify*
      [ChangeCipherSpec]
      Finished                -------->
                                             [ChangeCipherSpec]
                              <--------                Finished
      Application Data        <------->        Application Data

             Figure 1.  Message flow for a full handshake

   * Indicates optional or situation-dependent messages that are not
   always sent.
In the Client Hello, the client proposes all the cipher suites it supports to the server. The server in turn selects only one of those and advertises it in its Server Hello. At this point, both client and server know which cipher suite to use. If the server did not like any of the cipher suites proposed by the client, it could immediately send an alert and end the connection, leaving the client out of luck. With that though, we have to make an addendum to my earlier comment on how TLS/SSL uses both asymmetric encryption and symmetric encryption to protect data. TLS can be broken into three components:
  • Key Exchange 
  • Privacy 
  • Integrity
These three components can be seen in the cipher suites offered by the client, which again can be easily seen using Wireshark or by taking a look at the TLS/SSL RFCs or OpenSSL's supported cipher list. Let's use one particular, common example: TLS_RSA_WITH_AES_128_CBC_SHA. This tells us multiple things:
  1. This cipher suite was introduced in TLS 1.0 or later. 
  2. It specifies the use of RSA for asymmetric encryption, and in this case the key exchange as well. 
  3. It specifies the use of the block cipher AES in cipher block chaining (CBC) mode for symmetric encryption, with a 128-bit key. 
  4. Lastly, it cites SHA-1 for MAC calculation.
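As a rough illustration of the breakdown above, the following Python sketch splits an RFC-style suite name into its parts. The helper name `parse_cipher_suite` and the dictionary keys are made up for this example and do not come from any TLS library. It also assumes the classic `TLS_<key exchange>_WITH_<cipher>_<mac>` naming, so it will mislabel newer AEAD suites, where the final component names the PRF hash rather than a MAC.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a name such as TLS_RSA_WITH_AES_128_CBC_SHA into its components.

    Hypothetical helper for illustration; assumes the
    TLS_<key exchange>_WITH_<cipher>_<mac> naming convention.
    """
    protocol, _, rest = name.partition("_")
    key_exchange, separator, cipher_and_mac = rest.partition("_WITH_")
    if not separator:
        raise ValueError(f"unexpected cipher suite name: {name}")
    # Everything before the last underscore is the bulk cipher (privacy),
    # the last token is the MAC hash (integrity).
    cipher, _, mac = cipher_and_mac.rpartition("_")
    return {
        "protocol": protocol,
        "key_exchange": key_exchange,
        "cipher": cipher,
        "mac": mac,
    }


suite = parse_cipher_suite("TLS_RSA_WITH_AES_128_CBC_SHA")
for component, value in suite.items():
    print(f"{component}: {value}")
```

For the example suite this yields RSA as the key exchange, AES_128_CBC as the cipher, and SHA as the MAC, which lines up with the Key Exchange, Privacy, and Integrity components listed earlier.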
We care about the Key Exchange portion here. The handshake I described above takes place when using RSA, which in this context is considered an authenticated key exchange protocol using public key cryptography. There are other key exchange protocols, including the unauthenticated and authenticated Diffie-Hellman protocol, but we'll focus on plain RSA here. Since this example cipher does not utilize an anonymous key exchange protocol, the server must send the client a certificate after the Server Hello whose type is appropriate for the determined cipher. The server will not send over a Server Key Exchange message as the client already has the public key of the server for use in exchanging the determined pre-master secret, and confirming that the server owns the private key to that server certificate. A Server Key Exchange message is necessary for key exchange protocols that are not using the public keying material in the described behavior, which includes almost all iterations of Diffie-Hellman (except when the material is included in the certificates, as is the case for Fixed Diffie-Hellman).

After the Server Hello, the server may send a Certificate Request message to the client, instructing it to send the server the client's own certificate if it has one present. Just like the server certificate, this client certificate has an appropriate private and public key, which will be used shortly by the server to ensure the client actually has the private key associated with that certificate. If the client does not have an available certificate, it will not send anything (in TLS 1.2 the client will send a Certificate message with an empty certificate list), at which point the server can treat this as a fatal error or continue with the connection. It's fairly rare to find a deployment of HTTPS (HTTP over SSL/TLS) that requires a client certificate (if the cipher does not require it), but the client certificate requirement is much more common in applications that implement SSL/TLS to conduct secure, dual-authenticated client-server transactions. The server finishes its first part with a Server Hello Done message.

As mentioned, if the client has a certificate, it will now send it over to the server. The next expected message sent by the client is the important Client Key Exchange message, which is always going to be sent, even though the contents of it may not be necessary. In the case of Fixed Diffie-Hellman, this information could be contained in the certificate the client just sent to the server, so the Client Key Exchange message may be left empty. For all others, this message is either going to contain the RSA-encrypted pre-master secret or the client's Diffie-Hellman public value.

The RSA pre-master secret is a 48-byte value, generated by the client, composed of the 2-byte client version found within the Client Hello and 46 (pseudo)random bytes. So how are the remaining 46 bytes created? Well, it's entirely up to the SSL/TLS library the client application is using, unless the client implemented the SSL/TLS protocol itself. Here is the pre-master secret generation code snippet from OpenSSL 1.0.1c:

       unsigned char tmp_buf[SSL_MAX_MASTER_KEY_LENGTH];
       ...
       tmp_buf[0]=s->client_version>>8;
       tmp_buf[1]=s->client_version&0xff;
       if (RAND_bytes(&(tmp_buf[2]),sizeof tmp_buf-2) <= 0)
           goto err;

OpenSSL relies on its own RAND_bytes function to generate a stream of 46 bytes if it's using its provided engine. If best practices are followed correctly, the library used (or the entity that implemented SSL/TLS) is to delete the pre-master secret from memory as soon as the master secret is calculated. By the time this occurs, the pre-master secret has been encrypted using the server's public key and sent out on the wire, which means at this point only the server can decrypt the contents of the Client Key Exchange using its private key due to the properties of asymmetric encryption.
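For readers who don't read C, here is a minimal Python sketch of the same layout: two version bytes followed by 46 random bytes. The function name `make_rsa_pre_master_secret` is invented for this example, and `secrets.token_bytes` simply stands in for OpenSSL's RAND_bytes. The RSA encryption step with the server's public key is deliberately left out, since the Python standard library has no RSA implementation.

```python
import secrets

PRE_MASTER_SECRET_LENGTH = 48  # 2 version bytes + 46 random bytes


def make_rsa_pre_master_secret(major: int, minor: int) -> bytes:
    """Build an RSA pre-master secret for the given client version.

    Hypothetical helper for illustration. The version is the one the
    client offered in its Client Hello, e.g. (3, 1) for TLS 1.0 or
    (3, 3) for TLS 1.2.
    """
    version = bytes([major, minor])
    random_part = secrets.token_bytes(PRE_MASTER_SECRET_LENGTH - len(version))
    return version + random_part


pre_master_secret = make_rsa_pre_master_secret(3, 3)
print(len(pre_master_secret), pre_master_secret[:2].hex())
```

Because the 46 random bytes never appear on the wire in the clear, a capture alone is not enough to reconstruct this value.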

If the client sent a certificate to the server with a digitalSignature bit enabled (which is required for all cases except Fixed Diffie-Hellman), it will also send a Certificate Verify message, which is a signed hash of all handshake messages exchanged between the client and server so far, starting with the Client Hello and up to but not including the Certificate Verify message itself. A Change Cipher Spec message follows from the client, as well as a Finished message that acts as a test of the negotiated master secret determined during the handshake. Much like the Certificate Verify message, this is based on a hash of all handshake messages exchanged between the client and server, excluding the current message and the required prior Change Cipher Spec message, which is run through the pseudo-random function defined by the cipher using the negotiated master secret. The client or server may begin sending encrypted data back and forth once they have received and validated the Finished message from the other endpoint.
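To make the Finished check a little more concrete, here is a hedged Python sketch of how its verify_data could be computed. It assumes the TLS 1.2 rules from RFC 5246 (a PRF built on HMAC-SHA256, a SHA-256 hash of the handshake messages, and a 12-byte result); TLS 1.0 uses a different MD5/SHA-1 construction. The function names and the placeholder inputs are invented for this example.

```python
import hashlib
import hmac


def prf_sha256(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 style PRF: P_SHA256(secret, label + seed), truncated to length."""
    full_seed = label + seed
    output = b""
    a = full_seed  # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        output += hmac.new(secret, a + full_seed, hashlib.sha256).digest()
    return output[:length]


def finished_verify_data(master_secret: bytes, handshake_messages: bytes,
                         label: bytes = b"client finished") -> bytes:
    """Compute the 12-byte verify_data carried in a Finished message."""
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    return prf_sha256(master_secret, label, transcript_hash, 12)


# Placeholder values only; a real transcript is the concatenated handshake messages.
example_master_secret = bytes(48)
example_transcript = b"ClientHello...ServerHello...ClientKeyExchange"
print(finished_verify_data(example_master_secret, example_transcript).hex())
```

The server performs the same calculation with the label "server finished", so each side proves it arrived at the same master secret and saw the same handshake.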

The master secret is derived the same way for RSA and Diffie-Hellman key exchanges. The only difference between those two is how the pre-master secret is calculated. For all master secrets, as specified in the TLS 1.0 and 1.2 RFCs:

       master_secret = PRF(pre_master_secret, "master secret",
                           ClientHello.random + ServerHello.random)
                           [0..47];

The last argument is the concatenation of the random values that were included in the Client Hello and Server Hello messages, with the bracketed material stating that the master secret is the first 48 bytes of the PRF output. The PRF cited is outlined in the TLS 1.0 and TLS 1.2 RFCs.
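The formula can be sketched in a few lines of Python. This is a minimal illustration that assumes the TLS 1.2 PRF from RFC 5246, which is P_hash over HMAC-SHA256 for suites that don't specify otherwise; the TLS 1.0 PRF combines MD5 and SHA-1 and is not shown. The function names and the input values are placeholders invented for this example.

```python
import hashlib
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Data expansion function P_hash as described in RFC 5246, section 5."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hash_name).digest()
    return output[:length]


def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_hash(secret, label + seed)."""
    return p_hash(secret, label + seed, length)


def derive_master_secret(pre_master_secret: bytes, client_random: bytes,
                         server_random: bytes) -> bytes:
    """master_secret = PRF(pre_master_secret, "master secret", randoms)[0..47]."""
    return tls12_prf(pre_master_secret, b"master secret",
                     client_random + server_random, 48)


# Placeholder inputs: a real pre-master secret and the two 32-byte Hello randoms go here.
pms = bytes([3, 3]) + bytes(46)
client_random = bytes(range(32))
server_random = bytes(range(32, 64))
print(derive_master_secret(pms, client_random, server_random).hex())
```

Both Hello randoms are visible in a capture, so the pre-master secret is the only input to this calculation that an observer is missing.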

With this new information, let's revisit the earlier questions:
  • Are there any key exchange protocols that use the client certificate as part of the key generation material?
Yes, but only one type: Fixed Diffie-Hellman. However, a client certificate is not a required part of the key exchange (the client will send its public value in a Client Key Exchange message if it does not have a certificate), so the server may not even ask for it depending on the implementation.
  • For plain RSA, what's the process used to generate the pre-master secret? 
This is dependent upon the application itself, and what library it wants to use. It could rely on an operating system provided library, a third-party library it includes in the application, or it could attempt to implement SSL/TLS itself. Time has shown that mistakes are common in the field of cryptography, from researchers finding flaws in ciphers considered cryptographically secure when they were drafted, to over-confident software engineers implementing a layer of cryptography in their application.

This is why you must have the server certificate private key, along with the entire handshake in a capture, to decrypt a particular SSL/TLS stream if the session has already completed. A forthcoming article will discuss how to pull a master secret from memory and allow you to decrypt the stream before the session terminates.


More information about the above topics can be found through the associated links contained within the article. In addition, the following documents were referenced:
  • RFC2246 - TLS Protocol - Version 1.0 
  • RFC5246 - TLS Protocol - Version 1.2 
  • OpenSSL 1.0.1c source