0:00

In this segment I wanna show you how RSA is used in practice and in particular, I

Â wanna tell you about a standard called a Public Key Cryptography standard number

Â one PKCS one. So I've already told you a couple of times that you should never use

Â what's called textbook RSA. You should never directly encrypt the message using

Â RSA because that's insecure. You always have to do something to the message before

Â you actually apply the RSA function. We saw the ISO standard in the previous

Â segment where what we did is we generated a random x, encrypted x using RSA, and

Â then from this x we derived a symmetric encryption key. But that's actually not how

Â RSA is used in practice. In practice things worked a little differently and

Â typically what happens is the system generates a symmetric encryption key and

Â then RSA is asked to encrypt the given symmetric encryption key rather than

Â generating the symmetric key as part of [our same encryption or RSA encryption?]. So in practice as I

Â say, what happens the RSA system is given this input a symmetric encryption key to

Â encrypt for example this could be a, an AES key that would be like 128 bits and then

Â the way this key is actually encrypted is first we take these 128 bits and we expand

Â them into the full modulo size. For example, this would be 2048 bits, and then

Â we apply the RSA function. And so the question is, how should this preprocessing

Â be done? How should we expand the 128 bit AES key that was given to us into a 2048

Â bit value that can then be applied with RSA. And then furthermore the question is,

Â how do we argue that the resulting system is secure? So in all the way of doing it

Â which is actually very widely deployed in practice is what's called PKCS1 version

Â 1.5, Public Key Cryptography Standard, that's what PKCS stands for. So I wanna

Â show you how this mechanism works and in particular, I'll show you what's called

Â PKCS1 Mode 2. Mode 2 denotes encryption, mode 1 denotes signatures so

Â here we're gonna just focus on the encryption. So the way PKCS1 works is

Â as follows, you take your message, this would be the 128 bit AES key for example,

Â and you put it as a least significant bit of the value that you're creating. The

Â next thing you do is you append sixteen bits of one to it, you know [foreign].

Â This is sixteen bits of one. And then the next you do is you append the random pad

Â that actually does not contain FF anywhere inside the random pad. So this

Â is basically something like on the order of what is it, 1900 random bits except

Â that there are on FF's inside those random bits and finally at the very top, you put

Â the number 02. This indicates that this plain text has been encoded using PKCS1

Â mode 2. And then this whole value here, this whole thing that we just created.

Â This is just 2048 bit string is the thing that gets fed into the RSA function and is

Â raised to the power of e modulo N. And the resulting thing is PKCS1 ciphertext. Now

Â the decrypter of course is going to invert the RSA function, He's gonna get back this

Â block, He's gonna look at the most significant bits and he's gonna say

Â there's a 02 in there that means that this PKCS one formatted. Since it's PKCS one

Â formatted, it's gonna remove those 02. It's gonna remove all the random pad up

Â until the FF. And then anything that stays beyond that is the original

Â message which is then used you know to say decrypt the actual body of the ciphertext.

Â And as I said, this mechanism is extremely widely deployed, For example, it's used in

Â HTTPS. Now interestingly, this PKCS1 version 1.5 was designed in the late 80's.

Â There was really no security proof to argue that this mechanism is in fact

Â secure. And as it turns out, it is very common when there is no security proof, it

Â turns out that actually things break and there's a very, very elegant attack due to

Â Bleichenbacher back in 1998, Daniel Bleichenbacher which shows how to attack

Â PKCS1 when it's used for example inside of HTTPS. So let's see how the attack works.

Â So the idea is this, supposed the attacker intercepted a certain ciphertext so this

Â is PKCS1 ciphertext so the point is it's encoded using PKCS1 and then the

Â result is fed into the RSA function. And we call the ciphertext the output of the

Â RSA function. The attacker wants to basically decrypt the ciphertext, So let

Â me show you abstractly what the attacker is gonna do. We're gonna just simplify SSL

Â and just say that the attacker can basically send the ciphertext directly to

Â the web server. The web server is gonna try and decrypt the ciphertext using its

Â secret key. And then what is it gonna do? Well, the first thing it does after the

Â encryption, well, it's gonna ask: is the decryption of the ciphertext PKCS1

Â encoded? In other words, it's gonna look at the most significant bits and ask: is

Â this 02 in the most significant positions? If they are, it's gonna continue properly

Â decrypting and then continue with the protocol and if there is no 02 in those

Â positions, it's gonna announce an error. So if the most significant bits of the

Â plaintext are 02, it's gonna continue with the protocol as expected, if the most

Â significant bits are not 02 is gonna send back an error message and tell the

Â attacker, hey, you sent in invalid ciphertext. Now the amazing thing is that,

Â this lets the attacker test whether the sixteen most significant bits of the

Â decryption of the given ciphertext are 02 or not. In other words, the attacker can

Â submit whatever ciphertext he wants to the web server. The web server will invert the

Â RSA function and then tell the attacker whether the inversion started with 02 or

Â not. So in some sense gives the attacker an oracle that lets him test whether the

Â inversion of any ciphertext begins with 02 or not. And it turns out that's actually

Â enough to completely decrypt any ciphertext the attacker wants, Just add

Â little bit of information leakage just by properties of RSA lest the attacker

Â completely decrypt a given ciphertext. Also here's what the attacker is gonna do,

Â well, he has a ciphertext that he wants to decrypt so what he'll do is he'll take a

Â cyphertext and of course feed that directly into the web server and ask does

Â it begin with the 02 or not. The next thing he's gonna do is he's gonna chose a

Â random value r and he's gonna build a new ciphertext c prime. Which is (r to the e)

Â c modulo N. Now what does that do? If we pull the r into the RSA function, really

Â what we just did is we multiply the RSA plaintext, you know the PKCS1 including

Â in m, we multiply that by r and that, that whole thing gets raised to the power

Â of e. Okay. So that's the effect of multiplying c by (r to the e). It multiplies

Â the plaintext by r, a value that the attacker controls. And then he's gonna

Â send c prime to the web server and the web server is gonna say yes, this starts with

Â 02 or no, this doesn't start with 02. So now we can abstract this problem to

Â something more general which you can think of as the following so I have this number

Â x in my head. This is the number I'm trying to get The PKCS1 encoding on m. I'm

Â thinking of this number x and then what I let you do is for r, which way r of your

Â choice I will tell you whether r times x mod n starts with 02 or not. And it turns out by

Â asking enough questions it turns out either by million questions of this type,

Â you can basically recover all of x. Just by learning whether r times x begins with 02

Â or not by asking enough questions, you can actually recover x. So in reality what

Â this means is that the attacker can capture a given ciphertext. This maybe

Â corresponds to a ciphertext where the user enters the credit card number or a

Â password, and now the attacker wants to decrypt the ciphertext. What he would do

Â is he would send about a million ciphertext like this, the web server for

Â each ciphertext is gonna respond whether the plaintext begins with 02 or not and at

Â the end of the attack, the attacker just blocks away with the, decryption of the

Â ciphertext c. So this might seem a little magical to you, how are you able to

Â just from learning whether the most significant bits are 02 or not, how you're

Â able to recover the entire plain text. So let me show you a simple example of this.

Â I'll call those bab y Bleichenbacher just to kinda get the idea across on how this

Â attack might work. So imagine that the attacker is able to send the ciphertext c,

Â the web server is gonna use the secret key to decrypt but let's suppose that instead

Â of checking for 02 or not, all the web server does is he asked is the most

Â significant bit one or not. If the most significant bit is one, the web server

Â says yes. If the most significant bit is not one, the web server is no. Moreover,

Â to simplify things, let's assume that the RSA module is n is a power of two. So n =

Â two^n. Of course, this is not a valid RSA modulus. An RSA modulus used to be a

Â product of two distinct primes. But again, just to keep things simple, let's pretend

Â for a second that n is actually a power of two. So now you realize that by sending

Â the ciphertext c over to the web server, the adversary just learn the most

Â significant bits of the plaintext x. Basically, the servers behavior completely

Â reveals what the most significant is. Now what the attacker can do is he can

Â multiply the ciphertext by two to the e. Now multiplying by two to the e has the

Â effect of multiplying the plaintext x. By two simply multiplying x by two and

Â because we're working mod two to the N, multiplying by two basically means shift

Â left, okay. So now when we shift left in fact we get to learn the most significant

Â bits of 2x which is really the second most significant bit of x, okay. So, again the

Â most significant bit of 2x basically we shift x to the left. And we reduce modulo

Â n. So now, the most significant bit of 2x modulo n is basically the second most

Â significant bit of x. So now we learned another bit of x. And now we're gonna

Â repeat this again. We're gonna query a four to the e c, That corresponds to

Â querying of 4x to the power e. Querying of 4x basically reveals the most significant

Â bit of 4x mod n. 4x four corresponds the shifting by two bits. Which mean now we

Â get to learn the third most significant bit of x. And when we repeat this again,

Â and again, and again for different multip les of c and you can see that after just a

Â few queries, about a 1,000 or 2,000 queries, we'll recover all of x. The

Â reason Bleichenbacher needed about a million queries is because he wasn't

Â testing for one, He was actually testing for 02 or not 02 and that basically means

Â that he needs instead of 2,000 queries he needs actually a million queries but

Â that's still enough to recover all of the plaintext text. Okay? So I hope this is

Â explains why this attack is possible. Why just this little bit of information about

Â the most significant bits of the RSA inverse is enough to completely decrypt

Â RSA. So the bottom line here is that PKCS1 as implemented in web server as up

Â until this attack was discovered was basically insecure because the attacker

Â could essentially decrypt a ciphertext he wants just by issuing enough queries to

Â the web server. So how do we defend against this attack? Well, so the SSL

Â community basically wanted the minimum change in their code that would prevent

Â this attack from working and so if you look at the RFC, what they propose is the

Â following. Well, there's a lot of text here but basically what they propose is to

Â say that if after you apply the RSA decryption, you get a plaintext that's not

Â PKCS1 encoded. In other words it doesn't start with 02. What you're supposed to do

Â is basically choose some random string r. And just pretend that the plaintext is

Â just a random string r and continuous as nothing happened and of course the

Â protocol will fail later on. So to be concrete you see if the PKCS one encoding

Â is not correct, what you would do is you would just say the premaster secret is

Â this random string or that we just picked up off thin air and let's continue the

Â protocol and of course the session set up will fail because the client and the

Â server will end up agreeing on different keys and that will cause the session to

Â terminate. So we actually don't tell the attacker whether the plaintext begins with

Â 02 or not. All we do is just pretend that the plaintext is some random value. So

Â this was a minor code change to many web servers and it was fairly easy to get

Â deployed and in fact most web servers out there today implement this version of

Â PKCS1. However, this kind of raises the question of whether PKCS1 should be

Â changed all together so that we actually are able to prove chosen ciphertext

Â security. And that brings us to a different way of doing encryption using

Â RSA which is called Optimal Asymmetric Encryption Padding, OAEP. And in fact the

Â PKCS standard was updated and PKCS1 version 2.0 actually has support for OAEP which is

Â a better way of doing RSA encryption. So let me explain how OAEP works. So OAEP is

Â due to the Bellare and Rogaway back in 1994. And the way OAEP works is as follows. So you

Â take your message that you wanna encrypt this for example could be the 128 bits AES

Â key. And then the first thing you do is you append a short pad to it. So in this

Â case, you put a 01 in the beginning and then you add a whole bunch of zeroes. How

Â many zeroes is actually depends on the standard but you know that's supposed like

Â they're 128 zeroes in here. And then you also choose a random value so that this

Â whole string is as big as your RSA modulus so let's just say this is 2047 bits. Now

Â before you apply the RSA function, you first of all take the random value that

Â you chose and you feed it into the hash function. This hash function produces a

Â value that's as big as the left hand side of your encoding. You XOR the outputs.

Â You feed the result into another hash function which we call a g. You XOR

Â that with a random value. And then finally, you get these two values that you

Â concatenate together. Okay, So, you concatenate either left side and the

Â right side that gives you something that says is 2047 bits long and that's the thing

Â that you apply the RSA function to. And the result is the RSA encryption. Now there's

Â a theory that was proved in 2001, due to Fujisaki, Okamoto, Pointcheval, and Stern,

Â that shows that in fact if all you do is you, you assume that the RSA function is a

Â Trapdoor permutation, a secure Trapdoor permut ation, but in fact this mode of

Â encrypting using RSA is in fact chosen ciphertext secure but we have to

Â assume that the functions h and g are kind of ideal hash functions as I said these

Â are called random oracles. Basically, we assume that h and g are just random

Â functions from their domains to their range. In practice of course when OAEP is

Â implemented, people just use SHA-256 say for h and g. Why is this called

Â Optimal Asymmetric Encryption Padding? Why is this o? Why does this stands for

Â optimal? Well, the reason is if you look at the ciphertext, you'll notice that the

Â ciphertext is basically as short as it can be. The ciphertext is exactly the length

Â of one RSA output, there are no trailing values that are appended to the ciphertext

Â whereas for example, the ISO standard if you remember even if you had to encrypt

Â a very short message, what you would have to do is you would have to encrypt x using

Â RSA and then append to that x, the encryption with the short message under

Â the symmetric encryption system. So even if you have just to encrypt a 128 bit AES

Â key, with the ISO standard you would get one RSA output plus a symmetric

Â cipher whereas with OAEP, you just get one RSA output and nothing else so in that

Â sense, it's optimal, Optimal in terms of the length of the ciphertext.

Â Interestingly, this theorem here really rely on properties of RSA. And in fact,

Â the theorem is known to be false if you use a general trapdoor permutation.

Â Some other permutation doesn't have the algebraic properties of RSA. So that left

Â this question of if we have a general trapdoor permutation, what is the correct

Â way to do OAEP? And it turns out, there's a minor modification to OAEP which makes

Â the result more general. This is a result due to Shoup back in 2001. Just shows that

Â if you give me a general trapdoor permutation f. It turns out if you replace

Â the fixed pad that's using OAEP by a hash function, this hash function w, which

Â happens to be a hash function of the message m and the randomness r that

Â you're encrypting with, and then during decryption, you actually check that this

Â hash function value is correct. So when you decrypt, you actually check the w of

Â mr is really what's written in this position in the plaintext. If you do that

Â then basically this OAEP called OAEP+. Is in fact CCS secure, Chosen Ciphertext

Â Secure for any trapdoor permutation. You don't have to rely on a specific

Â properties of RSA. There's another result called Simple Asymmetric Encryption

Â Padding, SAEP+ which basically says that if you are gonna rely on properties

Â of RSA, then in particular way with RSA when the public exponent is equal to

Â 3, it turns out you actually don't need the second stage of encryption. The

Â simple padding scheme here again using the function w is actually already enough to

Â guarantee chosen ciphertext security. So these are variants OAEP but they're not

Â really used. I just wanted to mention and to let you know they exist. These are not

Â really used mainly OAEP has been standardized is what's used. Although

Â again in reality, the most common application of RSA for public encryption

Â is in fact this PKCS1 that's standardized in the HTTPS RFC. So just to

Â make sure it is clear in your mind how decryption actually works, let me ask you,

Â how would you decrypt an SAEP ciphertext ct. So here, you're given the ciphertext

Â ct here and the question is which of these three methods is the correct way of

Â decrypting the ciphertext. So the correct answer is the first one and let's see why.

Â Well, given the ciphertext the first thing what we need to do is apply the RSA

Â inverse functions, the ciphertext and that will give us the RSA plaintext which

Â happens to be in this case x and r. So we get this x and r here. The next thing we

Â need to do is we need to hash r using the function h and XOR the result with x and

Â that will give us m and wm, r. And the last we need to do is to make sure that

Â the pad W(m,r) is correct so we check that in fact w is equal to W(m,r) and if so we

Â output m and if not, we output bottom to say that the ciphertext is invalid and

Â the decryption algorithm rejects it. And by the way I'd like to emphasize that this

Â checking of the pad during decryption is crucial in all of the schemes that we

Â just saw. So for example, in both OAEP+ and SAEP+, it's doing decryption.

Â It's very important to check that the pad is in fact correct so that the value of w

Â that you get here during the encryption really is the hash under the capital W of

Â m and r and similarly on OAEP, it's very important to check it during decryption.

Â In fact, the value of the pad is the constant 010000000. And of course if it

Â happens to be not 01000 then you output bottom and you say the ciphertext is

Â invalid. The last thing I wanna point out is that actually implementing OAEP can be

Â quite tricky and let's see why. So supposed you have, you write an OAEP

Â decryption routine that takes the ciphertext as input. The first thing you

Â would do is you would apply the RSA inverse function to the ciphertext and you

Â would say well, you expect to get an n bit value out, you know 2047 bits in the case

Â of 2048 bit RSA modulus and if you get something that's bigger than two to the

Â 2047, you say reject. We say error = one and we go to Exit. The next thing we're

Â gonna do is we're gonna check whether the pad is in fact the correct pad and again

Â if the pad is not the correct pad, then again, we're gonna reject and go to Exit.

Â The problem with this implementation is well by now I hope you kind of guessed it

Â is it's vulnerable to a timing attack, right. So essentially by leaking timing

Â information the attacker can now figure out what caused the error. Was it, that

Â was there an error the RSA decryption happened to be too big or was there an

Â error because the pad happened to be too large? And if the attacker can this

Â distinguish these two errors say by timing. Then it turns out similar to the

Â Bleichenbacher attack, it's possible to completely decrypt any ciphertext of your

Â choice. Just at very, very, very small leak of information would completely allow

Â the attacker to decrypt completely any ciphertext that he wants to. So this shows

Â that if you, even if you implement the mathematics of OAEP decryption correctly,

Â you can very easily mess up and open yourself up to a timing attack which would

Â make your implementation completely insecure. So as usual, the lesson is,

Â don't implement crypto yourself in particular, RSA OAEP is particularly

Â dangerous to implement yourself. Just use one of the standard libraries for example,

Â OpenSSL has an OAEP implementation and of course what they do is very careful to

Â make sure that the running time of OAEP decrypt is always the same no matter what

Â causes the error. Okay, So let's stop here and then in the next segment we'll talk

Â about the security of the RSA trapdoor permutation.

Â