0:00

So, let's talk a little bit about how hard it is to compute a Nash equilibrium in a normal-form game. Let's start with a little history. John von Neumann, one of the founders of modern game theory, proved the existence of equilibrium in zero-sum games when he investigated them, and he used what's known as the Brouwer fixed-point theorem for that. And that led directly to algorithms for computing such equilibria via linear programming. First there was Dantzig's simplex algorithm, which is tightly related to what in modern terms is called LP duality; it is an exponential procedure in the worst case, although in practice it is used widely. Also of note is Khachiyan's polynomial-time approach to solving linear programs (the ellipsoid method), although in practice it is not used as widely. And when you go beyond zero-sum games,

so when John Nash proved the existence of equilibrium for general-sum games, he used the same fixed-point theorem of Brouwer. And that also informed a series of algorithms, and they're noted there on the slide. We will be looking at two of them: the Lemke-Howson algorithm, and a much more recent algorithm due to Ryan Porter and others. I will note that all of these are exponential in the worst case, and I'll get back to that later.

So, let's start with the Lemke-Howson algorithm. And let's start with a formulation of the Nash equilibrium for 2-player games. It looks as follows. This is a busy slide, but I'll walk you through it, and all will become clear.

At heart are two sets of variables, the s's and the r's. The s's capture the mixed strategies used by the two players, player 1 and player 2. s2k, for example, is the probability with which player 2 plays action k in its mixed strategy. So, the s1's and the s2's capture the mixed strategies of the two players, player 1 and player 2. The r's are what are called the slack variables.

And to understand their role, let's look at, for example, this equation right there. It applies to any action of player 1. For any action of player 1, we look at the value that it gives with respect to the strategy of the other player. So, we look at all the actions available to player 2, and at the payoff to player 1, given that he is playing a particular action j, against each action of player 2.

Â 5:27

So, we're going to add this slack variable, as it's called, which says: this is how much player 1 is missing relative to their best response when they're playing action j. Those are the slack variables. And so now, that will also give us a sense for this condition here. The slack variables are always non-negative, and in a Nash equilibrium they will be exactly zero, except for strategies that are actually played with zero probability by the player.
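To make that concrete, here is a minimal numerical sketch. The game and the strategy profile below are my own illustrative choices, not taken from the slide: in a Prisoner's-Dilemma-style game where player 2 defects for sure, the slack of player 1's "cooperate" action is positive, while the slack of the action actually played is zero.

```python
import numpy as np

# Illustrative payoff matrix for player 1 (rows: Cooperate, Defect;
# columns: player 2's Cooperate, Defect). Hypothetical numbers.
u1 = np.array([[-1.0, -3.0],
               [ 0.0, -2.0]])
s2 = np.array([0.0, 1.0])    # player 2's mixed strategy: defect for sure
s1 = np.array([0.0, 1.0])    # player 1's mixed strategy: defect for sure

values = u1 @ s2             # expected payoff of each action of player 1
r1 = values.max() - values   # slack: shortfall relative to the best response
print(r1)                    # [1. 0.]  -- Cooperate falls short by 1
print(s1 * r1)               # [0. 0.]  -- the complementarity condition holds
```

The unplayed action may have positive slack, but since its probability is zero the product is still zero, which is exactly the condition discussed next.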

So, now let's talk about the s's. The s's, as we said, capture the weight that each player gives to each of their actions in the mixed strategy they play. And so, when player 1 plays action j with non-zero probability, it had better be the case that they are best responding, namely that the slack variable is zero. And when they play it with zero probability, you don't care what the slack variable is, because they're not playing that action at all. You capture that by requiring that the product of the two be zero. This is exactly the complementarity condition, and this is what makes it a linear complementarity problem.

So, I hope that's clear, and you can see now that it works similarly for player 2. For player 2, we take each of their actions and say: if they're going to play it with some probability, then we look at their payoff given whatever mixed strategy player 1 is going to play, we look at the slack variable here, and again we require that the product be zero. In other words, the probability that they play the action is nonzero just in case the slack variable is zero, okay? So this is the nature of this mathematical program. And, of course, I forgot to mention: we want the s's to be a probability distribution, that is, they sum to one and they are all non-negative. Alright. So, this is our linear complementarity program. And now come Lemke and Howson, who suggest that we

find a solution in a particular way. We won't go over it in full, but the flavor of it is to initialize the s's and the r's in a particular way, in fact, to artificially initialize them all to zero, and then take them one by one in what's called a pivoting procedure: take an s or an r in turn, alternating between the two, taking it out of the set that currently carries a value and replacing it with its complementary variable. If it's an r, replace it with an s; and if it's an s, replace it with an r, until you reach an equilibrium. That's the general flavor of it, and in this lecture we won't go into more detail on the Lemke-Howson procedure. But it is a procedure that looks very deeply at what a Nash equilibrium is, sets it up as a mathematical program, and then searches the space of variables in an informed way.

Let's now look at a very different procedure, one that doesn't look in as much detail at the structure of equilibria, but compensates by performing heuristic search in a certain way. So, let's look at it, and we'll look at it in two stages. The first step is to note that when you

fix the supports of a strategy profile, finding out whether there is a Nash equilibrium with those supports is an easy problem. Remember that the support of a mixed strategy consists of all the actions to which the player gives nonzero probability in that strategy.

So, let's look at this formulation, and this will be limited to two players. Let's look at each player in turn, for example player 1, and at every action of that player, for example a_i. We'll be looking for a mixed-strategy profile p, one mixed strategy for each of the players, that gives us a Nash equilibrium, namely one in which each player is best responding. And so, for all actions in the support that we're considering, we want the agent to be best responding. So, let's say the best-response value is v_i; just call it that number. Then we want a_i to in fact be a best response, and we want all other actions, including the actions not in the support, to give a value that's no greater than the best-response value. And we want this for each of the two players and each of their actions in the support, so that makes sense. And we want the p's to be probabilities, so we want the probabilities in the support to be nonzero, we want the probabilities outside the support to be zero, and we want each player's probabilities to indeed form a probability distribution.

This all makes sense. So, this is a linear program. It can be solved in polynomial time; that is, theoretically there is a polynomial-time procedure. In fact, the procedures used in practice are not polynomial in the worst case, but they are nonetheless effective.
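As a sketch of this first step, here is a minimal implementation of my own, not the formulation from the slide: it assumes the two supports have equal size (which holds for nondegenerate games) and solves the indifference conditions directly as a square linear system rather than calling an LP solver.

```python
import numpy as np

def solve_indifference(M, rows, cols):
    """Find a mix x over `cols` (plus a value v) such that
    sum_c x[c] * M[r, c] == v for every r in `rows`, with x summing to 1.
    Assumes len(rows) == len(cols). Returns (x, v), or None on failure."""
    k = len(cols)
    A = np.zeros((k + 1, k + 1))
    b = np.zeros(k + 1)
    for i, r in enumerate(rows):
        A[i, :k] = M[r, cols]      # payoffs of action r against each col...
        A[i, k] = -1.0             # ...minus the common value v, equals 0
    A[k, :k] = 1.0                 # the mix must sum to one
    b[k] = 1.0
    try:
        sol = np.linalg.solve(A, b)
    except np.linalg.LinAlgError:
        return None
    x, v = sol[:k], sol[k]
    return None if np.any(x < -1e-9) else (x, v)

def nash_on_support(u1, u2, S1, S2):
    """Is there a Nash equilibrium with supports exactly S1, S2?
    Returns the profile (p, q) if so, else None."""
    u1, u2 = np.asarray(u1, float), np.asarray(u2, float)
    res_q = solve_indifference(u1, S1, S2)    # q makes player 1 indifferent
    res_p = solve_indifference(u2.T, S2, S1)  # p makes player 2 indifferent
    if res_q is None or res_p is None:
        return None
    q, v1 = res_q
    p, v2 = res_p
    full_p = np.zeros(u1.shape[0]); full_p[list(S1)] = p
    full_q = np.zeros(u1.shape[1]); full_q[list(S2)] = q
    # No action outside the support may be a profitable deviation.
    if np.any(u1 @ full_q > v1 + 1e-9) or np.any(u2.T @ full_p > v2 + 1e-9):
        return None
    return full_p, full_q

# Matching pennies: the unique equilibrium mixes 50/50 for both players.
p, q = nash_on_support([[1, -1], [-1, 1]], [[-1, 1], [1, -1]], [0, 1], [0, 1])
print(p, q)   # [0.5 0.5] [0.5 0.5]
```

The general formulation in the lecture handles unequal supports as well, via linear feasibility rather than a square system; this sketch only illustrates why, once the supports are fixed, the problem becomes easy.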

By the way, notice that we actually did use the assumption that we're fixing the support here. Superficially, you might look at it and say: oh, I could do the same thing but simply drop the support part. Where are we really using it? Well, we're using it in the assumption that for the actions inside the support, where we require best responding, we're actually playing these p_i's with positive probability. Because if we're playing a strategy with zero probability, it in fact doesn't matter whether it is a best response or not. And so, this is where the assumption is hidden. So, now we know that when we fix the

Â support we can solve the question efficiently.

The problem is the fact that there's an exponential number of supports to explore.

And this is the second part: we now need to start exploring the sets of supports. I won't go into detail about how we do it, but the basic idea is the following. We will bias the search toward supports that are close in size to one another; that is, we will not start by considering one agent randomizing over only two strategies while the other agent randomizes over 17 strategies. We'll look at strategy profiles whose supports are similar in size. We also start with small supports and gradually move to larger supports.

If we do that, and we use another trick called conditional dominance, that is, we look for certain things we can ignore along the way, then it turns out that although the procedure is exponential in the worst case, as is Lemke-Howson, in fact it performs quite well in practice. And in fact better than essentially all other procedures tried, including Lemke-Howson. These procedures do have an exponential worst case, and so the question is: can we do better?

Â Are there procedures that are less than exponential in the worst case?

And that takes us from the realm of concrete algorithms to the realm of complexity analysis. So, let's first remind ourselves a little bit about what complexity analysis looks like.

We're looking at whole classes of problems, such as the class of all games and the problem of determining a sample Nash equilibrium of those games, and we're asking how hard that class is as a whole. And so, here is a small part of the complexity hierarchy. The class P, as it's known, is the class of problems for which a polynomial-time solution is known. The class NP is the class of problems for which a solution can be verified in polynomial time, but not necessarily found in polynomial time. The NP-complete problems are the hardest among all problems in NP, that is, the problems to which all NP problems can be reduced. And perhaps the biggest open problem in theoretical computer science is the question of whether NP is different from P; it is widely believed to be, but that has not been proved. So, this is the general background we need to keep in mind.

And now we can ask: where does the problem of finding a Nash equilibrium reside, in P, in NP? What can we say? Well, first of all, strictly speaking, we can't quite speak about it being in P or NP, because we know from Nash's theorem that a Nash equilibrium always exists. So, the question "does there exist a Nash equilibrium?" is trivial; the answer is yes. So, we need to look at it a little differently. One way to look at it differently is to ask for Nash equilibria with specific kinds of properties.

So, for example, we can ask: is there a unique Nash equilibrium? Or does there exist an equilibrium that's strictly Pareto efficient? Or is there a Nash equilibrium that guarantees a given player some minimum payoff? Or can we guarantee some minimum social welfare in a Nash equilibrium? Is there a Nash equilibrium that includes some particular action of a player in its support, or, conversely, one that excludes it? All of these, and others, are examples of questions that are provably NP-hard.

Okay. So, that tells us something about the hardness. But we can still ask about just finding a sample Nash equilibrium: how hard is that? We've seen the algorithms; people have tried very hard to find algorithms for computing a sample Nash equilibrium, and it does seem hard. The question is: can we somehow capture

that formally within the complexity hierarchy? For that, we need to introduce some new concepts. The essential concept is that of a new class of problems called PPAD, for Polynomial Parity Arguments on Directed graphs, introduced by Christos Papadimitriou in 1994. We won't go into detail, but just so you know the chronology: PPAD is a specialization of a class called TFNP, which is in turn a specialization of a class called FNP.

Going into detail here is beyond the scope of what we want to talk about. But it does help us now position the complexity of finding a sample Nash equilibrium within the complexity hierarchy.

And again, we have the class of polynomial-time problems, and the class of problems whose solutions can be verified in polynomial time, with the NP-complete problems being the hardest among them. And given that, PPAD turns out to reside somewhere within this picture. Now again, we don't know whether this entire hierarchy collapses and all of these become one and the same; it is widely believed that it does not, but a proof doesn't exist. However, we do know that PPAD lies someplace in between P and NP.

Now, what does that have to do with the problem of computing a Nash equilibrium? Well, that's where the following theorems come in. Originally, it was shown that the problem of computing a Nash equilibrium is complete for this class PPAD; that is, it's the hardest among all problems in that class. This was initially proved for four players, then for games with three or more players, and then, finally, in '06, for all games. And so, we widely believe that the problem is not polynomial; we cannot prove it, but we do know where it resides within the complexity hierarchy that we are familiar with.
