Let us now speak about backward induction. Backward induction can be viewed as a way of computing the subgame perfect equilibrium of a game. It's a procedure that is used widely, or variants of it are, in game-playing programs, be it chess or others. How does it work? This is a busy slide, so don't be daunted; we'll explain it leisurely, and although some of you may not have seen algorithms before, we'll explain this one in very plain terms. Before we do, let's first give the intuition, which is very straightforward. What we are trying to do is associate a value of the game not only with the leaf nodes. With leaf nodes we know what the value is: it is simply the payoff vector associated with the leaf node, which is part of the game definition. But suppose we wanted to go to the root node, or any other internal node, and ask: what really is the payoff associated here, assuming that agents will play a subgame perfect equilibrium? That is the goal of this procedure called backward induction. The idea is very straightforward: we go to the leaves and back up slowly, assuming at every step of the way that the agent moving at each node will take the action that maximizes their payoff at that node. So that's the intuition; now let's see how it's done formally. The procedure is called backward induction, and it takes one argument: a node in the tree, any node. It could be the root, it could be a leaf, or anything in between. And, of course, every node has some player associated with it; to anticipate what we'll encounter shortly, ρ(h) will be the player associated with node h. When the procedure returns, it will give us the payoffs to all of the agents, associated with that node. So how does it work? Again, remembering our intuition, we say the following.
If h is a leaf node (Z denotes the set of leaf nodes), then we simply return the payoff vector as defined by the game. That's where the recursion bottoms out. Most of the work, of course, happens in the recursive step, when we're not at a leaf node. For that we do the following. We keep a variable called best_util, and best_util is a vector of payoffs, one entry associated with each agent, and we update that vector as we go along. To start with, we assume the payoffs are all terrible; call that minus infinity, a payoff smaller than all possible payoffs in the game. Then we do the following. We look at all the actions available at that node: χ(h) is the set of all actions available at node h, and a would be an example of such an action. We take each action in turn, one at a time, and do the following. We look at the child reached by taking action a at node h; that's called σ(h, a). So σ(h, a) is simply another node, one of the children of node h, the one you arrive at by taking that particular action a. We recursively compute the payoff vector associated with that child and keep it in a variable called util_at_child. So notice that we have two vectors: best_util and util_at_child. best_util is the best we have found so far, best for a particular agent, and util_at_child is what we found at a particular child. We go over all the children one at a time, and if ever util_at_child is better for the agent than best_util so far, we update best_util. That's what's going on here. util_at_child is a vector, so we look at the element of that vector corresponding to the agent we care about, the agent moving at node h. If util_at_child is better for that agent than best_util, the best found so far, we simply update it.
That is, we update best_util to be util_at_child; otherwise we leave it unchanged. In this way we cycle through all the children, finding which of all the vectors is best from the point of view of the agent moving at that node. Again, the intuition is that the agent will take the action leading to that child, and we update the vector accordingly. That's why, when we're done, we return the best we found so far, best_util. That is the backward induction procedure. Notice that we don't have to return a strategy; we just return the list of payoffs. In some sense you can think of it as simply extending the payoffs from the leaf nodes to all the internal nodes. But even though we don't explicitly return the equilibrium strategy, the one that is subgame perfect, it's easy to read it off those numbers: at every node, the agent will take the action that leads to the child node with the best utility from their point of view. So that is a procedure for computing subgame perfect equilibria: backward induction. If we look at the special case of zero-sum games, it simplifies a little. There are only two players, and the payoff to one is minus the payoff to the other, so we really only need to keep track of one number associated with each node, and there is less bookkeeping to be done. Furthermore, in such zero-sum games (and all win-lose games are zero-sum games, for example chess) there is a way to speed up the backward induction procedure. By the way, in zero-sum games we simply call it the minimax procedure, because we alternate between minimizing and maximizing the value.
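The procedure just described can be sketched in code. Here is a minimal Python version, assuming a simple tree representation (the class names and fields are illustrative, not from the lecture): a leaf carries a payoff vector, and an internal node carries the index of the acting player together with a mapping from actions to child nodes.

```python
NEG_INF = float("-inf")

class Leaf:
    def __init__(self, payoffs):
        self.payoffs = payoffs          # tuple: one payoff per player

class Node:
    def __init__(self, player, children):
        self.player = player            # index of the player who moves here
        self.children = children        # dict: action -> child node

def num_players(h):
    # Walk down to any leaf to learn the length of the payoff vector.
    while not isinstance(h, Leaf):
        h = next(iter(h.children.values()))
    return len(h.payoffs)

def backward_induction(h):
    """Return the payoff vector at node h under subgame-perfect play."""
    if isinstance(h, Leaf):             # base case: recursion bottoms out
        return h.payoffs
    best_util = (NEG_INF,) * num_players(h)
    for a, child in h.children.items():         # try each available action
        util_at_child = backward_induction(child)
        # Keep the child's whole vector if it is better for the acting player.
        if util_at_child[h.player] > best_util[h.player]:
            best_util = util_at_child
    return best_util
```

For example, if player 0 at the root chooses between a leaf paying (2, 1) and a subtree where player 1 would pick the leaf paying (0, 3), the procedure backs up (2, 1) to the root, and the equilibrium strategy can be read off by noting which child supplied each node's vector.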
One player wants to minimize it, the other to maximize it. And in fact there is a way to speed up the procedure. We won't go into it here, but the intuition is that as you're visiting the children of a given node, you may find that there is no need to explore the remaining children of that node, as we did in the backward induction procedure, because intuitively it won't matter: you've already found a value which means that the node you are examining, the current node, will never be visited. This is called the alpha-beta pruning procedure, an optimization of the backward induction, or minimax, procedure for zero-sum games, and you are invited to explore it elsewhere. There is one more thing I want to say in connection with backward induction, and in fact with subgame perfection: two different points, both keyed off the same famous example, the centipede game. In this well-known example you have two players who alternate turns: player 1 moves, then player 2, then player 1 again, then player 2, and so on. The payoffs are constructed in a contrived way so that they gradually increase. You can imagine why it's called the centipede: rather than having only five legs, as here, you could have a hundred. The payoffs slowly rise, so that the payoffs early in the game are much smaller than the payoffs later, and very much so if you keep going. But even though they rise, they are contrived in a way that leads to only one subgame perfect equilibrium: players will defect, that is, go down, at every node. So the only subgame perfect outcome is the one where the first player defects immediately, going down right at the start.
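The lecture leaves alpha-beta pruning for you to explore elsewhere, but for the curious, here is a minimal sketch of the idea (the representation is assumed for illustration): each node is either a number, the leaf value for the maximizing player, or a list of child nodes, with the two players strictly alternating. The window [alpha, beta] tracks the best guarantees found so far for the maximizer and minimizer; once it closes, the remaining children cannot affect the result and are skipped.

```python
def alphabeta(node, maximizing=True, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):      # leaf: value already known
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:               # the minimizer already has a
                break                       # better option elsewhere: prune
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:               # symmetric cut for the minimizer
                break
        return value
```

On the tree [[3, 5], [2, 9]], after the first subtree establishes a guaranteed 3 for the maximizer, the leaf 9 in the second subtree is never examined: seeing the 2 is enough to know the minimizer would hold that branch to at most 2.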
This is of course reminiscent of the prisoner's dilemma, and it is a little counterintuitive, because had the players only had the good sense to keep going, they would have gotten to something in the ballpark of the later payoffs, which are much better for both than the opening one. But nonetheless, when you examine the game you see that there is only one subgame perfect equilibrium here, and in fact only one equilibrium outcome, namely this one. You can see it by doing the backward induction procedure again. If the game progressed and in fact reached the last node, what would player 1 do? Well, they would go down, getting a 4 rather than a 3. Player 2 knows this, so knowing that player 1 would go down, he'd rather go down himself, because he would get a 4 rather than a 3. Similarly one step earlier: player 1, knowing that player 2 would go down, elects to go down, because they will get 3 rather than 2, and so on. This is really the backward induction argument. So on the one hand we have a clear account of what will happen in this game, but there are two problems. One of them is simply experimental and common-sense; the other is more theoretical. On the pragmatic level, common sense simply tells you this is not what's going to happen. The players will cooperate for a while, until at some point somebody does go down. They know there is so much to gain by going forward that they will, if you wish, take the chance. And this intuition is borne out repeatedly in experiments: people do cooperate for a while, until eventually they defect. So that's a problem for the theory. But there is another problem that is theoretical in nature. We know that the only subgame perfect equilibrium is one where agents defect, go down, at every point in time. Now imagine that the game starts, and player 1 goes across, does not go down. What does player 2 do? Well, on the one hand, you could say:
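The backward induction argument above can be checked mechanically. In the sketch below, the stage payoffs are the standard textbook ones and are assumed, since the slide itself is not reproduced here; they do match the 4-versus-3 and 3-versus-2 comparisons in the argument. At stage i the mover either goes down, ending the game with down_payoffs[i], or across to the next stage; going across at the last stage yields final_payoffs.

```python
# (player 1, player 2) payoffs for going down at each stage; players
# alternate, so player 1 (index 0) moves at stages 0, 2, 4.
down_payoffs = [(1, 0), (0, 2), (3, 1), (2, 4), (4, 3)]
final_payoffs = (3, 5)              # reached if everyone goes across

def centipede_spe():
    """Back up the payoff vector reached under subgame-perfect play."""
    util = final_payoffs            # value if play continues to the end
    for i in reversed(range(len(down_payoffs))):
        mover = i % 2               # players alternate turns
        down = down_payoffs[i]
        # The mover goes down iff that pays them more than continuing.
        util = down if down[mover] > util[mover] else util
    return util
```

Running this reproduces the argument step by step: at the last stage player 1 takes 4 over 3, so player 2 takes 4 over 3 one stage earlier, and so on back to the root, where player 1 defects immediately for the payoff (1, 0).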
In the only subgame perfect equilibrium they go down, so they should go down, because the backward induction argument tells them that the best thing for them is to go down. But that same argument told player 2 that player 1 would have gone down right off the bat, and they didn't. So maybe they won't go down later either. But what will they do? Fundamentally, what happens here is that you have an event, going across, that the standard theory tells you will happen with zero probability. How do you condition on an event that had zero prior probability? There is a big literature on this; it is a very interesting, deep issue in game theory. We will not delve into it any deeper here, but it's interesting to know.