[MUSIC] Hey, in today's material we're going to explore a radically new approach to producing high quality AI in our Hex game. And what's remarkable is that this was discovered around 2006 when investigating the game of Go, where randomized moves made on the Go board led to a much better way to explore how to produce a good move, in what was up to then a quite intractable problem. So we're going to learn how to use this Monte Carlo scheme and apply it to our own problem, our Hex game. Now, we'll also talk a fair amount today about how to make sure your code is correct, especially the use of assertions, static assertions, which are new to C++, and exception handling. Those will be our key programming points. Monte Carlo, again, we've seen in other areas. It's basically a technique for using probabilistic simulation to understand a phenomenon. A typical phenomenon that you might try to understand with probabilistic simulation on a computer would be something like weather prediction. And then assertions and exception handling: these are features that go back originally to the C programming language and then got added, in an enhanced way, into the standard library, especially the exception hierarchy. So, if you're familiar with classical AI, where you're playing a game like Hex, or a game like chess, or a game like Go: basically, this goes back to von Neumann and Morgenstern, and this is material from the 1940s. There was a sort of standard point of view; von Neumann was a great applied mathematician and Morgenstern was a mathematical economist. And they were looking at how you evaluate situations in order to make optimum moves. And this is the basis behind lots of decision making, so what we're talking about isn't just something applying to AI but to almost any area of decision making; even whole subjects like operations research critically study how you bring rationality to the boardroom. How would you run a company rationally? 
Anytime you have a decision, and especially a decision where there are multiple outcomes, some of the outcomes could be probabilistic. And in these classic points of view, one assumed a rational opponent who is capable of selecting a best move. And so you have a whole number of opportunities: you have to make a move from a large selection of moves. For example, in chess, a typical position has around 40 legal moves, and you have to use some criteria because, especially if you're a human, 40 multiplies exponentially quickly, and you would be in the many millions of possibilities after just a couple of moves. Your move and your opponent's reply is already 1,600 positions; so classically, two moves each, or 4 ply, and you're well over a million distinct possibilities. Now, grandmasters look at only a few of those, because realistically many of these moves are known to be irrelevant, and they use criteria, normally called plausible move criteria. Among plausible move criteria in chess would be examination of checks; there's the old adage, always try a check, it may be mate, and mate wins the game. So checks, captures, sometimes advancing pawns: these are criteria, and they may vary during the game. And similarly there are criteria in games like Hex. Think about Hex. In Hex, as we've already explained, good Hex players learn to make bridges; bridges are ways to jump across the board. And they also make blocking moves. A Hex player isn't likely to look at moves around the edge of the board, so many moves around the edge can be discarded. The moves that are rich in Hex are going to be in the middle of the board, or moves that plausibly extend your ability to link up; if you're a North-South player you're going to look for links North to South. So you decide on your plausible moves. And don't forget, if you're on a big Hex board, much like Go, that number is actually quite a bit larger than 40. 
So in our Hex game, the 11x11 board, we have, for quite a while, well over a hundred moves to consider at each ply. So at Hex, 2 ply is already 10,000 positions, and 4 ply is already a hundred million. So you can't routinely, in a minimax situation, pursue Hex to any great depth. You have to pare down your plausible moves very quickly in the normal lookahead: you have to select three or four plausible moves for yourself, and three or four plausible replies for your opponent, before you reach a position where you can apply, in our third point here, your static evaluation. So typically, in an AI situation, you decide on plausible moves, make your plausible moves, make your opponent's plausible moves, and keep making plausible moves as long as you have computing resources, until you get to a situation in which the evaluation is relatively clear; these are called the leaf nodes. Now, in the ideal, abstract situation, leaf nodes are going to be nodes in which we know who has won or lost, where we have a known final value. But in these very difficult games of interest to humans, we generally can't get down to the final situation of the board, to a known result. But we get to a result for which we have a very good guess, where what we call the nature of the position is static, and we can use some ability to calculate. Like with Hex: whoever currently has the longest potential path across the board is the likely winner. Whoever has the most men borne off in backgammon is the likely winner. Whoever has the most material in chess or checkers is the likely winner. Whoever dominates area in a Go game is the likely winner. So then we have a big tree, and nowadays those trees can have tens of thousands, even tens of millions of nodes, and we apply our alpha-beta to pick our moves. So we've seen that, and so far we've concentrated on what you might call the classical techniques in AI for building our game program. So, a little quiz. 
If, for example, you have 2 plausible moves and a tree of 20 ply. And don't forget, Hex is potentially something that's going to be 120 ply. Chess is going to be something like 100 ply. Go is going to be something on the order of 400 ply, since you have a 19 by 19 board. So for this kind of simplification, approximately how many leaf nodes are you going to have to evaluate if you're classically using minimax? Take a second and think about that. Now, the answer is approximately a million, because you're basically getting 2 to the 20th. And if you're a longtime computer scientist like me, everybody knows their powers of 2, and roughly speaking, as a rule of thumb, 2 to the tenth is 1,000, give or take 24. So you're having 1,000 squared: that's a million. So even where you have narrowed down your plausible moves to a fairly simple structure, you can still get a very large tree. So here are some of the problems with the classic lookahead model. Plausible moves can be unreliable. Indeed, in the early days of chess programs, where you had seven or eight plausible moves, they didn't play very well, and it was frequently the case that a move looked plausible, but insightful chess masters would figure out something that is what you might call eccentric, unseen. That's how they threw off their lesser opponents. So they might create a sacrifice, appearing to give up lots of material. Well, that wouldn't be a plausible move; sacrificing your queen is not a good idea, most of the time losing your queen loses the game for you. So if you were to have a fairly rigid plausible move generator, you might be missing critical moves. So indeed, it turns out that the later supercomputers like Deep Blue, the computer that defeated Garry Kasparov, actually had special chess hardware that examined, at least for a while, every legal move so that they wouldn't miss something, at least for a few levels. 
And that was one of the great improvements in programs like chess and checkers and Othello, where they discovered that an amazing amount of computation, even though it seemed like that wasn't the answer because the trees get too deep, was indeed extremely helpful. It was enough to uncover situations that the ordinary beginning player would miss. So even if they had poor evaluation functions, if they added enough exhaustive search, it was very helpful. So plausible moves can be unreliable. Next, some positions exhibit what's called the horizon effect: you can examine the board and not see that 40 or 50 moves later, the game will be won. The characteristic case in a chess game, and it probably exists as well in Go, is that there are certain pawn formations, and those pawn formations create a very static situation because they're hard to penetrate. And in those pawn formations, where everything is fairly static and certain things can't be changed, instead of saying I go here, I go there, you could say: if I march my king around that pawn chain, even if it takes me 25 moves, I can get to a place through the opposition. And don't worry about all this; if you're a chess player of some ability, then you'll know what I'm talking about. But this can be foreseen by expert players, because the expert player in chess or in Go knows that a certain situation is decisive independent of how far ahead you need to look to take advantage of it. And early chess programs frequently would stop and do a static evaluation, and the payoff would be below the horizon, where the chess master knew that this deep structural situation gave him a big edge and the program didn't know. So indeed, there were anti-program strategies. So even when the programs on average could play at master-level chess in the 1980s, players of world class, like Kasparov, still had no problem. 
What they did was use what was called an anti-computer strategy: they made sure the game was quite static so the computer couldn't use its combinatorial, short-term advantage, and they stuck to things that they understood to be strategically long-term assets that the computer was poor at evaluating, and they established those knowing that at some point they would be able to cash in on them. And then, pure and simple, with the lookahead model you still get exponential problems, no matter what. Now we do have very high quality play, in some cases play that goes beyond human ability, but the exponential character of these games makes them PSPACE-hard on large enough boards. There is no way to get the computer, in a reasonable time, to play perfectly. So having this exponential problem, the exponential explosion, which happens in all sorts of AI domains, still means that sometimes decision making is poor.