Now we're going to shift focus and look at a particular game played in the IPL season: the first game, played on April 7th, 2018, when the reigning champions, the Mumbai Indians, played in Mumbai against the Chennai Super Kings. So let's open the data. We've got a data frame with all of the balls delivered for each team in each innings. The way this data is structured, you can see there is a total of 124 rows. Each row relates to a ball delivered in either the innings of the Mumbai Indians or the innings of the Chennai Super Kings. So in a sense the two innings are set against each other: they are drawn alongside each other in the data frame, and that allows us to compare the progress of the innings on a ball-by-ball basis.
As a side note, I said at the beginning that there are 120 balls in each innings, but you can see that the number of balls can exceed 120. The reason is that, according to the rules, some deliveries are designated no-balls because they breach a rule about the way in which the ball is delivered. That means the bowling team has to deliver an extra ball, so you can actually have a few more than 120 balls. That's a minor footnote, which is not going to delay us here.
So let's draw a plot for the innings of the Mumbai Indians in this game, referred to as MI, and chart the runs they had scored by the end of each delivery against the number of the delivery in the innings. Here you can see the chart. We have up to 120 balls delivered along the X axis, and you can see how the score rises as the number of balls increases. Their scoring rate is actually fairly steady.
As you can see, it's more or less a straight line, but there are wobbles in the line where the scoring rate slows down or speeds up. That's particularly interesting when focusing on the performance of the team in the innings: how the scoring rate develops as the innings progresses.
What we can also do is think about the relationship between the scoring rate and the number of wickets that have fallen. Each time a player is out, that's the fall of a wicket, and that affects the scoring rate. Wickets are like resources: you have 10 of them, which you can spend, and when they're gone, you're finished. So as your wickets fall, you may change your strategy to conserve your resources and reach the end of the innings.
First, let's identify the balls on which the wickets fell. In the innings of the Mumbai Indians, four wickets fell, and you can see the delivery number at which each one fell and what the score was at those deliveries. Then we can build a plot that puts the cumulative score and the wickets that have fallen together on the same chart. That's what we do here. In this innings the red dots relate to the fall of a wicket. Two fall early on in the innings, and then two fall somewhat later. So they had 10 wickets in total as resources, if you like, and they came nowhere near using them all up. In fact, you don't want players to get out if they don't need to. So this was a fairly successful innings, judged by the way in which the score evolved and the wickets fell.
Now let's compare this with the innings of the Chennai Super Kings, who batted after the Mumbai Indians. If we just plot the scores, you can see a somewhat different evolution of the total score.
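The step of picking out the balls on which wickets fell can be sketched roughly as follows. This is only an illustration on a toy innings: the column names `delivery_no`, `runs_total` and `wickets` are assumptions, not necessarily the names used in the course data, and the sketch assumes the data holds a running count of wickets lost after each ball.

```python
import pandas as pd

# Toy stand-in for one innings; the column names are hypothetical.
innings = pd.DataFrame({
    "delivery_no": range(1, 13),
    "runs_total": [0, 1, 5, 5, 6, 8, 8, 12, 13, 13, 15, 19],
    "wickets": [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2],
})

# A wicket fell on a ball if the running wicket count rose on that ball.
# diff() is undefined for the first ball, so fall back to the count itself there.
fell = innings["wickets"].diff().fillna(innings["wickets"]) > 0

# Delivery number and score at each fall of a wicket.
wicket_balls = innings.loc[fell, ["delivery_no", "runs_total"]]
print(wicket_balls)
```

The rows in `wicket_balls` are the points that would be drawn as red dots on top of the cumulative-runs line.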
Initially, in the first few balls of the innings, the Chennai Super Kings were scoring at a faster rate than the Mumbai Indians. Then, just after about ball 40, their rate slows significantly. The rate of scoring falls below that of the Mumbai Indians, and then it accelerates again; indeed, towards the end it accelerates very rapidly. Right at the very end, the orange line rises above the blue line, showing you that the Chennai Super Kings actually ended up winning the game by scoring more runs than the Mumbai Indians, and they took exactly the same number of balls to reach that score.
If we then add the wickets to that chart, you can start to understand a little of what went on. Comparing the scoring of the Mumbai Indians in blue and the Chennai Super Kings in orange, you can see that the Mumbai Indians lost a couple of wickets fairly early on but then were relatively stable. They lost a couple of wickets later on, but given the resources they still had left, they were in a pretty good state. By comparison, the Chennai Super Kings lost four wickets relatively early in their innings, and that really slowed down their scoring rate. Having only six wickets left, they were very wary of trying to score too quickly, because if you try to score too quickly, you lose wickets as well. So they slowed down their scoring rate, and they only really start to accelerate in the latter half of the innings. In fact, just after about ball 100 they lose a wicket, and they're really quite a long way behind the total. Then they suddenly accelerate right at the end and actually reach the total. So just by looking at the chart, you can see that this game had a pretty exciting finish: the Chennai Super Kings looked like they would not reach the total, but in fact they did.
Okay, so now let's load the data frame that includes every delivery for the entire season.
This is going to be a much bigger data frame. We've got the same kinds of variables here, but what we're going to do is define a way of showing this chart for any particular game that we're interested in. So we're going to introduce you here to the idea of a function. A function starts with def. What that does is allow you to combine all the commands that you need to do something, like create a chart, into one command, and then just type in that command any time you want. So here we create a function that plots the runs and wickets in each innings of the game all at once, rather than doing this line by line as we've done before. In a sense, after the def here, all that has really happened is that we've combined all of the commands that we used above into one block. We've called that plot runs wickets, so we can run that command, plot runs wickets, and see the outcome when it's finished.
When we run this, we don't see anything immediately. It just runs. What I'm going to do next is create a second function. This second function allows me to plot two or more games at the same time, and that's going to allow us to compare different games. Once I've run that function, we can pick any two games that we want and show, side by side, the charts of the runs scored and the wickets lost for each team, to see how two different games evolve.
So which games should we compare? To decide that, let's start by identifying all of the games and getting some basic information about the teams. First, let's identify from our data which teams batted first. We've created within this a variable showing whether the home team was the team that batted first or not. This is recorded for every single ball in the game. We don't need that. We just want to look at each game, so we're going to drop duplicate game numbers.
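Reducing ball-by-ball data to one row per game might look something like the sketch below. It uses a tiny made-up frame covering only games 1 and 27; the column names (`game_no`, `home_team`, `away_team`, `home_bat_first`) and the derived `batting_first` column are hypothetical and may differ from the course notebook.

```python
import pandas as pd

# Toy ball-by-ball data: one row per delivery. Column names are assumptions.
balls = pd.DataFrame({
    "game_no": [1, 1, 1, 27, 27],
    "home_team": ["MI", "MI", "MI", "CSK", "CSK"],
    "away_team": ["CSK", "CSK", "CSK", "MI", "MI"],
    "home_bat_first": [True, True, True, True, True],
})

# Keep the first row for each game number, and only the game-level columns.
games = (
    balls.drop_duplicates(subset="game_no")
    [["game_no", "home_team", "away_team", "home_bat_first"]]
    .reset_index(drop=True)
)

# Name the side that batted first: the home team where the flag is True,
# otherwise the away team.
games["batting_first"] = games["home_team"].where(
    games["home_bat_first"], games["away_team"]
)

# Look up the fixture with CSK at home against MI.
csk_home = games[(games["home_team"] == "CSK") & (games["away_team"] == "MI")]
print(games)
```

Filtering the per-game table this way is one simple route to the game number needed for a comparison plot.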
So we've now got only one row for each game number, which reduces our data set quite considerably. Then we write a command that shows us which is the home team, which is the away team, and who was batting first in the game, so we can make comparisons. When we do this, for each of our 60 games in the season we've got the names of the two teams. We can look through this list, pick out any two games that we're interested in, and use the game numbers to compare the plots.
For example, we've looked at the first game between the Mumbai Indians and the Chennai Super Kings; that's game number one. But in the Indian Premier League, each team plays each other team twice, once at home and once away. So in the data there's a game between the Chennai Super Kings and the Mumbai Indians where the Chennai Super Kings are the home team. Let's find that game. There you can see that Game 27 is the Chennai Super Kings against the Mumbai Indians, where the Chennai Super Kings are the home team and they batted first.
What we can do now is use the plot runs wickets function that we created to compare two games, and we just change the numbers inside these square brackets. So we've got Game 1 and Game 27. If we now run that, you can see the plot for Game 1, which we've seen already, where we saw how the innings evolved, and here we have Game 27, the return game. What you can see is that, once again, the team batting second was the one that won the game. But you can also compare the difference between the two innings. Bear in mind that in this game, the team batting second was the Mumbai Indians, and they end up winning the game.
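A function for plotting several games side by side could be sketched along these lines. This is not the function from the course notebook: the name `plot_runs_wickets`, the column names, and the long layout (one row per ball per innings, with a running `wickets` count) are all assumptions made for illustration, and the data is a toy example.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_runs_wickets(balls, game_nos):
    """Draw one panel per game: cumulative runs as lines, wickets as red dots."""
    fig, axes = plt.subplots(
        1, len(game_nos), squeeze=False, figsize=(6 * len(game_nos), 4)
    )
    for ax, game_no in zip(axes[0], game_nos):
        game = balls[balls["game_no"] == game_no]
        for innings_no, innings in game.groupby("innings"):
            ax.plot(innings["delivery_no"], innings["runs_total"],
                    label=f"innings {innings_no}")
            # A wicket fell wherever the running wicket count rose.
            fell = innings["wickets"].diff().fillna(innings["wickets"]) > 0
            ax.scatter(innings.loc[fell, "delivery_no"],
                       innings.loc[fell, "runs_total"], color="red", zorder=3)
        ax.set_title(f"Game {game_no}")
        ax.set_xlabel("Delivery number")
        ax.set_ylabel("Cumulative runs")
        ax.legend()
    return fig, axes


# Toy data with hypothetical column names; two balls per innings only.
balls = pd.DataFrame({
    "game_no":     [1, 1, 1, 1, 27, 27, 27, 27],
    "innings":     [1, 1, 2, 2, 1, 1, 2, 2],
    "delivery_no": [1, 2, 1, 2, 1, 2, 1, 2],
    "runs_total":  [1, 5, 0, 4, 2, 2, 0, 6],
    "wickets":     [0, 0, 0, 1, 0, 1, 1, 1],
})

# Changing the numbers in the list changes which games are compared.
fig, axes = plot_runs_wickets(balls, [1, 27])
```

With matplotlib's default colour cycle, the first innings is drawn in blue and the second in orange, which matches the colours described for the charts here.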
The difference between the games is that, initially, the Chennai Super Kings, who are in blue in Game 27, lose only one wicket early on in their innings, and their scoring rate is very high. Then they lose another wicket and their scoring rate slows dramatically. Towards the end of the innings, they quickly lose another three wickets, and that slows down their scoring. The Mumbai Indians, batting second, lose one wicket right at the beginning, within the first ball or so of the innings. But then they don't lose another wicket until about midway through the innings, and they don't lose another until quite close to the end. The fact that they lose relatively few wickets enables them to keep their scoring rate up, and that's what helps them in the end to win the game.
So what we've shown here is that you can generate these plots to make comparisons. You can go back now and compare any set of games that you're interested in by changing the game numbers in the plot runs wickets function, and use these plots to try to interpret what was going on in the game. Of course, it doesn't tell you everything about the way the game evolved. It's not a complete story, and we might want to take the analysis further using more specific statistical techniques. But this shows you how plots can be used to generate insights and understanding about the way the game is being played, in relatively simple and straightforward ways.
So that's our example this week for cricket. We'll move on next to look at an example from baseball and see how we can plot data there in order to tell different stories.