In the previous module we looked at various kinds of randomized experiments and the assignment mechanism. And we looked at how randomization-based inference allows you to test hypotheses and also estimate the sample average treatment effect. So let's restate some assumptions to get us back on the same page. The ith of the n experimental units had two potential outcomes, Yi(0) if assigned treatment 0 and Yi(1) if assigned treatment 1. However, we only observe one of these two, because if we see one, we can't see the other. So you could think of it as a missing data problem with half the data missing. The potential outcomes in that module were regarded as fixed constants, and randomness was introduced through the random assignment of treatments to subjects. Now, in this lesson, we're going to extend randomization-based inference to finite population average treatment effects. In this case, we look at the little n units as a random sample from a population of capital N units. The potential outcomes are still fixed constants, okay? But in addition to missing half the potential outcomes for the little n units that we do look at, we don't observe either potential outcome for the remaining units, the ones that aren't in our sample. So now the randomness has two sources: the random assignment of treatments to the little n subjects, and the random selection of the little n subjects from the population of capital N. The estimator from the previous module, the difference in the sample averages, Y-bar 1 minus Y-bar 0, remains unbiased for the finite population average treatment effect, but the randomization-based approach to hypothesis testing obviously isn't going to work anymore. So let's talk about estimating the population treatment effect. The little n units are a simple random sample from the population of capital N units. That means each possible sample of size little n, taken without replacement from the capital N units, is equally likely.
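To make the sampling scheme concrete, here is a small numpy sketch of my own (not from the lecture; the population size N = 100 and sample size n = 20 are hypothetical) showing that under simple random sampling without replacement, any fixed unit lands in the sample with probability n over N:

```python
import numpy as np

rng = np.random.default_rng(0)

N, n = 100, 20                      # hypothetical population and sample sizes
reps = 20000

# Count how often unit 0 appears in a simple random sample of size n
# drawn without replacement from {0, ..., N-1}.
included = 0
for _ in range(reps):
    sample = rng.choice(N, size=n, replace=False)
    included += 0 in sample

print(included / reps)              # should be close to n / N = 0.2
```

The empirical inclusion frequency matching n/N is exactly the expectation of the sampling indicator Ti introduced next.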
So let's let Ti denote the random variable that takes the value 1 if unit i appears in the sample, and 0 otherwise. Now, because of the way we're taking a simple random sample, each Ti has expectation little n over capital N. Due to the random sampling, the sample average treatment effect from before can be rewritten as n inverse times the sum in the second line. And then, you notice in the next equality, we're summing over all capital N units but multiplying by Ti, so it's equivalent to the other expression. And this is now a random variable because of the presence of the Ti. Okay, now the sample average treatment effect is unbiased for the finite population average treatment effect. And so we're going to take this and show you immediately why that's true. Down below, going from the first equality to the second, all we've done is pull the n inverse outside and push the expectation inside the sum, which we can do, as you know. And also remember that the little y's are constants, so Ti is the only random variable, and the expectation of a constant times a random variable is the constant times the expectation of the random variable; that's what gives us the final line. Okay, so now we want to talk about estimating the finite population average treatment effect. For every sample, the difference between the sample means is unbiased for the sample average treatment effect. And the sample average treatment effect is unbiased for the population average of yi(1) minus yi(0), under the distribution induced by the sampling. So we can write this as an equation too: first we take the expectation over the randomization distribution, and then we take the expectation over the sampling distribution. As in the case of the sample average treatment effect, the variance of the estimator is tedious to derive and depends on the unknown unit effects. The result given earlier for the sample average treatment effect is just a special case.
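We can check this unbiasedness claim numerically. In this sketch of mine (hypothetical potential outcomes; the lecture does not include this simulation), the y's are fixed constants for a finite population, and the sample average treatment effect is averaged over many simple random samples:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed potential outcomes for a finite population (hypothetical values).
N, n = 50, 10
y0 = rng.normal(0.0, 1.0, size=N)
y1 = y0 + rng.normal(1.0, 0.5, size=N)   # heterogeneous unit effects
pop_ate = (y1 - y0).mean()               # finite population ATE

# Average the sample ATE over many simple random samples of size n.
reps = 20000
sate = np.empty(reps)
for r in range(reps):
    s = rng.choice(N, size=n, replace=False)
    sate[r] = (y1[s] - y0[s]).mean()

print(sate.mean(), pop_ate)              # the two should nearly agree
```

The only randomness here is which units appear in the sample, mirroring the role of the Ti indicators in the derivation.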
So here the variance of the difference between means has three components: the first, you can see, has to do with the deviations of the yi(0)'s from their population mean; the second, the deviations of the yi(1)'s from theirs; and the third, the deviations of the unit effects from theirs. Okay, now again, if the treatment effect is constant across units, you can see that the last term is going to be 0. But otherwise, we can get a conservative estimate of the variance by ignoring that last term. So now we can do hypothesis tests and confidence intervals for the finite population average treatment effect. And we can use a normal approximation for the distribution of the difference between sample means. We get the familiar kind of result: that expression there, under the null hypothesis of no average treatment effect, will be approximately a Normal(0, 1) random variable. All right, I want to switch gears a little bit now. I'm sure you're more familiar with the model-based approach to inference, in which the potential outcomes are random variables. And throughout much of the remainder of the course, that's the approach we're going to take. So most of the time we're going to treat the little n units as a sample from an infinite population. We can apply this in the case where the population has a finite size as well. We've already seen the basic ideas for how this approach works in module one, but it's worthwhile to reiterate and formalize them here in the explicit context of randomized experiments, and also to contrast this with the previous approach, in which the outcomes were treated as fixed constants. So let's go ahead. Now you'll notice that I've written uppercase Y's to denote unit i's potential outcomes. And I'm going to assume that the expected value of Yi(z) is the expectation of Y(z) for all i, for z equal to 0 or 1.
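The conservative variance estimate and the normal-approximation test can be sketched in a few lines. This is my own illustration, not the lecture's notation: dropping the unidentifiable unit-effect term leaves the familiar s1-squared over n1 plus s0-squared over n0, and the data below are hypothetical.

```python
import numpy as np

def neyman_test(y_obs, z):
    """Difference in means, conservative SE, z statistic, and a
    normal-approximation 95% interval for the average treatment effect."""
    y1, y0 = y_obs[z == 1], y_obs[z == 0]
    tau_hat = y1.mean() - y0.mean()
    # Ignore the (unidentifiable) unit-effect term, so the variance
    # estimate s1^2/n1 + s0^2/n0 is conservative.
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    z_stat = tau_hat / se                 # approx. Normal(0, 1) under the null
    return tau_hat, z_stat, (tau_hat - 1.96 * se, tau_hat + 1.96 * se)

# Hypothetical experiment: 50 units per arm, true effect of 1.
rng = np.random.default_rng(2)
z = np.repeat([0, 1], 50)
y = rng.normal(0.0, 1.0, size=100) + 1.0 * z
tau_hat, z_stat, ci = neyman_test(y, z)
print(tau_hat, ci)
```

Because the dropped term is nonnegative, the resulting interval errs on the wide side, which is the sense in which the procedure is conservative.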
This would be what happens if the potential outcomes were independent and identically distributed, which you're familiar with. For the completely randomized experiment, what this means is that the probability that unit i is assigned treatment little z, conditioned on the unit's potential outcomes, is just the marginal probability of assignment. Okay, so we can write that more compactly below: Zi is independent of the potential outcomes. Okay, now, model-based inference. Let's go back and look at the difference between the sample means as an estimator of the average treatment effect. The point is that it's unbiased; we just go through these lines, okay? In the first equality, I'm just rewriting Y-bar 1 and Y-bar 0, that's all that's going on. In the second, I'm pushing the expectations through the sums. And I'm using the fact that in the completely randomized experiment, the sum of the Zi's and the sum of the (1 - Zi)'s are known and fixed, respectively, to be n1 and n0; they are not random. That gets us to the third line. Okay, now we know that the expectation of Zi is n1 over n, and the expectation of Yi(1) is just the expectation of Y(1). So I can pull those apart and multiply them because of the independence assumption. The independence assumption functions in a similar way to what we had before, where the expectation of a constant times a random variable was the constant times the expectation of the random variable; we're doing something similar using the independence assumption. And now, okay, you can see that it all comes out very nicely in the last line: I just have n inverse times the expectations of the Y's. And essentially by the i.i.d. assumption, that's just the expected value of Y(1) minus Y(0). Okay, and below, I've put some of the things I've just talked through that justify those equalities.
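The model-based unbiasedness argument can also be checked by simulation. In this sketch (my own, with hypothetical parameters), the potential outcomes are fresh i.i.d. draws on every replication, and the assignment of exactly n1 of the n units to treatment is made independently of them:

```python
import numpy as np

rng = np.random.default_rng(3)

# Model-based view: potential outcomes are i.i.d. random variables, and a
# completely randomized experiment assigns exactly n1 of n units to treatment.
n, n1 = 40, 20
true_ate = 2.0                        # E[Y(1)] - E[Y(0)] by construction
reps = 30000
est = np.empty(reps)
for r in range(reps):
    y0 = rng.normal(0.0, 1.0, size=n)
    y1 = y0 + true_ate
    z = np.zeros(n, dtype=int)
    # Assignment is drawn without looking at y0 or y1 (independence).
    z[rng.choice(n, size=n1, replace=False)] = 1
    y_obs = np.where(z == 1, y1, y0)
    est[r] = y_obs[z == 1].mean() - y_obs[z == 0].mean()

print(est.mean())                     # should be close to true_ate = 2.0
```

Both sources of randomness from the derivation, the outcome model and the assignment, are averaged over here, and the difference in means centers on the true effect.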
Now we can move on, as we did with the randomization-based approach, from completely randomized experiments to block randomized experiments. The difference here is that the treatment assignment probability can depend on the blocking variable. It's not going to depend on the potential outcomes once the blocking variable is included, but it can depend on the blocking variable X. So now we're writing that treatment assignment is essentially unconfounded: the Zi's are independent of the potential outcomes conditioned on the Xi's, the covariates. What that means is that within each covariate group, within each set of units sharing a value of Xi, we have a completely randomized experiment. And from the results before, we can say that within a stratum, the expected value of the difference between the sample means is the average treatment effect within that block. Then if I average over the blocks, I just get back the average treatment effect, as you can see in equation 6. Okay, so that's fairly straightforward. In the next module, we're going to take a different sort of cut at the same thing, a different sort of look, and we're going to tie this to a linear regression analysis, with which you're all familiar.
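The block-randomized argument can be sketched the same way. This is my own illustration with two hypothetical blocks (sizes 30 and 70, block-level effects 1 and 3): assignment is completely randomized within each block, and the block-size-weighted average of the within-block differences recovers the overall average treatment effect:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical blocks: {block label: (size, block-level treatment effect)}.
blocks = {"A": (30, 1.0), "B": (70, 3.0)}
pop_size = sum(m for m, _ in blocks.values())
overall = sum(m * tau for m, tau in blocks.values()) / pop_size  # 2.4 here

reps = 20000
est = np.empty(reps)
for r in range(reps):
    acc = 0.0
    for m, tau in blocks.values():
        y0 = rng.normal(0.0, 1.0, size=m)
        y1 = y0 + tau
        z = np.zeros(m, dtype=int)
        # Completely randomized assignment *within* the block.
        z[rng.choice(m, size=m // 2, replace=False)] = 1
        y_obs = np.where(z == 1, y1, y0)
        acc += m * (y_obs[z == 1].mean() - y_obs[z == 0].mean())
    est[r] = acc / pop_size

print(est.mean(), overall)            # the two should nearly agree
```

Each within-block difference is unbiased for that block's effect, so the weighted average is unbiased for the overall effect, which is the content of equation 6.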