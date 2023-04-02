Hello, learners. In this video, we are going to talk about why sampling works at all. We said sampling is all about collecting a small set of data from the entire population, which could be very large. Now, why should something that is small and just a part of the whole be representative of the whole? In the context of sampling from a finite population, we can at least say the small sample is a certain fraction of the entire population and is likely to share some of its characteristics. But what about infinite populations? Whatever finite sample you might collect is not going to be any sizable part of an infinite population. Why, then, is it going to be representative of the infinite population at all? Why should it be useful at all? The answer is given by the law of large numbers. The law of large numbers says that as long as you have sufficiently many samples, whatever sufficiently many means, the sample is going to become representative of the entire population. Your population could be infinite and your sample finite, but if the sample size is a large enough finite number, then the sample characteristics are going to reflect the population characteristics. This is guaranteed by something called the law of large numbers. In fact, all the sampling techniques work because we have the law of large numbers: it helps us derive a conclusion about the whole by looking only at a part of the whole. In the example that we have been talking about, where Krishnan goes to an eatery in IIM Ahmedabad and orders lunch, let us look at how the law of large numbers, applied to a sample of a few days of observing Krishnan eat, could give us information about the population, which means the set of ways in which Krishnan might behave. Let us go to the Excel example. Here we have the probability distribution that we originally discussed.
Krishnan spends Rs 50 with probability 0.25, Rs 70 with probability 0.35, Rs 100 with probability 0.25, and Rs 150 with probability 0.15. As before, we have calculated the expected value of X, and that turned out to be 84.5. In column G, I am generating a number which follows exactly this distribution, which means this number is going to be 50 with probability 0.25, 70 with probability 0.35, 100 with probability 0.25, and 150 with probability 0.15. If I refresh the Excel sheet, you are going to see different numbers, and they are going to occur exactly according to this probability distribution. I can pretend that a number I see here is representative of what Krishnan spends on any single day. Now let us observe Krishnan over a period of 10 days. The numbers you are seeing here correspond to Krishnan's spending over a period of 10 days. As I refresh this Excel sheet, you are going to see a different set of numbers every time. Now let us look at the average of these 10 numbers. You can see, or convince yourself, that it is oscillating around the central tendency which we expect theoretically, 84.5. When I make this observation, which means I'm sampling Krishnan's eating behavior, I do not know this exact probability distribution. The only thing that I know is the actual money that Krishnan spends at the eatery. I do not know the underlying probability distribution; however, I am interested in its mean. All that I observe is the actual money that Krishnan spends on multiple days. What I have here in column G could be considered a sample from a population which is unknown. Well, let us look at more samples. Let us look at 100 samples. Well, the average looks a little closer to the expected value. Let me refresh the Excel sheet. The variability around the expected value seems to be a little less extreme compared to what was happening earlier. Let me take 1,000 samples.
Well, it is even closer. I'm never getting 90s or 79s any longer. It's oscillating very close to the true expected value of 84.5, and this is exactly what the law of large numbers predicts. As you take more and more samples and take the average value of the random variable, this average is going to tend to the expected value of the random variable. In this setting, which is an instance of simple random sampling, we have sampled this random variable multiple times and then looked at the average, and the averages indeed seem to converge towards the expected value of the random variable. If we take an even larger number of samples, let's say 10,000 samples, then the variability is going to be even smaller. In fact, I'm going to direct your attention towards the graph which I'm pointing at now. Let us restrict to 200 samples. The numbers in column Q correspond to taking the average of only the first n numbers in column G. For example, the number you are seeing against 14 in column P, this 71.42, corresponds to the average of the first 14 samples in column G. We are now looking at how the average varies as we include more and more samples. You can see the average starts at 70, which is a gross underestimate, then goes up a little to 80, oscillates a bit, and seems to settle somewhere near the true expected value, 84.5. If I refresh the Excel sheet, I'm going to get a different sample, and here it was even worse. The first sample turns out to be a small value, and as a result the average was really low, but again, you can see it oscillates and settles near 84.5 as expected. Here you have the reverse case. You start with an abnormally high number, but again, as you take more and more samples, the average seems to settle down near 84.5. If you take a really large number of samples, say 2,000 samples, then the average hardly oscillates at all after a certain point.
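The spreadsheet experiment described so far can also be reproduced outside Excel. What follows is only a minimal sketch in Python, not the sheet used in the video: the function names `draw_spend` and `running_mean` and the seed value are assumptions made for illustration, while the spend values and probabilities are the ones from the lecture. It prints the sample average for increasing sample sizes, and then the average of the first n samples, which plays the role of column Q.

```python
import random
from itertools import accumulate

# Distribution of Krishnan's daily spend, as given in the lecture.
VALUES = [50, 70, 100, 150]
PROBS = [0.25, 0.35, 0.25, 0.15]

# Theoretical expected value: sum of value * probability (84.5 for this table).
EXPECTED = sum(v * p for v, p in zip(VALUES, PROBS))


def draw_spend(n, rng):
    """Draw n independent daily spends from the distribution above."""
    return rng.choices(VALUES, weights=PROBS, k=n)


def running_mean(samples):
    """Return the average of the first k samples for k = 1..len(samples)."""
    return [total / k for k, total in enumerate(accumulate(samples), start=1)]


rng = random.Random(0)  # the seed is an arbitrary choice, used only for repeatability

# Sample averages for growing sample sizes (like refreshing the sheet with more rows).
for n in (10, 100, 1000, 10000):
    sample = draw_spend(n, rng)
    print(f"n = {n:>5}: sample mean = {sum(sample) / n:.2f} (expected {EXPECTED:.2f})")

# Running average over the first 200 samples (the analogue of column Q).
means = running_mean(draw_spend(200, rng))
for k in (1, 14, 50, 200):
    print(f"average of first {k:>3} samples = {means[k - 1]:.2f}")
```

Changing the seed corresponds to refreshing the sheet: the early running averages move around a lot, while the later ones stay close to the expected value.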
It indeed seems to converge very close to 84.5. As the number of samples becomes larger and larger, tending to infinity, you are likely to have numbers that are extremely close and, for all practical purposes, equal to 84.5. Well, one might wonder how this is going to be really useful, because we seem to only be able to estimate the expected value of a random variable using this law of large numbers, potentially using a large number of samples. What about probability? For example, can I find the probability that this random variable X is going to take the value 50? What is the probability that X is going to take the value 60? Is that easy to find? The answer is yes. We are going to define a new random variable such that its expected value is equal to the probability that X takes the value 50 or 60 or anything that we want. By doing this, we can use the law of large numbers again to estimate each and every population parameter that one might be interested in estimating. Let us look at the Excel example again. Suppose a person is interested in identifying the probability that the random variable X takes the value 50. We just have to define a new random variable called Y, which takes the value 1 if X takes the value 50 and the value 0 in all other cases. Well, that's exactly what we have done here. I am checking the value in column G: if it equals 50, then the value in this column will be 1, otherwise it will be 0. Similarly, here I'm looking at the probability that X takes the value 70. Here again, this column will have 1 if the corresponding entry in column G is equal to 70, otherwise it will be 0. Similarly here for 100 and here for 150. Now we can convince ourselves that the expected value of each of these random variables is nothing but the probability that X takes the corresponding value.
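The step of convincing ourselves can be written out in one line. This is a short worked derivation for the value 50, using the indicator variable Y defined above; the same argument applies to 70, 100, and 150. The symbol \mathbf{1}\{\cdot\} for the indicator is notation introduced here, not something shown in the video.

```latex
\begin{align*}
Y &= \mathbf{1}\{X = 50\} =
  \begin{cases}
    1 & \text{if } X = 50,\\
    0 & \text{otherwise,}
  \end{cases}\\[4pt]
\mathbb{E}[Y] &= 1 \cdot P(X = 50) + 0 \cdot P(X \neq 50) = P(X = 50) = 0.25 .
\end{align*}
```

So averaging the 0/1 column is the same as estimating an expected value, which is exactly what the law of large numbers handles.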
Well, now I have these random variables generated, and I can again consider their averages, and these are going to give me the individual probabilities of the values that X takes. Let us say we are interested in estimating the probability that the random variable takes the value 50. In this case, we have to look at column H and its average. We see a similar behavior. We see these numbers oscillating a bit before settling into something close to 0.25. If we take more samples, let's say 10,000 samples, you can see this gets very close to 0.25. Again, initially you can see a large sway from 0 to 0.35 and so on, but as you have more and more samples, we clearly see that we are getting the expected value of this random variable, which corresponds to the probability that X takes the value 50. Similarly, we can look at the probability that X takes the value 100, or 70. I'll do this for 70: I'm going to drag this all the way down, and you can see this indeed settles very close to 0.35. You see that with a small number of samples the average value seems to be zero, which is probably not so useful, but as we collect more and more samples, the law of large numbers kicks in and we have excellent convergence. It is not only for these individual probability values that the law of large numbers can help. We talked about a different random variable earlier: a random variable Y which takes the value 1 if X takes a value less than 80 and the value 0 otherwise, and we had theoretically calculated the expected value of this random variable as 0.6. Again, you can see this happening in practice as we have more and more samples in our data. You can see some oscillation in the beginning, but the average value eventually converges to 0.6.
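The indicator columns can be imitated in code as well. This is a minimal sketch, assuming Python in place of the Excel sheet; the helper name `indicator_mean` and the seed are hypothetical choices for illustration. It averages a 0/1 indicator over 10,000 simulated days to estimate P(X = 50), P(X = 70), P(X = 100), P(X = 150), and P(X < 80).

```python
import random

# Distribution of Krishnan's daily spend, as given in the lecture.
VALUES = [50, 70, 100, 150]
PROBS = [0.25, 0.35, 0.25, 0.15]


def indicator_mean(samples, event):
    """Average of the 0/1 indicator Y = 1 if event(x) holds, else 0.

    By the law of large numbers this average approaches P(event(X)).
    """
    return sum(1 for x in samples if event(x)) / len(samples)


rng = random.Random(0)  # arbitrary seed, used only for repeatability
samples = rng.choices(VALUES, weights=PROBS, k=10_000)

# One indicator per value of X (the analogue of the per-value indicator columns).
for v, p in zip(VALUES, PROBS):
    estimate = indicator_mean(samples, lambda x, v=v: x == v)
    print(f"P(X = {v:>3}) is estimated as {estimate:.3f} (true value {p})")

# Indicator of the event X < 80, whose expected value is 0.25 + 0.35 = 0.6.
print(f"P(X < 80) is estimated as {indicator_mean(samples, lambda x: x < 80):.3f}")
```

Any event that can be checked on a single observation can be estimated this way, since only the condition passed to `indicator_mean` changes.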
This is an excellent illustration of how the law of large numbers helps us understand population characteristics by collecting larger samples: when we look at the average value of a random variable over all its realizations, we see that it eventually converges to the population mean, which is the expected value of the random variable. In short, the law of large numbers says, "As we take larger and larger independent and identically distributed samples, sample characteristics represent population characteristics." Finally, there is one part of this statement which we have not discussed so far. What do we mean by independent and identically distributed samples? We will talk about that in the upcoming video.