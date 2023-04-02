[MUSIC] In this video, we will introduce you to some key terminology used in point estimation. We shall do this with the help of an example. Suppose you have a company that was incorporated five years back and, at that point in time, each of the 1000 employees of the company was given 500 shares. Five years have passed, and now you want to find out what has happened to the number of shares held by the employees. You want to find out the average number of shares that they hold, the standard deviation of the number of shares that they hold, the range of the number of shares that they hold, and so forth. Now, if you could contact all the employees, you could get these values. You could find out that employees on average hold 274.2 shares, that the median number of shares held by the employees is 266, and so on. These values that you get are called parameters. You can find them only if all employees share their data with you. However, in reality, you won't be able to contact all of them. Some may have gone away somewhere and some may not even be willing to talk to you about their holdings, but some employees may still be willing to share their data with you. The question is, how do you use the data that these employees give you to get an estimate, or a guess, at the parameters?
So what do you do? You have a lot of employees, and some of those employees are willing to talk to you. The employees who are not willing to talk to you are shaded out here. Each employee would have a certain number of shares of the company still with them. The numbers that you see in black are the numbers that are available to you; those that you see in gray are not available to you. So with the numbers available to you, what do you do? You can take those numbers and form a list.
Now, this list or this set of numbers is called a sample. If the employees who were willing to talk to you were chosen at random, which means that every employee had an equal chance of being one of those who would talk to you, then the sample that you get is a random sample. Using the values in this random sample, you can make a guess at the average number of shares held by all employees in the company. You do this by taking the average of the values in the sample. That is, you use the sample mean. In this case, the sample mean turns out to be 308. If you wanted to find out the median number of shares held by all the employees, you could take the sample median, which turns out to be 315. You will see that these values of 308 and 315 are not the real parameter values, but this is the best that you can do, given that you have only 10 people out of 1000 who are willing to speak to you. The sample average and the sample median are sample properties which you are going to use to take a guess at the population parameters. These properties are called estimators, and the value that you get by applying an estimator to a sample is called an estimate. So estimates are your best guesses about population parameters.
Now let us see how we can take samples from a population. Let us suppose that this is the data about the 1000 employees of the company. There are 1000 numbers, and each number gives the number of shares held by a particular employee of the company. On the right, you see the parameters of this population: it has a size of 1000, the mean of all the values is 274.21, the median is 266, the range is 425, and the standard deviation is 122.65. These are the values that you would have got if you had access to all the data in the population, but you don't have that. So what you do is compute an estimate based on a sample. So here is a sample of size 10.
The first column has the identities of the 10 observations that are part of the sample. So for instance, the first member of the sample is employee number 593. The second is number 303. We will assume that every one of the 1000 employees has an equal chance of being in this sample. The second column shows the data corresponding to each of the 10 points chosen for the sample. So employee number 593 has 270 shares, employee number 313 has 440 shares, and so on. Now, if we look at the values in the sample, we can compute the mean of the sample, the median of the sample, the range of the sample, and the standard deviation of the sample. These are given in the columns shown here. These values that we have are said to be the estimates of the population parameters. Now, you would notice that the population mean is 274.2, the population median is 266, the population range is 425, and the population standard deviation is 122.65. So none of the sample estimates match the population parameters. Not only that, if we change the values in the sample, for instance by taking another sample as we do here, you will see that all the sample estimates change, while the population parameters remain the same. If we keep on changing the sample, you will find that all the sample estimates keep on changing and the population parameters remain the same. This has nothing in particular to do with the sample size being 10. If we increase the sample size to 25, for instance, you will have a larger number of points in the sample, but you will still see that as soon as the sample changes, the sample estimates change and the population parameters do not.
From the experiment, we found out the meanings of several terms that we use in point estimation. For example, we talked about a population. A population is the collection of all elements of interest. From the population we were getting a small subset of values called a sample.
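The sampling experiment described above can be imitated with a short Python sketch. This is only an illustration under stated assumptions: the population below is made-up data (1000 random holdings between 0 and 500), not the data set shown in the video, and the names `summarize`, `draw_estimates`, and `parameters` are hypothetical. The sample standard deviation uses the n-1 divisor, which is one common choice; the video does not say which formula it uses.

```python
import random
import statistics


def summarize(values):
    """Apply four estimators (mean, median, range, standard deviation) to a sample."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample formula, n-1 divisor
    }


rng = random.Random(0)  # fixed seed so that reruns give the same samples

# Hypothetical population: 1000 share holdings between 0 and 500.
# This is made-up data, not the population used in the video.
population = [rng.randint(0, 500) for _ in range(1000)]

# Parameters are computed once from the whole population and never change.
parameters = {
    "mean": statistics.mean(population),
    "median": statistics.median(population),
    "range": max(population) - min(population),
    "stdev": statistics.pstdev(population),  # population formula, N divisor
}


def draw_estimates(sample_size):
    """Draw a simple random sample without replacement and return its estimates."""
    sample = rng.sample(population, sample_size)
    return summarize(sample)


print("parameters:", parameters)
for n in (10, 25):
    for trial in range(3):
        est = draw_estimates(n)
        print(f"n={n} trial={trial} estimates:", est)
```

Each call to `draw_estimates` draws a fresh random sample, so the printed estimates differ from trial to trial for both sample sizes, while the `parameters` line stays fixed.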
If every element in the population has an equal chance of appearing in the sample, then the sample that you have is called a random sample. There are certain characteristics of a population, for example the population mean, the population standard deviation, the population variance, and the proportion of the population having a particular value, and these are known as parameters. For each of these parameters, there is a certain characteristic of a sample, a sample statistic, that may be used to take a guess at the parameter. These sample statistics are called estimators. If we have an estimator and apply it to a particular sample, we get a numerical value, which is called the estimate. So an estimate is a guess at a population parameter. Since the population does not change, the population parameter is a constant. However, since we do not have access to all the data in the population, the population parameter is also unknown. A sample estimate, on the other hand, is known, and it varies with the sample that is generated. So in our experiment you saw that when we generated different samples, the values of the estimates changed. Since each such value is just a single number, these estimates that we generate are called point estimates. [MUSIC]
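For reference, the pairing of parameters and estimators can also be written in symbols. The notation here is a common textbook convention and an assumption of this note, not something stated in the video: the population values are x_1, ..., x_N with N = 1000, the sample values are y_1, ..., y_n with n = 10 or 25, and the n-1 divisor in the sample standard deviation is one common choice.

```latex
\begin{align*}
\text{Parameter (constant, unknown):}\quad
  \mu &= \frac{1}{N}\sum_{i=1}^{N} x_i,
  &\sigma &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i-\mu\right)^2} \\
\text{Estimator (varies with the sample):}\quad
  \bar{y} &= \frac{1}{n}\sum_{j=1}^{n} y_j,
  &s &= \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(y_j-\bar{y}\right)^2}
\end{align*}
```

In the example, the parameter is \mu = 274.21, while the first sample gave the point estimate \bar{y} = 308.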