So once upon a time, back in 1906, an intriguing contest took place at a fair in Plymouth, England. The idea of this contest was to estimate the weight of an ox. The ox was placed on display, and a bunch of villagers had to guess its weight. The way this worked is that the villagers would come up, get a good look at the ox, make a guess, write it down on a sheet of paper, and submit it into a bin. It was very important that they did not look at each other's guesses, so that no one could copy anyone else. You looked at the ox, you formed an opinion of what its weight was, and you submitted your guess. There were 787 villagers who participated in this experiment. Sir Francis Galton analyzed the results to see what he could come up with. Not one person guessed the true weight of the ox, which was 1,198 pounds. But the average of the guesses was 1,197 pounds, an error of less than 0.1%. So, a very, very small error. Now you're probably wondering, how could that be? Nobody guessed the true value exactly. The guesses were all over the place: some people guessed way too high, some way too low. But when we average them, we get something really close to the true value. As it turns out, several factors were at work here that made this guessing game, and the averaging of the guesses, come out to be something very meaningful and very accurate. The first is that the task is relatively easy: you look at the ox and you guess its weight. It's a very objective task.
The second is that the estimates were independent and unbiased. Independent means that nobody was able to look at what anybody else submitted, so everybody does the task entirely independently; what you guess does not depend on what anybody else guesses. Unbiased means that there was no systematic tendency for everyone to guess higher or lower. The way to create bias in this situation would be something like this: if somebody wanted the people to guess too high, and they added some decoration on top of the ox that made it look heavier than it was, then everybody might guess too high. So let's draw this out on a number line, and say that the true value, whatever it is in this guessing game, is right here at the center, at something like 100. The idea is that maybe I submit a guess of 120, and you submit a guess of 90. Maybe you're smarter than I am: I guessed 120, which is 20 off, and you guessed 90, which is only ten off. Someone else comes in and makes a guess here, someone else there, and more and more guesses come in. The idea is that even though individual guesses are far off, they're going to concentrate around this center point, and when we average them we're going to get something close to that center. Now, just so you understand what bias is: this is the true mean, and bias would be some systematic tendency to guess higher than this, so that everybody would guess higher. If the guesses were biased, the distribution would actually be centered up here instead. So that would be a bias.
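The number-line picture can be sketched with a quick simulation. The true value of 100, the spread of the guesses, and the size of the bias below are all made-up numbers for illustration:

```python
import random
import statistics

random.seed(0)
true_value = 100  # the center of the number line in the example

# Independent, unbiased guessers: each guess's error is centered on zero.
unbiased = [true_value + random.gauss(0, 15) for _ in range(787)]

# Biased guessers: everyone systematically guesses too high
# (like decorating the ox to make it look heavier).
biased = [true_value + 40 + random.gauss(0, 15) for _ in range(787)]

print(round(statistics.mean(unbiased)))  # lands near 100
print(round(statistics.mean(biased)))    # lands near 140: averaging cannot remove a shared bias
```

Individual guesses scatter widely in both cases; only the unbiased crowd's average recovers the true value.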
So instead of me guessing 120, I would have guessed 180 or something, and you would have guessed 140, with the true value still being 100. There aren't values both below and above; there are only values above. Similarly, if there was a bias below, there would only be values below, concentrating around a center that sits below the true value. The third factor is that there were enough people participating. 787 participants is enough to get a good average for this experiment. And as we'll see, the number of people who need to participate depends on how difficult the task is, and also on how independent and how unbiased the estimates are. So, returning now to Amazon: our hope is that when we average customer ratings for a product, the result gets close to the right rating. But can we even say that such a true rating exists? And doesn't this depend at least somewhat on the specific customer? Let's look at the factors we just identified as important in Galton's experiment, in the context of Amazon reviews. First is the definition of the task. Guessing the weight of an ox is a straightforward task; reviewing a product on Amazon and coming up with a rating is a little more difficult. The second is the independence of the reviews. Success in opinion aggregation stems not from having many smart individuals in the crowd who are likely to guess correctly; rather, it comes from the independence of each individual's view from the rest, as we talked about before. My review isn't dependent on yours; I form my own opinion. So the question is, are Amazon reviews independent of each other? They kind of are, but not entirely, because you can go and read the previous reviews of a product before you submit your own.
But given that you've tried out the product yourself, you're probably going to form at least somewhat of your own opinion and submit a somewhat independent review. The third factor is the review population. Galton's experiment would not have worked as well if there had not been nearly 800 people participating. In general, the harder the task is, and the less independent and less unbiased the reviews are, the more people we're going to need in order to get a good estimate from the average, from the collective guess of the population. All of those factors come into play here. Now, we can actually summarize the wisdom of crowds mathematically. The idea, intuitively, is that the opinion of a massive crowd is going to be better than the opinion of any one single person. What it says is this: take a number of guesses, say from these five people here, and take the average of those guesses. The error that the average has, roughly speaking, is going to be the error that each guess has, assuming that they each have the same distribution in their error, divided by the number of people. So it's going to be lower by a factor of the number of people in the estimate. If there are five people, the error is going to be five times lower than the error each individual guess has on average. So as we take the average of more guesses, we're going to see less and less error. The reason we use an approximation symbol here is that naturally not every guess is going to have the same error. In fact, if they all had the exact same error, this wouldn't help at all, because then they would all be off in the same direction; that would actually be a bias condition, and averaging wouldn't help.
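The shrinking-error claim can be checked with a small simulation. This sketch only demonstrates that the average's error shrinks as the crowd grows; the true value, the error spread, and the crowd sizes are illustrative numbers, and the exact rate of shrinkage depends on the error model:

```python
import random
import statistics

random.seed(2)
true_value = 100

def avg_error(n, trials=2000):
    """Average absolute error of the crowd's mean guess, over many trials."""
    errs = []
    for _ in range(trials):
        guesses = [true_value + random.gauss(0, 10) for _ in range(n)]
        errs.append(abs(statistics.mean(guesses) - true_value))
    return statistics.mean(errs)

# A crowd of 5 beats one guesser, and a crowd of 25 beats a crowd of 5.
print(avg_error(1) > avg_error(5) > avg_error(25))  # True
```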
What we need is for all these errors to cancel each other out; if we take the expected value of the error, statistically speaking, and divide by the number of people, we get the expected error in the average. A very important point, though, is that this only holds if the guesses are independent and unbiased. The reason it needs to hold is that these errors intuitively need to cancel each other out. Again, if the true value is over here, we can't all be guessing up here, or else when we average, we're going to be nowhere close to the true value. Some guesses have to be higher and some lower, and they can't be dependent on one another either. So independence and unbiasedness are absolutely essential for the wisdom of crowds to hold up. Just as an example of applying this equation: suppose we have one person, and his guess is three. The average here is three divided by one, which is three, because there's only one person. And suppose that the error is one: we expect him to be too high by one. Now suppose we instead have five people, and they submit guesses of three, two, three, four, and four. If we take the average, we have three plus two plus three plus four plus four, divided by five, which comes out to 3.2. And the error here, the important part, is the error of one person divided by the total number of people, which is five. So the error, when we have five people, drops from one to 0.2. Now, suppose we have a large number of reviewers for a product on Amazon. Does this discussion imply that the average rating is going to be close to the truth we want? Well, not necessarily, because as we pointed out, there are many complications. The task is not exactly easy.
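The worked example above can be checked directly. The per-guess error of one is the lecture's assumed figure, and the error rule used here is the lecture's rule of thumb (error of one guess divided by the crowd size):

```python
guesses = [3, 2, 3, 4, 4]
average = sum(guesses) / len(guesses)
print(average)  # (3 + 2 + 3 + 4 + 4) / 5 = 3.2

# Lecture's rule of thumb: error of a single guess divided by the crowd size.
single_guess_error = 1
crowd_error = single_guess_error / len(guesses)
print(crowd_error)  # 0.2
```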
The independence assumption doesn't exactly hold, and the review population may not be large enough. So let's explore a few of these challenges next.