Now let's go through two examples on Amazon. The first one will illustrate the point that Bayesian ranking can change the order compared to a simple naive rating. This is data from a day in 2012 on Amazon, with five different kinds of MacBooks. In this table we list the number of ratings, the naive simple average of these ratings, and the resulting rank. Our goal is to compute the Bayesian adjusted ratings and the resulting Bayesian rank. First of all, we see that the total number of reviews, N in this case, adding these up, is 527. And NR, which equals the sum of the n_i r_i, where n_i is each of the five products' number of ratings and r_i is each of the five products' average rating, turns out to be, if we just multiply and then add up the five numbers, 2332.752. So the overall average rating R is about 4.42. If you look at these five MacBooks without computing any Bayesian numbers yet, you can see that one is ranked number one by simple average and another is number two, but each got only 10 or 15 ratings. Now, we could go into the distribution, into the time series analysis, and all that we mentioned at the beginning of the lecture, but the material result we have here is just the Bayesian adjustment. Even with just some Bayesian adjustment, we can already see that maybe they shouldn't be ranked so high: 4.9 is a very high score, but it's based on only 10 ratings, whereas the 4.5-star product's average is based on a much larger population. So maybe that one should be ranked higher? Will that happen? We don't know yet; we'll have to carry out the calculation. We'll also see that these two products' average ratings are actually below the overall average of these five products in the same category, whereas the other three are above it. The implication of that will come back in a couple of minutes. So let's compute, for example, r̂_1, the Bayesian adjusted average rating for product 1. The numerator would be 2332.752, which is NR, plus 10 times 4.92, which is n_1 times r_1, where n_1 = 10 is product 1's number of ratings and r_1 = 4.92 is its average rating.
This is divided by N, the total number of reviews, which is 527, plus n_1, the number of reviews for just this product, which is 10. Clearly, 10 times 4.92 is one of the five components of the NR term, and 10 is one of the five components of N. The resulting number is 4.436, which is much lower than 4.92, because relative to the total number of ratings, 10 is a small number. So the Bayesian adjusted average is 4.436. Applying the same formula to the other four products, we get four more numbers, and you can compare before and after the Bayesian adjustment. One product dropped by almost half a point, another also drifted downwards, a third drifted downwards a little, while the remaining two actually drifted upwards a little. Now the Bayesian ranking turns out to be: one MacBook is number one, another is number two, another is number three, and the last two are basically tied, with a very small difference, at four and five. If you compare this ranking to the naive ranking, you see that the position is different for every single product: one becomes two, two becomes three, three becomes one, four becomes five, five becomes four. In this particular example, which is a little extreme, the Bayesian ranking completely changed the ranking order; not a single product out of five remains in the same position. But it makes a lot of intuitive sense. In particular, the Bayesian scores are a lot closer to each other than the naive ones, because a large number of reviews helps hold a score up, while a small number of reviews pulls a high score down toward the overall mean. We also see that 4.5 stars with a lot of ratings is the best combination, landing at number one. We also see that because one product's average rating is so high, even though its number of ratings is small, it still retains a relatively high position, number two. And another is number three.
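The adjustment just carried out by hand can be sketched in a few lines of code. A minimal sketch: the values N = 527, NR = 2332.752, n_1 = 10, and r_1 = 4.92 come from the MacBook example above, while the pair of below-average products at the end is hypothetical, invented to illustrate how fewer ratings mean stronger shrinkage toward the overall average R:

```python
# Bayesian adjusted rating: r_hat_i = (NR + n_i * r_i) / (N + n_i),
# where NR is the sum of n_j * r_j over all products in the category.
def bayesian_adjusted(NR, N, n_i, r_i):
    """Shrink product i's raw average r_i toward the overall average R = NR / N."""
    return (NR + n_i * r_i) / (N + n_i)

# Numbers from the five-MacBook example: 527 reviews in total, NR = 2332.752,
# and product 1 has n_1 = 10 ratings averaging r_1 = 4.92.
NR, N = 2332.752, 527
print(round(bayesian_adjusted(NR, N, 10, 4.92), 3))  # 4.436

# Hypothetical pair of products both below the overall average R (about 4.42):
# the one with the *lower* raw average but *fewer* ratings ends up scored
# higher, because fewer ratings means stronger shrinkage toward R.
a = bayesian_adjusted(NR, N, 20, 4.30)  # more ratings, higher raw average
b = bayesian_adjusted(NR, N, 10, 4.25)  # fewer ratings, lower raw average
print(b > a)  # True
```

The second comparison previews the "paradox" discussed next: for below-average products, a smaller review count is actually an advantage under this adjustment.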
The last thing we want to observe here is this: how come one product, which has both a slightly higher average rating and a slightly larger number of ratings than another, actually ends up ranked worse? How could that have happened? The main reason is that both of their average ratings are below the big R, 4.42. For those kinds of products, if you have fewer ratings, then we trust the overall average more than your own average; that's why the product with fewer ratings gets pulled closer to the overall average of 4.42. So there are a lot of interesting messages just out of this simple example. Now let's move on to a larger example. Here our goal is to reverse engineer what Amazon does. Again, Amazon shows you the average number of stars, say 4.5 stars, and the number of ratings, and you can try to make sense of this yourself. All of us implicitly compare these in our minds. On some scale, we know that for a certain product, 121 ratings is a lot, and therefore trustworthy; and for another product, 121 is not enough. So we have some implicit scale that we can use. Now, Amazon also gives you not just a rating but also a ranking of, say, similar products; let's say LCD HDTVs, 32 to 34 inches. If you went there, as we did a few weeks ago in 2012, you would see that it gives you a ranking of, say, the top 20 TV products. And you can actually rank that top 20 by several different criteria. One criterion is the so-called average customer review. You would think that this order should therefore exactly follow the average number of stars, but actually it does not; we will see an example in a minute. So Amazon must have done something else. What is this something else? That's the secret formula that Amazon does not reveal to the external world.
But there must be some formula behind it, and we believe that, one, it must have taken into account the number of reviews; in other words, a Bayesian adjustment has been done. Two, the recency of the reviews also matters. And third, the reputation score, or in general, the quality of the reviews. One aspect of review quality is the reputation score of the reviewer. In fact, Amazon also keeps another ranking: a ranking of people, a ranking of reviewers. Some people actually try very hard to get into the hall of fame and be included in the list of top reviewers. In fact, there is a voting formula; it has changed about three times over the last 15 years, and there are interesting blogs just talking about the voting formula hidden behind the ranking of reviewers. But anyway, the reputation score of the reviewer is one thing. The quality of the review text, based on natural language processing of, say, keywords, is another; a review of a review is yet another; and even the timing of the reviews, with time series machine learning, will also help. So all of these collectively point towards not just the quantity but also the quality of the reviews. Quantity, quality, and timing of the reviews: these three factors must have been factored in. But since we don't know the exact formula, we're going to try to reverse engineer it. This is of course assuming that there are no other factors, such as trying to finish selling something that is stocked up and not selling well by artificially boosting its position. There are insider rumors spreading around that say that has never been practiced at Amazon, so we have some level of confidence that there is a more objective formula. All right, so our goal in the next five minutes is to try to discover that. And this is our example: it's actually LCD HDTVs, 32 to 34 inches. We did not write down the actual manufacturers of the TVs, because that doesn't concern us here; we just write down the list of 20.
From 1 up to 20: the top 20 ranked by Amazon according to customer review. But if you look at the customer reviews, which are these numbers, you see that they are not in descending order. Clearly there's something else going on, even though the ranking is, as claimed, according to the customer review; for example, the number of reviews, via a Bayesian adjustment. We write the number of reviews in red here. You can see some products only got seven or eight reviews; some got 249 reviews. If you look at this table, compared to the last example with MacBooks, it makes some sense: for example, whatever has a high naive average number of stars and a reasonable number of reviews is ranked high. But there are also quite a few outliers, including those circled in black. Why is one product ranked number 6 if its average is only 4.2? Why are three other products ranked number 12, 15, and 20, even though their numbers of reviews are actually very similar? One of them you may understand, because it's got a small number of reviews and a low rating score. But 12, 15, and 20 have almost the same number of reviews, yet they're spread very widely in the final ranking order. Also, why is number 14 ranked so low? It's got a reasonable number of stars and a very large review population; shouldn't it actually be higher than position 14? So there are quite a few outliers and mysteries. The first one we want to resolve is with the Bayesian analysis. So, the formula: (NR + n_i r_i) / (N + n_i). Now, we know the n_i's and the r_i's, and we know the overall average R. Of course, we also know the total number of reviews, but as we mentioned in the last segment of the video, the N in this formula is sometimes N_min, the smallest number of reviews; sometimes N_max, which is the largest number of reviews; and sometimes N_average, the average number of reviews received by a product out of these 20 TVs. So we don't know whether we should use N, the total number, or N_min, which is, what, 7 or 8, or N_max, which is 315.
Or N_average, which is 99. And the total is 1986. Which value shall we use as the N in the Bayesian adjustment formula? Well, let's try all of them. We did try all of them and looked at the resulting rankings. It turns out that both N_max and N_average get very close to the actual ranking by Amazon. So we can run the Bayesian adjustment in four different ways, according to what we use for N in this expression, and either of those two could have been the formula. That makes sense, because the other two candidates, the total and the minimum, are perhaps too extreme. So something between the average number of reviews per product and the maximum number of reviews per product, roughly between 100 and 300, is used in the Bayesian adjustment. But we also have to look at the quality of reviews, and that helps explain quite a few of the mysteries. For example, product 12: its most helpful review got 26 people finding it useful, which is a lot more than for products 15 or 20, and that's why 12 is ranked quite high. Or product number 6, ranked number 6 by Amazon: its most helpful review has 139 out of 144 people saying that it's helpful. So the binary count of thumbs up and down is very favorable to the review, and people may trust that review much more. This helps explain the position of number 6: even though its score is only 4.2, it's positioned quite high. Now let's look at number 14. Why is it ranked so low? Well, it turns out that if you look at its reviews, first of all, the most helpful review was from 2010, which is a little outdated. And there are 8 reviews that give it only 1 star, so the distribution has a very negative component, and the review text says that the TVs actually stopped working, that they don't work after a few months. For an electronic product, that's a major, major complaint.
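Trying the four candidate values of N can be mechanized. In the sketch below, only the candidate weights (N_min = 7, N_avg = 99, N_max = 315, N_total = 1986) come from the TV example; the single product's numbers are hypothetical. It illustrates the general behavior: the larger the N plugged into the formula, the more strongly each product's score is shrunk toward the overall average R:

```python
def bayesian_adjusted(N, R, n_i, r_i):
    # Here N acts as the "weight" of the prior: how many reviews the
    # overall average R counts for in the adjustment.
    return (N * R + n_i * r_i) / (N + n_i)

# Candidate weights from the 20-TV example.
candidates = {"N_min": 7, "N_avg": 99, "N_max": 315, "N_total": 1986}

# A hypothetical TV: 50 reviews averaging 4.8 stars; overall average R = 4.3.
R, n_i, r_i = 4.3, 50, 4.8
scores = {name: bayesian_adjusted(N, R, n_i, r_i) for name, N in candidates.items()}

# Larger weight N => stronger shrinkage of the 4.8 raw average toward R.
for name in sorted(scores, key=candidates.get):
    print(f"{name:8s} -> {scores[name]:.3f}")
```

With N_min the score stays near the raw 4.8, while with N_total it collapses almost all the way to 4.3; the intermediate choices N_avg and N_max, which reportedly reproduce Amazon's order best, sit in between.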
So by looking at the quality of the reviews in different ways, we see why 12 and 6 are ranked where they are, and why 14 is ranked where it is. And then there is the timing, the recency, of the reviews. We notice that, for example, product 15 has a review from December 2011, which is very close to the date we pulled this data off Amazon; that's very recent. Whereas for product 14, as we just mentioned, even though it's got a large number of reviews and high scores, the most helpful review was still from 2010, which for an electronic product may be one generation earlier. So this whole set of explanations helps account for at least the most mysterious outliers. In summary, the key factors we believe lie behind Amazon's ranking include, of course, the naive average review rating score, but also the following. It includes Bayesian ranking, probably using the formula where N is picked somewhere around the average. It includes a penalty on too few or too outdated reviews, where, depending on the product category, one year could already be too outdated. And it helps a lot to have very high quality reviews: many people rating the reviews as helpful, or high-reputation reviewers reviewing the product. And if there are major issues, a one-star rating with a consistent pattern of evidence backing it up will push the ranking down a lot. Now, this is not complete science yet; it's only based on one example. But it does illustrate some of the factors that go into the actual ranking on Amazon, and it partially helps answer our motivating question: when can you trust an average rating on Amazon? The simplest answer, perhaps, is that you have to look at the number of ratings too. This theme of average ratings, scalarizing a vector, and their ranks will carry us into the next lecture, on voting. There are still a lot of things that can be done with signal processing and statistical tools for rating reliability on Amazon and other online venues.
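The exact weights Amazon uses are unknown, so purely as a hypothetical illustration of combining the three factors named above (a Bayesian-adjusted score, a recency penalty, and a helpfulness boost), one could imagine a composite like the following; every coefficient here is made up:

```python
def composite_score(bayes_score, years_since_top_review, helpful_frac):
    """Hypothetical composite: Bayesian score, penalized when the most
    helpful review is stale, boosted by its helpfulness fraction.
    All weights are illustrative guesses, not Amazon's formula."""
    recency_penalty = 0.1 * max(0.0, years_since_top_review - 1.0)  # >1 yr counts as outdated
    helpfulness_bonus = 0.2 * helpful_frac                          # e.g. 139/144 thumbs up
    return bayes_score - recency_penalty + helpfulness_bonus

# Two products with the same Bayesian score: one with a two-year-old,
# lukewarm top review vs. one with a fresh, well-liked top review.
stale = composite_score(4.4, 2.0, 0.5)
fresh = composite_score(4.4, 0.2, 139 / 144)
print(fresh > stale)  # True
```

The point is only structural: once quantity, quality, and timing each contribute a term, products with identical star averages can land in very different positions, which is exactly what the TV example showed.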
One key factor that we brought up is this factor of N. It is one of the two gains; it's called the multiplexing gain in the wisdom of crowds. "Wisdom of crowds" is just a fuzzy English phrase, and one particular quantification of it, in unambiguous language, is this factor-of-N multiplexing gain. What we saw in the example is that it follows from independent and unbiased individual inputs. They can be wrong, but as long as they're wrong in independent ways, and for a large enough sample size, we'll see this factor-of-N gain. We'll later see other types of wisdom of crowds.
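The factor-of-N gain can be made concrete with a small simulation: averaging N independent, unbiased ratings cuts the variance of the estimate by roughly a factor of N. The Gaussian noise model below is an assumption for illustration, not something specified in the lecture:

```python
import random

random.seed(0)

def variance_of_mean(n_raters, true_score=4.0, noise=1.0, trials=2000):
    """Empirical variance of the average of n independent, unbiased ratings."""
    means = []
    for _ in range(trials):
        ratings = [true_score + random.gauss(0, noise) for _ in range(n_raters)]
        means.append(sum(ratings) / n_raters)
    mu = sum(means) / trials
    return sum((m - mu) ** 2 for m in means) / trials

v1, v100 = variance_of_mean(1), variance_of_mean(100)
print(round(v1 / v100))  # roughly 100: a factor-of-N variance reduction
```

Each individual rating is "wrong" by about one star, but because the errors are independent and unbiased, averaging 100 of them shrinks the variance about 100-fold. This is the unambiguous content behind the multiplexing-gain phrase.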