In this lecture, we're going to show how we can adapt the code developed in the previous lecture for similarity-based recommendation to make a model that can actually predict ratings, again based on a similarity function, and we'll provide code examples to do so. Okay. So what we saw in the previous lecture was code to make item-to-item recommendations based on the Jaccard similarity: which items are similar to a given query item? The question in this lecture is, how can we use the same ideas to make an effective system for rating prediction rather than item recommendation? Given a user and an item, can we predict what rating that user would give to that item?

So here's a simple heuristic for how rating prediction based on a similarity-based recommender might work. Okay. It's going to work as follows. We're going to estimate a user u's rating for an item i as a weighted combination of all of that user's ratings of previous items. So we have some vector that says how the user rated every single other item j that they have rated. But those ratings will be weighted based on how similar those items are to the given query: items that are more similar to i will be given a higher weight than items that are less similar to i. It's really saying, how did you rate things that are similar to the query item, and we're going to use that to predict the rating for the query.

Okay. So we can rewrite this as an equation, which might make it a little bit clearer. Given a particular user u and an item i for which we'd like to compute, or estimate, the rating, that's the function r(u, i). To do this, we're going to sum over all items j that the user previously consumed. So we're summing over all items in the item set for u, other than the query item i. What do we do inside that summation?
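Spelled out as a formula (with Sim standing for whichever similarity function we choose, and Z a normalizing constant explained in a moment), the heuristic looks like this:

```latex
r(u, i) \simeq \frac{1}{Z} \sum_{j \in I_u \setminus \{i\}} r_{u,j} \cdot \mathrm{Sim}(i, j),
\qquad
Z = \sum_{j \in I_u \setminus \{i\}} \mathrm{Sim}(i, j)
```

Here I_u is the set of items user u has consumed, and r_{u,j} is the rating u gave to item j.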
Well, we take the user's rating for each of those items, that's r_{u,j}, and we weight it by the similarity of j with the query item; that could be the Jaccard similarity or any other similarity function you like. This is really a weighted sum of all the previous ratings the user gave to other items, where the weighting function is just the similarity. Okay. That gives us a weighted sum, but we don't really want that; we want a weighted average in order to estimate the rating. The final part of this equation is the normalization constant Z, which is just the sum of all of the weights, or the sum of all the similarities. Dividing by Z gives us a weighted average rather than a weighted sum.

Okay. So finally, let's see how to implement that using code, adapting our previous recommendation code to predict ratings, in other words. Mostly, we have the same data structures, but we now need to actually store the individual ratings, not just the user IDs and the item IDs. To do that, we're going to add two utility data structures, one which stores all of the reviews for a given user, and one which stores all of the reviews for a given item: a list for each user and a list for each item. So we iterate through our dataset, extract the user and item IDs, and append the corresponding reviews to the user and item lists. Okay. We'll also compute the rating mean, which is just for the sake of a baseline comparison, to see whether this predictive algorithm does better, in terms of the mean squared error, than just predicting the mean all the time. Okay. So our rating prediction code is going to work as follows. We want to implement a function that predicts the rating given a particular user and a particular item: what rating will that user give to that item?
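A minimal sketch of those data structures might look like the following. The dataset and its field names (`user_id`, `item_id`, `rating`) are assumptions here, standing in for whatever corpus the lecture uses:

```python
from collections import defaultdict

# Hypothetical toy dataset; the field names are assumptions
dataset = [
    {'user_id': 'u1', 'item_id': 'i1', 'rating': 4.0},
    {'user_id': 'u1', 'item_id': 'i2', 'rating': 5.0},
    {'user_id': 'u2', 'item_id': 'i1', 'rating': 3.0},
]

reviewsPerUser = defaultdict(list)  # every review written by each user
reviewsPerItem = defaultdict(list)  # every review received by each item

# Iterate through the dataset and append each review to both lists
for d in dataset:
    reviewsPerUser[d['user_id']].append(d)
    reviewsPerItem[d['item_id']].append(d)

# Global mean rating, used as a trivial baseline predictor
ratingMean = sum(d['rating'] for d in dataset) / len(dataset)
```

Storing the whole review dictionary in both structures, rather than just the rating, keeps the item ID and rating together, which the prediction function below will need.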
What we're going to do is build two lists: one is the list of ratings that the user has given to previous items, and the other is the list of Jaccard similarities we'll use to compute a weighted average of those ratings. Other than that, we're implementing exactly the function described in the previous slide. So we iterate through all of the items that the user has consumed. If an item is equal to the query item, we skip it and continue; otherwise, we take the star rating that the user gave to that previous item, that goes in our list of ratings, and we take the Jaccard similarity, that goes in our list of similarities that we'll use to weight the ratings. Okay. Now we've computed those two lists, the ratings and the similarities. In the typical case, where the similarities are not all zero, we'll compute the weighted average. So all we're doing is running a list comprehension that says, what is the rating times the Jaccard similarity for each of the corresponding elements in those two lists? Then we divide that sum by the sum of the similarities, the normalization constant Z, to get the weighted average. Okay. There is one other case: if the user has not rated any similar items, then this weighted average won't make any sense, so we'll just return the rating mean in the default case.

Okay. So let's show an example of this code actually working. We'll take some element of our dataset, extract the user and item ID, run this rating prediction function, and see what we get out of it. We get some reasonable-seeming number. To be a bit more rigorous, let's actually try to evaluate the accuracy of this rating prediction system across our entire corpus. Let's write down a function for the mean squared error. It's a pretty familiar function by now: we just take all of our squared differences and return their average.
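A self-contained sketch of that prediction function might look like this. The toy dataset and its field names are assumptions, not the lecture's actual corpus:

```python
from collections import defaultdict

def Jaccard(s1, s2):
    # Jaccard similarity between two sets: |intersection| / |union|
    union = len(s1 | s2)
    return len(s1 & s2) / union if union else 0

# Hypothetical toy dataset; field names are assumptions
dataset = [
    {'user_id': 'A', 'item_id': 'i1', 'rating': 4.0},
    {'user_id': 'A', 'item_id': 'i2', 'rating': 2.0},
    {'user_id': 'B', 'item_id': 'i1', 'rating': 5.0},
    {'user_id': 'B', 'item_id': 'i2', 'rating': 3.0},
    {'user_id': 'B', 'item_id': 'i3', 'rating': 4.0},
]

usersPerItem = defaultdict(set)     # which users rated each item
reviewsPerUser = defaultdict(list)  # all reviews written by each user
for d in dataset:
    usersPerItem[d['item_id']].add(d['user_id'])
    reviewsPerUser[d['user_id']].append(d)

ratingMean = sum(d['rating'] for d in dataset) / len(dataset)

def predictRating(user, item):
    ratings = []       # ratings the user gave to other items
    similarities = []  # similarity of each of those items to the query
    for d in reviewsPerUser[user]:
        j = d['item_id']
        if j == item:
            continue  # skip the query item itself
        ratings.append(d['rating'])
        similarities.append(Jaccard(usersPerItem[item], usersPerItem[j]))
    if sum(similarities) > 0:
        # Weighted average: sum of rating * similarity, divided by Z
        weighted = [r * s for r, s in zip(ratings, similarities)]
        return sum(weighted) / sum(similarities)
    # Default case: no similar items, fall back to the global mean
    return ratingMean
```

On this toy data, `predictRating('A', 'i3')` averages A's other ratings weighted by each item's Jaccard similarity to `i3`, while a query for an unseen user falls back to `ratingMean`.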
The first thing we want to do here is build a function that just always predicts the mean value, which is the baseline we're trying to improve upon. Next, we'd like to build a model that uses the code we previously wrote to make predictions. In both cases, we'd like to compare those predictions to the ground-truth labels. All right. So after that, we just compute two mean squared errors: one for the simple baseline that always predicts the mean, and one for the similarity-based predictions. Okay. Now, this is a simple heuristic to predict ratings. In this case, it actually did worse in terms of the mean squared error than always predicting the mean, so this idea of using a weighted average did not seem to have been a good one on this particular dataset.

So we could try to improve this function in various ways. In this case, we used the Jaccard similarity. Maybe that was a poor choice, or relied upon incorrect assumptions in the case of this particular dataset; maybe we could do better by using the cosine similarity, for example. Or we could do exactly the same thing based on a similarity function that looks at users rather than items. So here we computed a weighted average over all of the ratings that a given user had given to other items; we could instead compute a weighted average based on all users who have rated a particular item. We would just take that function and essentially interchange the roles of users and items. We could also use a different weighting scheme rather than a simple weighted average. So like I said, it's a heuristic, and there are many parts of this function we could change to try to get better performance on a particular dataset. All right. So in this lecture, we showed how to take a similarity-based recommender and convert it into a recommender that can actually predict ratings, and provided an implementation of this idea.
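The evaluation step can be sketched as follows. The labels and the model's predictions here are hypothetical placeholders, standing in for the full corpus and for the output of the similarity-based predictor:

```python
def MSE(predictions, labels):
    # Mean of the squared differences between predictions and labels
    differences = [(p - l) ** 2 for p, l in zip(predictions, labels)]
    return sum(differences) / len(differences)

# Hypothetical ground-truth ratings
labels = [4.0, 5.0, 3.0, 2.0]
ratingMean = sum(labels) / len(labels)

# Baseline: always predict the mean rating
alwaysPredictMean = [ratingMean] * len(labels)

# Placeholder predictions standing in for the similarity-based model's output
simPredictions = [4.5, 4.0, 3.5, 3.0]

baseline_mse = MSE(alwaysPredictMean, labels)
model_mse = MSE(simPredictions, labels)
```

Whichever list of predictions yields the lower MSE against the labels is the better predictor on that dataset; in the lecture's experiment, the baseline won.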
So on your own, I would try to adapt the code to implement some of those modifications I recommended, say changing the similarity function to a cosine similarity, and see whether you can actually improve the performance of the system by using different implementation details.
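As one example of such a modification, here is a sketch of a set-based cosine similarity that could be swapped in for Jaccard inside the prediction function; this is an illustration under the same set-based representation, not the lecture's exact code:

```python
import math

def Jaccard(s1, s2):
    # |intersection| / |union|
    union = len(s1 | s2)
    return len(s1 & s2) / union if union else 0

def Cosine(s1, s2):
    # For binary set representations, cosine similarity reduces to
    # |intersection| / sqrt(|s1| * |s2|)
    denom = math.sqrt(len(s1) * len(s2))
    return len(s1 & s2) / denom if denom else 0
```

Since both functions take the same two sets as input, replacing one with the other inside `predictRating` is a one-line change, which makes it easy to compare their MSE on the same dataset.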