[SOUND] And here we're going to talk about the basic strategy. And that would be based on the similarity of users: predicting the rating of an object by an active user using the ratings of users similar to this active user. This is called a memory-based approach because it's a little bit like storing all the user information, and when we are considering a particular user, we're going to try to retrieve the users similar to this user, and then try to use the information about those similar users to predict the preference of this user.

So here is the general idea, and we use some notation here: x sub i j denotes the rating of object o j by user u i, and n bar sub i is the average rating of objects by this user. This n bar i is needed because we would like to normalize the ratings of objects by this user. So how do we do normalization? Well, we're going to just subtract the average rating from all the ratings. This normalizes the ratings so that the ratings from different users are comparable, because some users might be more generous and generally give higher ratings, while others might be more critical, so their ratings cannot be directly compared with each other or aggregated together. So we need to do this normalization.

Now, the prediction of the rating on an item by the active user, u sub a here, can be based on the average ratings of similar users. So the user u sub a is the user that we are interested in recommending items to, and we are now interested in recommending this o sub j. So we're interested in knowing how likely this user is to like this object. How do we know that? Well, the idea here is to look at whether users similar to this user have liked this object. Mathematically, this is to say the predicted rating of this user on this object, user u a on object o j, is basically a combination of the normalized ratings of the different users. In fact, here we're taking a sum over all the users, but not all users contribute equally to the sum, and this is controlled by the weights. So this weight controls the influence of a user on the prediction, and of course, naturally, this weight should be related to the similarity between u a and this particular user u i. The more similar they are, the more contribution user u i can make in predicting the preference of u a.

So the formula is extremely simple. You can see it's a sum over all the possible users, and inside the sum we have their ratings, well, their normalized ratings, as I just explained; the ratings need to be normalized in order to be comparable with each other. And then these ratings are weighted by their similarity, so you can imagine w of a and i is just the similarity of user a and user i. Now, what's k here? Well, k is simply a normalizer: it's just one over the sum of all the weights over all the users. So this means, basically, if you consider the weight here together with k, we have coefficients on the ratings that sum to one over all the users. It's just a normalization strategy so that you get the predicted rating in the same range as the ratings that we used to make the prediction. Right? So this is basically the main idea of memory-based approaches for collaborative filtering. Once we make this prediction, we also would like to map it back to the rating that the user would actually give, and this is done by further adding the mean rating, or average rating, of this user u sub a to the predicted value.
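To make this concrete, here is a minimal sketch of this prediction formula in Python. The data layout (a dict mapping (user, object) pairs to ratings), the function names, and the sim function standing in for w are all illustrative assumptions on my part, not something specified in the lecture.

```python
# Minimal sketch of memory-based collaborative filtering prediction.
# Assumptions: ratings is a dict mapping (user, object) -> rating, and
# sim(a, i) is some user-similarity function playing the role of w(a, i).

def mean_rating(ratings, user):
    """Average rating n_bar for one user, over the items they rated."""
    vals = [r for (u, _), r in ratings.items() if u == user]
    return sum(vals) / len(vals)

def predict(ratings, users, sim, a, oj):
    """Predict user a's rating of object oj:
       x_hat(a, j) = n_bar_a + k * sum_i w(a, i) * (x_ij - n_bar_i),
       where k = 1 / sum_i |w(a, i)| is the normalizer."""
    num, denom = 0.0, 0.0
    for i in users:
        if i == a or (i, oj) not in ratings:
            continue                     # only users who rated oj contribute
        w = sim(a, i)                    # weight = similarity of u_a and u_i
        num += w * (ratings[(i, oj)] - mean_rating(ratings, i))
        denom += abs(w)                  # accumulates the normalizer 1/k
    if denom == 0:
        return mean_rating(ratings, a)   # no evidence: fall back to user mean
    return mean_rating(ratings, a) + num / denom
```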
This would recover a meaningful rating for this user. So if this user is generous, then the average would be somewhat high, and when we add it back, the prediction is adjusted to a relatively high rating. Now, when you recommend an item to a user, this step doesn't really matter, because you are interested in basically the normalized rating; that's more meaningful. But when we evaluate these collaborative filtering approaches, we typically assume the actual ratings of the user on these objects to be unknown, and then we do the prediction and compare the predicted ratings with the actual ratings. So you do have access to the actual ratings, but then you pretend that you don't know them, and then you compare your system's predictions with the actual ratings. In that case, obviously, the system's predictions should be adjusted to match the scale of the actual ratings of the user, and this is what's happening here, basically.

Okay, so this is the memory-based approach. Now, of course, if you look at the formula and you want to write a program to implement it, you still face the problem of determining what this w function is. Once you know the w function, the formula is very easy to implement. So, indeed, there are many different ways to compute this function, or this weight w, and specific approaches generally differ in how it is computed. So here are some possibilities, and you can imagine there are many other possibilities. One popular approach is to use the Pearson correlation coefficient. This would be a sum over the commonly rated items, and the formula is the standard Pearson correlation coefficient formula. This basically measures whether the two users tend to both give higher ratings to the same items and lower ratings to the same items. Another measure is the cosine measure, which treats the rating vectors as vectors in a vector space, and then we measure the angle and compute the cosine of the angle between the two vectors. This measure has been used in the vector space model for retrieval as well. As you can imagine, there are many different ways of doing this; two of them are sketched below.

In all these cases, note that the user similarity is based on their preferences on items, and we did not actually use any content information about these items. It doesn't matter what these items are: they can be movies, they can be books, they can be products, they can be text documents, even though in that last case we could also have used the content. So this allows such an approach to be applied to a wide range of problems. Now, in some newer approaches, of course, we would like to use more information about the user; clearly, we know more about the user than just these preferences on items. So in an actual filtering system, we could combine collaborative filtering with content-based filtering, or we could use more context information, and those are all interesting approaches that people are still exploring, with new approaches being proposed. But this memory-based approach has been shown to work reasonably well, and it's easy to implement; in practical applications this could be a starting point to see if the strategy works well for your application.

So, there are some obvious ways to also improve this approach, and mainly we would like to improve the user similarity measure. And there are some practical issues to deal with here as well. For example, there will be a lot of missing values. What do you do with them? Well, you can set them to default values or to the average ratings of the user.
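As an illustration, here are minimal sketches of these two choices of w, using the same (user, object) -> rating dict as before. Restricting the cosine to commonly rated items is a design choice and an assumption on my part; treating unrated items as zeros is another common option.

```python
import math

def common_items(ratings, a, i):
    """Objects rated by both user a and user i."""
    return [o for (u, o) in ratings if u == a and (i, o) in ratings]

def pearson(ratings, a, i):
    """Pearson correlation over commonly rated items: whether the two users
       deviate from their own means in the same direction on the same items."""
    items = common_items(ratings, a, i)
    if not items:
        return 0.0
    na = sum(ratings[(a, o)] for o in items) / len(items)
    ni = sum(ratings[(i, o)] for o in items) / len(items)
    num = sum((ratings[(a, o)] - na) * (ratings[(i, o)] - ni) for o in items)
    da = math.sqrt(sum((ratings[(a, o)] - na) ** 2 for o in items))
    di = math.sqrt(sum((ratings[(i, o)] - ni) ** 2 for o in items))
    return num / (da * di) if da and di else 0.0

def cosine(ratings, a, i):
    """Cosine of the angle between the two users' rating vectors."""
    items = common_items(ratings, a, i)
    num = sum(ratings[(a, o)] * ratings[(i, o)] for o in items)
    da = math.sqrt(sum(ratings[(a, o)] ** 2 for o in items))
    di = math.sqrt(sum(ratings[(i, o)] ** 2 for o in items))
    return num / (da * di) if da and di else 0.0
```

Either function can be passed directly as the sim argument of the predict sketch above.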
And that would be a simple solution. But there are more advanced approaches that actually try to predict those missing values, and then use the predicted values to improve the similarity. In fact, the memory-based approach itself can predict those missing values, right? So you get an iterative approach, where you first make some preliminary predictions, and then you can use the predicted values to further improve the similarity function. So this is a heuristic way to solve the problem, and the strategy obviously affects the performance of collaborative filtering, just like any other heuristic for improving these similarity functions.

Another idea, which is actually very similar to the idea of IDF that we have seen in text search, is called Inverse User Frequency, or IUF. Now, here the idea is to look at where the two users share similar ratings. If the item is a popular item that has been viewed by many people, then the fact that two users are both interested in this item may not be so informative. But if it's a rare item that has not been viewed by many users, and these two users both rated this item and gave similar ratings, that says more about their similarity. It's kind of a way to put more emphasis on similarity on items that are not viewed by many users.
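The lecture doesn't give a formula for IUF, but by analogy with IDF a natural sketch is the following; the log form and the way it is folded into the cosine measure are my own assumptions, not a formulation from the lecture.

```python
import math

def iuf(ratings, n_users, obj):
    """Inverse User Frequency for one object, analogous to IDF:
       log(total number of users / number of users who rated the object).
       Rare items get large weights; universally rated items get ~0."""
    raters = sum(1 for (u, o) in ratings if o == obj)
    return math.log(n_users / raters) if raters else 0.0

def iuf_cosine(ratings, n_users, a, i):
    """Cosine similarity with each commonly rated item scaled by its IUF,
       so agreement on rarely rated items counts more toward w(a, i)."""
    items = [o for (u, o) in ratings if u == a and (i, o) in ratings]
    num = sum(iuf(ratings, n_users, o) ** 2 * ratings[(a, o)] * ratings[(i, o)]
              for o in items)
    da = math.sqrt(sum((iuf(ratings, n_users, o) * ratings[(a, o)]) ** 2
                       for o in items))
    di = math.sqrt(sum((iuf(ratings, n_users, o) * ratings[(i, o)]) ** 2
                       for o in items))
    return num / (da * di) if da and di else 0.0
```

[MUSIC]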