The purpose of this lecture is to explain the difference between recommender systems and other types of machine learning. We've seen things like regression and classification in previous lectures and previous parts of the specialization. Really we're just trying to make clear how those types of machine learning are fundamentally different from what recommender systems are doing. And finally we'll introduce two common types of recommender systems that we'll develop throughout the rest of this course. Okay, so just to give a motivating example, imagine we wanted to build something like a movie recommendation system that answers the question: which of these movies would I rate the highest? To do that, let's estimate some function that says what rating I would give to each movie. Okay, so this is what we'd like: a function that takes some features describing the user and some features describing the movie and tries to estimate a star rating. This seems very similar to a regression problem, and we might think we already have some tools in our supervised learning toolbox that would help us solve it using techniques we already know. Okay, so let's imagine trying to do this using something like regression, since it looks like a regression problem. Well, we would first extract features about the movie. That could be things like which actors are in the movie, which could be a one-hot encoding. What's its MPAA rating? In other words, is it PG, is it G, etc.? What is the length of the movie? What is the budget of the movie? And so on and so forth. Similarly, we could extract a bunch of user features: what's the user's age, what's the user's gender, what's the user's location, etc. All of these features could somehow be correlated with the way people rate movies, so it seems reasonable to try and build a regression model that uses them to estimate the star rating.
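To make the feature extraction step concrete, here is a minimal sketch of how the user and movie features might be turned into a single regression input. The vocabularies, the function names, and the example values are all hypothetical; the lecture does not specify any of them. Because a movie can have several actors, the actor encoding below is a binary indicator vector (one-hot style) rather than a strict one-hot.

```python
# Hypothetical vocabularies, chosen only for illustration.
ACTORS = ["actor_a", "actor_b", "actor_c"]
MPAA_RATINGS = ["G", "PG", "PG-13", "R"]
GENDERS = ["female", "male", "other"]


def one_hot(value, vocabulary):
    """Return a vector with a 1.0 in the position of `value`."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def multi_hot(values, vocabulary):
    """Return a binary indicator vector for a set of values."""
    return [1.0 if v in values else 0.0 for v in vocabulary]


def movie_features(actors, mpaa, length_minutes, budget_millions):
    """Actors, MPAA rating, length and budget, as in the lecture."""
    return (
        multi_hot(actors, ACTORS)
        + one_hot(mpaa, MPAA_RATINGS)
        + [length_minutes, budget_millions]
    )


def user_features(age, gender):
    """Age and gender; location is omitted to keep the sketch short."""
    return [age] + one_hot(gender, GENDERS)


def regression_input(user, movie):
    """Concatenate a constant offset, user features and movie features."""
    return [1.0] + user + movie


movie = movie_features({"actor_a", "actor_c"}, "PG", 120.0, 50.0)
user = user_features(30.0, "female")
x = regression_input(user, movie)
```

A regression model would then be fit on many such vectors `x`, each paired with an observed star rating.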
Okay, so if we built a predictor of this form, we could account for all kinds of things, like do women give higher ratings than men? Do Americans give higher ratings than Australians? Do people give higher ratings to action movies? Do they give higher ratings in summer or winter? Do they just like movies with a particular actor in them? And so on and so forth. All of this seems perfectly reasonable, and it's something we can do with the kind of regression model we've developed so far. The question is what's really missing? What can't we do yet with this type of model? Okay, this is maybe kind of a subtle point. What we've really been developing here is a linear predictor, just like in course one, where we would be extracting a bunch of user features and a bunch of movie features. We'd be concatenating them together. That's the second equation from the bottom there. And then multiplying them by some parameter vector, theta. Okay, this is a linear model, and by the definition of a linear model, we can rewrite that equation just like this equation on the bottom. So really it's like having one model that uses the user features, plus one model that uses the movie features. We have one component of theta which corresponds to the user parameters, and one component of theta which corresponds to the movie parameters. Okay, so we've essentially taken our model and just rewritten it in two parts. And just to summarize that here again, we have a user predictor and a movie predictor. Really, this is saying the problem breaks down into two separate predictors, which is a strange thing, because what it basically says is that the user features and the movie features contribute to the rating independently of each other. Which, again, is maybe kind of subtle, but that's not what we really want. So if we think about what's happened, we've broken down our function that takes user features and movie features into two functions, one describing the user and one describing the movie.
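The decomposition described above can be written out as follows. The slide equations are not reproduced in this transcript, so the notation here is an assumption: $\phi_{\text{user}}(u)$ and $\phi_{\text{movie}}(m)$ denote the user and movie feature vectors, and $\theta$ is split into the matching blocks $\theta_{\text{user}}$ and $\theta_{\text{movie}}$.

```latex
\begin{align*}
f(u, m)
  &= \theta \cdot
     \begin{bmatrix} \phi_{\text{user}}(u) \\ \phi_{\text{movie}}(m) \end{bmatrix},
  \qquad
  \theta = \begin{bmatrix} \theta_{\text{user}} \\ \theta_{\text{movie}} \end{bmatrix} \\
  &= \underbrace{\theta_{\text{user}} \cdot \phi_{\text{user}}(u)}_{\text{user predictor}}
   + \underbrace{\theta_{\text{movie}} \cdot \phi_{\text{movie}}(m)}_{\text{movie predictor}}
\end{align*}
```

Written this way, no term depends on both $u$ and $m$ at once, which is the independence the lecture is pointing at.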
So somehow the function of the user is estimating, am I the type of user who tends to give high ratings or tends to give low ratings? And somehow the function of the movie is saying, is this the type of movie that people just enjoy? Does it have the right budget? Does it have the right length? Does it belong to the right genre? Okay, what is totally lacking, though, is any feature which says, do I tend to give high ratings to this genre of movie? So recommender systems go beyond this a little bit by trying to model relationships between people and the items they're evaluating. One example of how a recommender system might work is by estimating things like a user's preference towards specific aspects of a movie. Like what's my preference towards special effects? Or how much do I like action? And correspondingly, we can estimate the properties of different movies, like how much does this movie exhibit good special effects? And how much does this movie exhibit a lot of action? And then we try to estimate the compatibility between those things. This is very different from the type of model I presented previously, where we had one term describing the user and one term describing the movie. What's critical is that in that case, if I tried to recommend a movie to a user by asking which movie they are likely to rate the highest, that movie would be the same for everyone. Whichever movie had the best features, the best length, the best budget, the best MPAA rating, the best actors in it, that movie would be recommended to everyone. There's no possibility of personalization when we break a model into two independent components. So this notion of compatibility is exactly what a real recommender system is trying to capture: the personalized relationship between the user and the movie. That is exactly what's going to allow us to make different recommendations for different users.
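The contrast can be illustrated with a small sketch. All names and numbers below are made up for illustration, and using a dot product between user preferences and movie properties as the compatibility term is one common choice, assumed here rather than taken from the lecture.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical values. Each vector is [special effects, action].
user_prefs = {"alice": [0.9, 0.1], "bob": [0.1, 0.9]}
movie_props = {"effects_film": [1.0, 0.2], "action_film": [0.2, 1.0]}

# Per-user and per-movie terms, standing in for the two separate predictors.
user_term = {"alice": 0.5, "bob": -0.5}
movie_term = {"effects_film": 3.0, "action_film": 3.2}


def separable_score(user, movie):
    """One term for the user plus one term for the movie."""
    return user_term[user] + movie_term[movie]


def compatibility_score(user, movie):
    """Adds a term that depends on the user and the movie together."""
    return separable_score(user, movie) + dot(user_prefs[user], movie_props[movie])


def top_movie(score, user):
    """The movie this user is predicted to rate the highest."""
    return max(movie_props, key=lambda movie: score(user, movie))


for name in user_prefs:
    print(name, top_movie(separable_score, name), top_movie(compatibility_score, name))
```

With these numbers the separable model picks the same movie for both users, because the user term shifts every movie's score by the same amount, while the compatibility model picks a different movie for each.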
So, is this movie compatible with me personally, in spite of the various attributes that just make it good or bad overall? Okay, and later on in this course, we're going to look at two common types of recommender systems. There are two paradigms that people frequently use. One is recommender systems that discover which items are similar to each other, or which users are similar to each other. This is not so much a machine learning based recommender system; rather, it is trying to discover common patterns in people's purchasing or rating behavior. The second thing we'll look at is the machine learning based approach, which is maybe more closely related to other concepts we've seen in the specialization, and which we'd call model-based recommender systems. So the difference between these two paradigms is as follows. Similarity-based recommender systems are somehow trying to measure similarity between items, or similarity between users. In this case, we estimate the similarity between items in terms of the users who have purchased them. So an example of this type of recommender system would be something like "people who bought x also bought y", as you see on Amazon. Somehow they're estimating the similarity between those two items x and y, say that pair of jeans and that shirt, in terms of the users who purchased both items. In contrast, model-based recommender systems use machine learning to estimate specific outcomes. So for example, the Netflix-style rating prediction task that I introduced previously is somehow solving something like a regression problem in order to make a personalized estimate using some form of supervised learning. Okay, so all we've done in this lecture is introduce two classes of recommender systems that we'll study in the future. And we've characterized why recommender systems are different from other forms of supervised learning.
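As a rough illustration of the similarity-based idea, the sketch below compares items by the sets of users who purchased them. The purchase data is invented, and the Jaccard similarity used here is one possible measure chosen for this example; the lecture does not name a specific one.

```python
# Hypothetical purchase history: item -> set of users who bought it.
purchases = {
    "jeans": {"u1", "u2", "u3"},
    "shirt": {"u2", "u3", "u4"},
    "kettle": {"u5"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(item, purchases):
    """The other item whose purchasers overlap most with `item`'s."""
    others = [other for other in purchases if other != item]
    return max(others, key=lambda other: jaccard(purchases[item], purchases[other]))


print(most_similar("jeans", purchases))
```

Here the jeans and the shirt share two of four distinct purchasers, so the shirt is what a "people who bought x also bought y" list for the jeans would surface.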