So let's wrap up our results here and talk a little bit about what we've seen.

First, I'd like to graph the RMSE for the different methods that we've used, and a few others that we didn't.

So here is the RMSE: in blue is the training data, and in red is the test data. The first thing you can see is the simple average predictor, where we predicted 3.5 for all the entries.
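As a reminder of what these bars measure, here's a minimal sketch of the RMSE of that simple average predictor; the small ratings list is a hypothetical stand-in for the training data:

```python
import math

def rmse(predictions, actuals):
    """Root-mean-square error between equal-length lists of ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predictions, actuals))
                     / len(actuals))

# Hypothetical training ratings; the simple average predictor
# guesses 3.5 for every entry.
train_ratings = [4, 3, 5, 2, 4, 3]
predictions = [3.5] * len(train_ratings)
print(round(rmse(predictions, train_ratings), 4))  # prints 0.9574
```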

And this is the baseline predictor; you can see the improvement in both the training and the test data. It's a huge drop once we actually incorporated that.
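A minimal sketch of that baseline predictor, built from row and column averages relative to the overall mean; the tiny matrix here is hypothetical, with rows as movies and columns as users (that orientation is an assumption) and NaN marking missing ratings:

```python
import numpy as np

# Hypothetical ratings matrix; NaN = not yet rated.
R = np.array([[5.0, 4.0, np.nan],
              [3.0, np.nan, 2.0],
              [4.0, 3.0, 4.0]])

mu = np.nanmean(R)                    # overall average rating
row_dev = np.nanmean(R, axis=1) - mu  # per-movie deviation from the mean
col_dev = np.nanmean(R, axis=0) - mu  # per-user deviation from the mean

# Baseline prediction for entry (i, j): mu + row_dev[i] + col_dev[j]
baseline = mu + row_dev[:, None] + col_dev[None, :]
print(np.round(baseline, 2))
```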

And then the neighborhood method with one neighbor, which is what we had done. Again, you see another drop there.
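A rough sketch of that one-neighbor idea, run on a hypothetical residual matrix (ratings minus baseline predictions), using cosine similarity between columns; both the orientation and the numbers are assumptions for illustration:

```python
import numpy as np

# Hypothetical residuals (rating minus baseline); rows = users, cols = movies.
resid = np.array([[ 0.5,  0.4, -0.2],
                  [-0.3, -0.4,  0.6],
                  [ 0.1,  0.2, -0.5]])

def cosine(a, b):
    """Cosine similarity between two residual columns."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Find the single most similar other movie to movie 0.
sims = {j: cosine(resid[:, 0], resid[:, j]) for j in (1, 2)}
neighbor = max(sims, key=sims.get)
print(neighbor)  # prints 1: movie 1 is the closest neighbor here
```

The prediction for a missing entry would then be its baseline value plus the neighbor's residual for that user, weighted by the similarity.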

So those are the three methods we actually did: the simple average, then the baseline, and then the one-neighbor neighborhood method. But there are a few things that we didn't explore.

First, we could have used users as neighbors. I don't have a graph of that here, but it's really the same thing; we just compute similarities along the rows instead of along the columns. Another thing is that we could have used a neighborhood method with multiple neighbors.

And there's no telling in advance how many neighbors is the correct number. One way to decide is to run it with a few different choices and see which one gives you the lowest RMSE on just your training data, because remember, you can't look at your test data until you're all done.
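That selection procedure can be sketched generically; `predict_with_k` and the training triples below are hypothetical stand-ins, not the course's actual data:

```python
import math

def pick_k(candidate_ks, predict_with_k, train_triples):
    """Return the neighbor count with the lowest RMSE on the training data.

    train_triples is a list of (user, movie, rating); the test data is
    never touched here -- it stays locked away until the very end.
    """
    best_k, best_rmse = None, float("inf")
    for k in candidate_ks:
        sq_errs = [(predict_with_k(u, m, k) - r) ** 2
                   for (u, m, r) in train_triples]
        score = math.sqrt(sum(sq_errs) / len(sq_errs))
        if score < best_rmse:
            best_k, best_rmse = k, score
    return best_k, best_rmse

# Toy stand-in predictor: pretend k=1 predicts 3.5 and k=2 predicts 4.0.
train = [(0, 0, 4.0), (0, 1, 3.0)]
predict = lambda u, m, k: 3.5 if k == 1 else 4.0
print(pick_k([1, 2], predict, train))  # prints (1, 0.5)
```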

You can then see which one gives you the lowest RMSE on the training data, but again, there's still no guarantee that it's going to be the lowest on the test data.

This is showing what we actually would have gotten if we had used two neighbors, and the math gets a little more complicated there. You see it's a slight improvement over the neighborhood method with one neighbor, but with the test data it actually jumps up a little bit. That's really interesting.

Now, another thing we said is that we could optimize the baseline values. Remember, we just took the averages along the rows and columns relative to the overall average of 3.5. But we can actually solve an optimization problem to get the baseline error lower, and if we had done that we would have gotten 0.397.

And 0.6311. Those are lower than these values over here that we compare them to.

And then if we had done the neighborhood method with these optimized baseline values, we would have dropped even lower. There you would actually see that the neighborhood method with two neighbors gives you lower values in both cases than the neighborhood method with just one neighbor.
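The lecture doesn't spell out that optimization, but one common formulation is least squares: fit a bias per row and per column over the observed entries, minimizing the squared error of mu + b_row + b_col instead of just plugging in raw averages. A sketch on a hypothetical matrix:

```python
import numpy as np

# Hypothetical ratings matrix; NaN = missing.
R = np.array([[5.0, 4.0, np.nan],
              [3.0, np.nan, 2.0],
              [4.0, 3.0, 4.0]])
mu = np.nanmean(R)

rows, cols = np.where(~np.isnan(R))
n_rows, n_cols = R.shape

# Design matrix: one indicator column per row bias and per column bias.
A = np.zeros((len(rows), n_rows + n_cols))
A[np.arange(len(rows)), rows] = 1.0
A[np.arange(len(rows)), n_rows + cols] = 1.0
y = R[rows, cols] - mu

# Least-squares biases (the system is rank-deficient; lstsq returns the
# minimum-norm solution, which is fine since predictions are unchanged).
b, *_ = np.linalg.lstsq(A, y, rcond=None)
baseline = mu + b[:n_rows, None] + b[None, n_rows:]
print(np.round(baseline, 3))
```

By construction this fit can't have a higher training error than the row-and-column-average baseline, since that baseline is just one particular choice of biases.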

So it's interesting to see how different decisions along the way, like whether or not you're going to optimize, can determine whether things turn out better or worse for you. Again, you often just go by intuition about what you think is going to work, and go with that.

And I just want to emphasize that these are just the key ideas; there is so much more to this. Machine learning and data mining are such vibrant, abundant fields.

There are whole courses specifically on machine learning, for instance, and those are all really interesting. In a future version of this course, maybe we'll go into more depth on some of this stuff, but this just sort of grazes the landscape, so to speak.

But you should now have the key ideas behind what Netflix does to recommend movies to its users: it exploits baseline and neighborhood predictors as parts of its system.