[MUSIC] So let me give you an example that comes from Wheaton and that's used pretty ubiquitously to explain some of the concepts of machine learning. We won't use this example too much, but it is useful for a variety of purposes. Imagine you're trying to predict when someone is going to play golf, or play tennis, or do some other outdoor activity, and we imagine that it's a function of the weather. So we take in inputs like the outlook (whether it's sunny, overcast, or rainy), the temperature (hot, mild, or cool), the humidity, and how windy it is, and we want to learn a function that takes in those inputs and produces a yes-or-no answer for whether or not we play, say, golf.

Okay, so the simplest thing you might do, just intuitively, is try to predict it with a simple rule. You might say, look, maybe we only play when it's sunny. Is that true or not? Well, no, it's not, because we can find examples where we did indeed play when it was overcast. You might say, well, maybe we don't play if it's rainy and windy. That turns out to be true, but there are also other times when we don't play. If a function exists that can predict this output well, it's perhaps not something we can express in just one simple rule, which is roughly the limit of what we can do mentally, just through intuition. So the idea is that we need some principled way of coming up with more complex models when they're needed (there's a small code sketch of this below).

Okay, so some terminology. This problem is an example of a classification problem: the learned attribute is categorical, and in this case it can take two values, zero or one. If the learned attribute is instead a numeric value, for example what our score in golf was as a function of the weather, and you want to try to predict that, that would be more of a regression case. And if not just the learned attribute but all of the attributes are numeric, it becomes more of a classical regression case, although there are ways to handle the mixed situations. We've talked about regression in terms of fitting a curve to data, but when you use that fitted curve to actually make predictions, that's when you can see that it's analogous to the classification problem over categorical data.

So, more terminology: the golf example we just gave is an example of supervised learning, where you're given examples of inputs and the desired outputs, and you're trying to learn the relationship between them, so we train a model to do this. There's also unsupervised learning, where you're not given any labels whatsoever; you're just given the data and you're trying to understand the underlying structure in that data. That covers things like clustering algorithms or dimensionality reduction algorithms, if you've heard those terms before.
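To make the golf example concrete, here is a minimal sketch of it as a supervised classification problem, assuming a pandas/scikit-learn setup with a decision tree as one principled way to learn rules more complex than anything we could eyeball. The handful of example rows are made up for illustration and aren't taken from the lecture.

```python
# Sketch: the play-golf data as a supervised classification problem.
# The rows below are invented just to have something to fit.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

data = pd.DataFrame({
    "outlook":  ["sunny", "sunny", "overcast", "rainy", "rainy",  "overcast"],
    "temp":     ["hot",   "mild",  "hot",      "mild",  "cool",   "cool"],
    "humidity": ["high",  "high",  "high",     "normal", "normal", "normal"],
    "windy":    [False,   True,    False,      False,    True,     True],
    "play":     ["no",    "no",    "yes",      "yes",    "no",     "yes"],
})

# Inputs (one-hot encode the categorical attributes) and the learned attribute.
X = pd.get_dummies(data.drop(columns="play"))
y = data["play"]

# Fit a decision tree: a principled way to go beyond one hand-written rule.
model = DecisionTreeClassifier().fit(X, y)

# Predict for a new day: overcast, cool, normal humidity, windy.
new_day = pd.DataFrame([{"outlook": "overcast", "temp": "cool",
                         "humidity": "normal", "windy": True}])
new_X = pd.get_dummies(new_day).reindex(columns=X.columns, fill_value=0)
print(model.predict(new_X))  # a yes/no answer for whether we play
```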
So let me give you an example that can be cast as either supervised learning or unsupervised learning, but is instructive as an unsupervised learning case for right now. Imagine you're just given a big corpus of documents, and you want to predict what topic a given document pertains to. If you see the phrase "the Falcons trounced the Saints on Sunday," you might guess that the document is about sports, while if you see that "the Mars Rover discovered organic molecules on Sunday," you might guess that the document is about science. So how do you set this problem up as a machine learning problem? What are the rows and columns?

We just had an example where there were, in effect, rows and columns, and we said that for one record we're trying to predict the output given the inputs. What are the rows and columns here? Well, this shouldn't be too bad, since you've seen this before in a couple of assignments now. >> [MUSIC]
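For reference, one common way to set this up (and presumably the kind of thing the assignments used) is a document-term matrix: each row is a document, and each column counts how often a vocabulary word appears in it. Here is a minimal sketch assuming scikit-learn's CountVectorizer, using the two example sentences from above.

```python
# Sketch: rows are documents, columns are word counts (bag of words).
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "the Falcons trounced the Saints on Sunday",
    "the Mars Rover discovered organic molecules on Sunday",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)          # shape: (n_documents, n_vocabulary_words)

print(vectorizer.get_feature_names_out())   # the column labels (the vocabulary)
print(X.toarray())                          # each row: word counts for one document
```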