We just began our introductory look at machine learning. In this video, we'll continue with our weather example to describe how machine learning can be used to complete this task. Remember from our last video that machine learning is computers learning to perform tasks without being explicitly programmed to do so. You might be thinking, but how can that happen? Don't we need to tell the computer something? You're right, we do need to direct computers a bit.
Let's stick with our example where we're trying to predict whether or not it's going to rain. We've been considering cloud coverage and the humidity level as predictors, or what we call features, of whether or not it's going to rain. If we want to use machine learning to solve this, we can still use those features, but we want to let the computer itself determine the relationship between cloud coverage and humidity level and whether or not it's going to rain. Instead of us explicitly programming "it's going to rain if the sky is gray and the humidity is high," the computer will use a machine learning algorithm to learn those relationships itself and then predict the chance of rain. This is machine learning.
But how does it work? How can computers and algorithms learn these relationships? Well, it's all centered around data, and usually a lot of data. In order for our machine learning algorithm to learn how cloud coverage and humidity are related to whether or not it rains, we need to provide the algorithm with this information from past days. This usually looks something like a SQL table or a data frame, with a row for each past day. There are many different machine learning algorithms that will learn those relationships in different ways, but they're all going to use this data. This is what we call training data. Machine learning algorithms use training data to learn those relationships and, as a result, learn to predict whether or not it's going to rain based on the humidity level and the cloud coverage.
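To make that idea concrete, here is a minimal sketch of learning from training data, using only the Python standard library. Everything in it is an assumption made for illustration: the eight rows of past days are invented, the names `training_data`, `train`, and `predict_rain_probability` are hypothetical, and logistic regression is just one of the many algorithms that could be used here. The point to notice is that nobody writes the rule "gray sky plus high humidity means rain"; the algorithm works out the weights from the rows.

```python
import math

# Hypothetical training data: one row per past day (values invented for illustration).
# cloud_cover and humidity are fractions between 0 and 1; rained is the label (1 = rain).
training_data = [
    {"cloud_cover": 0.90, "humidity": 0.85, "rained": 1},
    {"cloud_cover": 0.80, "humidity": 0.90, "rained": 1},
    {"cloud_cover": 0.70, "humidity": 0.80, "rained": 1},
    {"cloud_cover": 0.95, "humidity": 0.70, "rained": 1},
    {"cloud_cover": 0.20, "humidity": 0.30, "rained": 0},
    {"cloud_cover": 0.10, "humidity": 0.40, "rained": 0},
    {"cloud_cover": 0.30, "humidity": 0.20, "rained": 0},
    {"cloud_cover": 0.40, "humidity": 0.35, "rained": 0},
]


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, learning_rate=0.5, epochs=2000):
    """Fit a logistic regression with plain stochastic gradient descent.

    Returns (w_cloud, w_humidity, bias): the relationship learned from the data.
    """
    w_cloud, w_humidity, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for row in rows:
            z = w_cloud * row["cloud_cover"] + w_humidity * row["humidity"] + bias
            error = sigmoid(z) - row["rained"]  # predicted probability minus the truth
            # Nudge each parameter in the direction that reduces the error.
            w_cloud -= learning_rate * error * row["cloud_cover"]
            w_humidity -= learning_rate * error * row["humidity"]
            bias -= learning_rate * error
    return w_cloud, w_humidity, bias


def predict_rain_probability(model, cloud_cover, humidity):
    """Use the learned weights to estimate the chance of rain for a new day."""
    w_cloud, w_humidity, bias = model
    return sigmoid(w_cloud * cloud_cover + w_humidity * humidity + bias)


model = train(training_data)
# A gray, humid day should come out above 0.5; a clear, dry day below 0.5.
print(predict_rain_probability(model, cloud_cover=0.9, humidity=0.9))
print(predict_rain_probability(model, cloud_cover=0.1, humidity=0.2))
```

In practice you would likely load the rows from a real table and use an established library rather than hand-written gradient descent, but the shape of the process is the same: features and outcomes from past days go in, and a prediction for a new day comes out.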
This is where applied statistics comes into machine learning. These algorithms are usually rooted in the mathematical concepts that are prevalent in applied statistics. Just as with hypothesis tests in applied statistics, it's usually best to have as much data as possible when doing machine learning. This gives the machine learning algorithm more information from which to learn: more days of weather to observe in order to learn to predict whether or not it's going to rain. This is similar to what humans call wisdom. The more experience somebody has with something, the more wisdom they have, and the more they're likely to know about it.
As data gets larger and larger (think billions or even trillions of rows and hundreds of thousands of columns, which, again, we call features), the algorithms need to be implemented in more computationally efficient ways. This, along with the programming to direct this process, is the computer science of machine learning.
What are the advantages of using machine learning to determine whether or not it's going to rain? You might be thinking, I know whether or not it's going to rain in the morning, why do I need something else to tell me? That's a good question. As humans, we can reasonably tell whether or not it's going to rain most of the time, but not all of the time. It's reasonable to expect that a machine learning algorithm will be better at predicting this than we are when we give it enough information to learn from. So we'll likely get more accurate predictions. This is really driven by the algorithm's ability to learn from all of the different days of data. It can record each individual day of weather, learn from each of them, and combine the learnings from each of them to create a robust prediction model. We, as humans, are going to have trouble comprehending all of those learnings ourselves.
We're going to have trouble remembering each day, and even if we remember them subconsciously, we're not going to remember them as well as a computer will. We might recall certain notable days, but we won't be able to weigh each individual day, and that's a key point about the value of using machine learning algorithms.
In addition, we can provide the algorithm with additional features, or inputs that the model will use to make predictions, like whether or not it rained the day before, the dew point, the types of clouds that are in the sky, the temperature, the season, and so on. This additional information is really helpful to machine learning algorithms because it gives them more to work with when making predictions. But as the number of features grows, it can be hard for human minds to comprehend all of this in real time and use it to make a prediction themselves.
Finally, using machine learning can not only increase the accuracy of our predictions, it can also reduce the labor required of humans. Think about it this way. Do you look outside in the morning every day to determine whether or not it's going to rain? I know that I don't. Even if you do, you likely don't spend much time making your prediction. Instead, you might look at a weather app, receive a notification, see something on the news, or hear about it on the radio. In other words, we rely on machine learning algorithms developed by experts to predict the probability of rain. This is a major advantage that applies in nearly every scenario in which machine learning algorithms are applied.
Remember, machine learning is the process of algorithms rooted in applied statistics and computer science learning from the data provided to them. This can open up a whole new set of real-world problems that can be solved using data science. In the next video, we'll start to categorize these problems into different classes of machine learning. Stay with us.