Now it's time for the exciting topic: how do you do machine learning in BigQuery? Before we go into the syntax of model building with SQL, let's discuss very quickly how BigQuery ML came about. As you saw from the earlier ML timeline, machine learning has been around for a while, but there have been typical barriers. Number one: doing ML on small datasets in Excel or Sheets, and iterating back and forth between new BigQuery exports. Or, if you're fortunate enough to have a data science team in your organization, number two: building time-intensive TensorFlow or scikit-learn models using experts' time, and still you're just using a sample of the data, since the data scientists train and evaluate the model locally on their machine if they're not using the cloud. Google saw these two critical barriers, access to data scientists and moving data in and out of BigQuery, as an opportunity to bring machine learning right into the hands of analysts like you who are already really familiar with manipulating and preprocessing data, or soon will be by the end of this specialization.

Okay, so here we go. Let's talk about how you can now do machine learning inside of BigQuery using just SQL. With BigQuery ML you can use SQL for machine learning. And to repeat that point: SQL, no Java or Python code needed, just basic SQL to invoke powerful ML models right where your data already lives, inside of BigQuery. Lastly, the team has hidden a lot of the model knobs from you, like hyperparameter tuning, and common ML practitioner tasks, like manually one-hot encoding categorical features. Those options are there if you want to look under the hood, but for simplicity the models will run just fine with minimal SQL code. Here's an example that you'll become very familiar with in your next lab. Notice anything strange about the number of GCP products used to do ML here? You got it.
It's all done right within BigQuery: data ingestion, preprocessing with SQL, model training, model evaluation, the predictions from your model, and the output into reporting tables for visualization. As we mentioned before, BigQuery ML was designed with simplicity in mind, but if you already know a bit about ML, you can tune and adjust your model's hyperparameters, like regularization, the dataset splitting method, and even the learning rate, through the model's options. We'll take a look at how to do that in just a minute.

So what do you get out of the box? First, BigQuery ML runs on standard SQL, and you can use normal SQL syntax like UDFs, subqueries, and joins to create your training datasets. For model types, currently you can choose either a linear regression model for forecasting or binary logistic regression for classification. As part of your model evaluation, you get access to fields like the ROC curve, as well as accuracy, precision, and recall, which you can simply SELECT from after your model is trained. And if you'd like, you can actually inspect the weights of the model and perform feature distribution analysis. Much like normal visualizations using BigQuery tables and views, you can also connect your favorite BI platform, like Data Studio or Looker, and visualize your model's performance and its predictions.

Now, the entire process is going to look like this. First and foremost, we need to bring our data into BigQuery if it isn't there already; that's the ETL. And here again, you can enrich your existing data with other data sources that you ingest and join together using simple SQL joins. Next is the feature selection and preprocessing step, which is very similar to what you've been exploring so far as part of this specialization. Here's where you get to put all of your good SQL skills to the test in creating a great training dataset for your model to learn from.
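To make the "model options" idea concrete, here's a sketch of what tuning those knobs looks like in a CREATE MODEL statement. The dataset, table, and column names here are hypothetical placeholders; the option names (`model_type`, `input_label_cols`, `l2_reg`, `data_split_method`, `data_split_eval_fraction`, `learn_rate_strategy`, `learn_rate`) are standard BigQuery ML options:

```sql
CREATE OR REPLACE MODEL `mydataset.sample_model`
OPTIONS (
  model_type = 'logistic_reg',      -- binary classification
  input_label_cols = ['label'],     -- which column is the label
  l2_reg = 0.1,                     -- L2 regularization strength
  data_split_method = 'random',     -- how to split training vs. evaluation rows
  data_split_eval_fraction = 0.2,   -- hold out 20% of rows for evaluation
  learn_rate_strategy = 'constant',
  learn_rate = 0.1
) AS
SELECT
  label,
  feature_1,
  feature_2
FROM `mydataset.training_table`;
```

Every option except `model_type` has a sensible default, so you can omit all of this tuning and the model will still train just fine, which is exactly the simplicity point made above.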
After that, here it is: this is the actual SQL syntax for creating a model inside of BigQuery. It's short enough that I could fit it all within this one box of code. You simply say CREATE MODEL, give it a name, specify mandatory options for the model like the model type, pass in your SQL query with the training dataset, hit run query, and watch your model run. After your model is trained, you'll see it as a new object inside of your BigQuery dataset. It'll look kind of like a table, but it behaves a little differently, because you can do cool things like execute an ML.EVALUATE query, which lets you evaluate the performance of your trained model against your evaluation dataset. Remember, you want to evaluate on a different dataset than the one you trained on. Here you can analyze the loss metrics that will be given to you, like the root mean squared error for forecasting models, and the area under the curve, accuracy, precision, and recall for classification models like the one that you see here.

Once you're happy with your model's performance, and again, you can go back and train multiple models and see which one performs the best, you can then predict with it using the even shorter query that you see here. Just invoke ML.PREDICT on your newly trained model, and you'll get back predictions as well as the model's confidence in those predictions, super useful for classification. You'll notice a new field in the results when you run this query: your label field with the word "predicted" added to the field name, which is simply your model's prediction for that label. It's that easy. But before we dive into your first lab: now that you've seen, with just these few lines of code, how easy it is to create a model, that doesn't mean it's going to create a great model.
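The evaluate and predict steps described above can be sketched as two short queries. As before, the dataset and column names are hypothetical; `ML.EVALUATE` and `ML.PREDICT` are the actual BigQuery ML functions:

```sql
-- Evaluate the trained model against a held-out dataset.
-- For a logistic regression model this returns fields such as
-- precision, recall, accuracy, f1_score, log_loss, and roc_auc.
SELECT *
FROM ML.EVALUATE(MODEL `mydataset.sample_model`,
  (SELECT label, feature_1, feature_2
   FROM `mydataset.eval_table`));

-- Predict on new rows. The output includes a predicted_label column
-- (your label name prefixed with "predicted_"), and for classification
-- models, the probabilities behind each prediction.
SELECT *
FROM ML.PREDICT(MODEL `mydataset.sample_model`,
  (SELECT feature_1, feature_2
   FROM `mydataset.new_data`));
```

Note that the prediction query selects only feature columns, since the label is what the model is filling in for you.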
A model is only as good as the data that you feed into it for it to learn the relationship between your features and the label. That's why you're going to spend most of your time exploring, selecting, and engineering good features, so that we can give our model the best possible dataset for it to work with and learn from.