Let's start by identifying where BigQuery ML fits into the bigger picture of Google Cloud's AI and machine learning options. Unlike the ML APIs, BigQuery ML lets you create your own custom models. We'll get to AutoML later, but in short, AutoML lets you leverage Google's ML expertise to build your own custom models using transfer learning and a form of neural architecture search.

To work with BigQuery ML, there are only a few steps from beginning to inference. First, you write a query on data stored in BigQuery to extract your training data. Then you create a model, specifying a model type and other hyperparameters. After the model is trained, you evaluate it and verify that it meets your requirements. Finally, you make predictions with your model on data extracted from BigQuery.

Are you familiar with Hacker News? Hacker News is a social news aggregator website focused on computer science and entrepreneurship. I won't get into all of the details, but let's ask a question: given these three articles on the right, where were they actually published? TechCrunch, GitHub, or The New York Times? Of course, you may be able to tell based on the style of the articles, but what if you wanted to build an ML model to do this?

First, let's write an ad hoc query to explore the data. Here's a quick example where you're simply looking at the title of the article and the URL. You can extract the publisher information from the URL using regular expression functions, and that's exactly what the query on this slide does. The plan is the following: extract the publication source using the REGEXP_EXTRACT function, then separate out the first five words of each title, replacing missing words with null values when a title is too short. The goal is then to decide whether an article title comes from GitHub, The New York Times, or TechCrunch using BigQuery ML. (Sketches of this query and the statements that follow appear at the end of this section.)

Now let's look at the syntax needed to create a model. You have the CREATE OR REPLACE MODEL statement at the top, where you give the name of the model, the model options, and other hyperparameters. After the AS clause comes the query that defines the training set from the previous slide. That's it. In this case, you're specifying that you want to build a logistic regression model, a type of classification model you can use to classify your input as a GitHub, New York Times, or TechCrunch article.

To get the evaluation metrics, you simply call ML.EVALUATE on your trained model, or click on the model in the BigQuery UI and open the Evaluation tab. The metrics range from 0 to 1. Generally speaking, the closer to 1 the better, but it really depends on the metric. Precision asks: of the articles you made a guess on, how accurate were those guesses? High precision means a low false positive rate; a model's precision is heavily penalized if it makes a lot of bad guesses. Recall, on the other hand, is the ratio of correctly predicted positive observations to all observations in the actual class: how many did you get right out of both true positives and false negatives? Accuracy is simply true positives plus true negatives over the entire set of observations. There are other metrics like the F1 score, log loss, and ROC curves that you can use as well. You can also inspect the confusion matrix to see where the model made incorrect predictions. As you can see here, the model had some confusion between TechCrunch and New York Times articles.
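To make the workflow concrete, here is a minimal sketch of the training-data query described above. It assumes the public Hacker News dataset (bigquery-public-data.hacker_news.full); the actual query on the slide may differ. SAFE_OFFSET returns NULL when a title has fewer than five words, which gives us the null replacement for short titles:

    SELECT
      -- Label: which of the three publications the URL points at.
      REGEXP_EXTRACT(url, r'(github\.com|nytimes\.com|techcrunch\.com)') AS source,
      -- Features: the first five words of the title (NULL if the title is shorter).
      SPLIT(title, ' ')[SAFE_OFFSET(0)] AS word1,
      SPLIT(title, ' ')[SAFE_OFFSET(1)] AS word2,
      SPLIT(title, ' ')[SAFE_OFFSET(2)] AS word3,
      SPLIT(title, ' ')[SAFE_OFFSET(3)] AS word4,
      SPLIT(title, ' ')[SAFE_OFFSET(4)] AS word5
    FROM
      `bigquery-public-data.hacker_news.full`
    WHERE
      REGEXP_CONTAINS(url, r'github\.com|nytimes\.com|techcrunch\.com')
      AND title IS NOT NULL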
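Wrapping that query in a CREATE OR REPLACE MODEL statement is all it takes to train; the dataset and model names below are placeholders. model_type = 'logistic_reg' selects logistic regression, and input_label_cols tells BigQuery ML which column is the label:

    CREATE OR REPLACE MODEL `mydataset.article_source_model`  -- placeholder name
    OPTIONS (
      model_type = 'logistic_reg',    -- a logistic regression classifier
      input_label_cols = ['source']   -- the column the model learns to predict
    ) AS
    SELECT
      REGEXP_EXTRACT(url, r'(github\.com|nytimes\.com|techcrunch\.com)') AS source,
      SPLIT(title, ' ')[SAFE_OFFSET(0)] AS word1,
      SPLIT(title, ' ')[SAFE_OFFSET(1)] AS word2,
      SPLIT(title, ' ')[SAFE_OFFSET(2)] AS word3,
      SPLIT(title, ' ')[SAFE_OFFSET(3)] AS word4,
      SPLIT(title, ' ')[SAFE_OFFSET(4)] AS word5
    FROM
      `bigquery-public-data.hacker_news.full`
    WHERE
      REGEXP_CONTAINS(url, r'github\.com|nytimes\.com|techcrunch\.com')
      AND title IS NOT NULL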
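Evaluation is then a one-liner against the trained model, and the confusion matrix has its own function:

    -- Returns precision, recall, accuracy, f1_score, log_loss, and roc_auc.
    SELECT * FROM ML.EVALUATE(MODEL `mydataset.article_source_model`);

    -- Shows, per actual source, how often the model predicted each source.
    SELECT * FROM ML.CONFUSION_MATRIX(MODEL `mydataset.article_source_model`);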
If the model meets your requirements, great: you're ready to predict with it. You don't need to worry about deploying your model in a separate process; it's automatically available to serve predictions in BigQuery. Here's an example of performing batch prediction with BigQuery ML. The first example here has government, shutdown, leaves, workers, and reeling as the first five words of the title, and the model's prediction in this case is The New York Times.
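A batch prediction along those lines might look like the sketch below, again using the placeholder model name from earlier. ML.PREDICT adds a predicted_source column (predicted_ plus the name of the label column) next to the input features; here the input is a single hand-built row with the five words from the example:

    SELECT
      predicted_source,  -- e.g. 'nytimes.com'
      word1, word2, word3, word4, word5
    FROM
      ML.PREDICT(MODEL `mydataset.article_source_model`,
        (
          -- Any query producing the same feature columns works here;
          -- in practice you would select the features from a table of new articles.
          SELECT
            'government' AS word1,
            'shutdown'   AS word2,
            'leaves'     AS word3,
            'workers'    AS word4,
            'reeling'    AS word5
        ));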