As a reminder, AutoML is a capability that applies machine learning and automation to model-building tasks: it automatically analyzes your data, selects the right algorithm or algorithms for your experiments, prepares and applies data transformations, and then trains and tunes your model through a number of iterations until the most performant model is identified. There are many implementations of AutoML, and Amazon SageMaker's implementation is called Autopilot. In this section, I'll start with an introduction to SageMaker Autopilot.

Autopilot includes the steps I previously covered in your machine learning workflow: performing data exploration, identifying the machine learning problem, selecting the algorithm based on the machine learning problem and your data, transforming the data into the format expected by your algorithm, and, finally, training the model and performing hyperparameter tuning to find the optimal set of hyperparameters that results in the most performant model.

In addition to covering the workflow steps and tasks that I highlighted, Autopilot is also fully transparent, meaning it automatically generates and shares the feature engineering code, and it generates Jupyter notebooks that walk you through how the models were built, including the data processing as well as the algorithms, the hyperparameters, and the training configuration. You can then use those automatically generated notebooks to reproduce the same experiment, or modify them to continue refining your model.

Let's look at how Autopilot works at a high level. First, you upload your tabular data set to a bucket in Amazon Simple Storage Service, or Amazon S3, which is an object storage service on AWS. That data set must contain your target attribute, which in the product review example we've been working with is the review sentiment. You tell Autopilot what the target attribute is when creating your Autopilot experiment.

Autopilot then analyzes your data, identifies the machine learning problem if you did not predefine it, and chooses the ML algorithm that most closely matches the input data and the target you're trying to predict. Autopilot supports several built-in algorithms, including linear learner, XGBoost, and a deep learning algorithm. The available algorithms cover the problem types of regression, binary classification, and multi-class classification. Autopilot then generates feature engineering code to transform your data into the format expected by the algorithm.

After the data analysis step, Autopilot generates two types of notebooks that describe the plan Autopilot follows to create the candidate models. First, there's a data exploration notebook that describes what Autopilot learned about your data. This notebook also identifies areas of investigation that may indicate potential issues in your source data, issues that could require further analysis or impact your model performance. The second type of notebook is a candidate generation notebook, which contains each suggested data preprocessing step, the algorithm, and the hyperparameter ranges that will be used for the tuning job.

I previously talked about the handling of class imbalance. SageMaker Autopilot can provide models that are more accurate even when data sets are highly imbalanced and have as few as 500 data points.
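To make that setup concrete, here is a minimal sketch of launching an Autopilot experiment with the SageMaker Python SDK. The bucket, S3 prefixes, and job name are hypothetical placeholders, and the sketch assumes you're running in a SageMaker notebook or Studio environment where get_execution_role() resolves to a suitable IAM role.

```python
# Minimal sketch: launching an Autopilot experiment with the SageMaker
# Python SDK. Bucket, prefixes, and job name are hypothetical placeholders.
import sagemaker
from sagemaker.automl.automl import AutoML

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # assumes a SageMaker notebook/Studio environment

# Tabular CSV data in S3 that includes the target column
# ("sentiment" here, matching the product review example).
input_s3_uri = "s3://my-bucket/product-reviews/train/"
output_s3_uri = "s3://my-bucket/product-reviews/autopilot-output/"

automl = AutoML(
    role=role,
    target_attribute_name="sentiment",        # the column Autopilot should predict
    problem_type="MulticlassClassification",  # optional; omit to let Autopilot infer it
    job_objective={"MetricName": "Accuracy"}, # required whenever problem_type is set
    output_path=output_s3_uri,
    max_candidates=10,                        # cap the number of candidate models
    sagemaker_session=session,
)

# Starts the data analysis, feature engineering, training, and tuning steps.
automl.fit(inputs=input_s3_uri, job_name="product-reviews-autopilot", wait=False)
```

Note that if you omit both problem_type and job_objective, Autopilot infers the problem type from your data, which is the fully automatic behavior just described.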
You can also direct SageMaker Autopilot to use the area under the curve, or AUC (the area under the receiver operating characteristic curve), as the objective metric, to create even more accurate models for classification problems.

When you set up Autopilot, you can use it in different ways. You can choose to run it completely on autopilot, as the name suggests, which automatically flows through the next steps you see here, starting with running the data transformations using the generated feature transformation code. Optionally, you can run it with different degrees of human intervention. As an example, you can set up Autopilot so that it does not automatically run the feature engineering code after the data analysis step. This is useful if you want to first review the candidate notebooks and then decide which candidates to train and tune. Otherwise, to use it in true autopilot mode, you can let Autopilot automatically run through the data transformation pipeline, perform the model training and tuning for each of the candidates, and then provide you with the best candidate based on the experiments it runs.

Finally, Autopilot shares a leaderboard of all the candidate models, including their metrics, as a ranked list of recommendations so you can determine the best-performing model. Autopilot also provides complete visibility into the feature engineering code, the algorithm, and the optimized hyperparameters that were used, which is designed to give you the control and transparency that I discussed earlier. Let's take a look at Autopilot in action, in the context of the product review use case.
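Before the demo, here is a minimal sketch of inspecting those results programmatically, assuming the hypothetical "product-reviews-autopilot" job from the earlier sketch has completed.

```python
# Minimal sketch, assuming the "product-reviews-autopilot" job created
# earlier has completed.
best = automl.best_candidate(job_name="product-reviews-autopilot")
print("Best candidate:", best["CandidateName"])

# Leaderboard: every candidate with its final objective metric. Candidates
# that failed may lack the metric, hence the guard.
for c in automl.list_candidates(job_name="product-reviews-autopilot"):
    metric = c.get("FinalAutoMLJobObjectiveMetric")
    if metric:
        print(c["CandidateName"], metric["MetricName"], metric["Value"])
```

For a binary classification problem, you could pass job_objective={"MetricName": "AUC"} when constructing the AutoML object to use the AUC objective described above. And for the human-in-the-loop mode, the lower-level boto3 API accepts create_auto_ml_job(..., GenerateCandidateDefinitionsOnly=True), which stops Autopilot after it produces the candidate notebooks so you can review them before training.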