Since you've already engineered the input features for the BERT model from the product reviews data, you are ready to fine-tune the Hugging Face RoBERTa model to create a custom text classifier that classifies the product reviews into the three sentiment classes. This is an example of how you train a custom model with Amazon SageMaker. In Course One, you learned how to train a product reviews text classifier using the built-in SageMaker BlazingText algorithm. This time, I show you how to train, or fine-tune, the text classifier with custom model code for the pre-trained BERT model you pull from the Hugging Face model hub. This option is also called bring your own script, or script mode, in SageMaker. This option requires a little bit more effort but gives you increased levels of customization in return.

To start training a model in SageMaker, you create a training job. The training job includes the following information: the URL of the Amazon Simple Storage Service, or Amazon S3, bucket where you have stored the training data; the compute resources that you want SageMaker to use for the model training (compute resources are ML compute instances that are managed by SageMaker); the URL of the S3 bucket where you want to store the output of the training job; and the Amazon Elastic Container Registry, or Amazon ECR, path where the training code image is stored. SageMaker provides built-in Docker images that include deep learning framework libraries and other dependencies needed for model training and inference. Using script mode, you can leverage these pre-built images for many popular frameworks, including TensorFlow, PyTorch, and MXNet. After you create the training job, SageMaker launches the ML compute instances and uses the training code and the training dataset to train the model. It saves the resulting model artifacts and other outputs in the S3 bucket you specify for that purpose.

Here are the steps you need to perform. First, you need to configure the training, validation, and test datasets. You also need to specify which evaluation metrics to capture, for example the validation loss and validation accuracy. Next, you need to configure the model hyperparameters, such as the number of epochs, the learning rate, and so on. Then you need to write and provide the custom model training script used to fit the model.

Let's discuss each step in more detail, starting with the datasets and the evaluation metrics. You can use the SageMaker TrainingInput class to configure a data input flow for the training. The example code below shows how to configure TrainingInput objects to use the training, validation, and test data splits uploaded to an S3 bucket. If you write your custom model training code, make sure the algorithm code calculates and emits model metrics, such as validation loss and validation accuracy. You can then define regular expressions, as shown below, to capture the values of these metrics from the Amazon CloudWatch logs.

Next, configure the model's hyperparameters. Model hyperparameters include, for example, the number of epochs, the learning rate, the batch sizes for the training and validation data, and more. One important hyperparameter for BERT models is the maximum sequence length. You didn't have to specify this parameter before, if you remember, as both SageMaker Autopilot and the BlazingText algorithm use different NLP algorithms with different hyperparameters.
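Here is a minimal sketch of the data channel and metric configuration; the bucket name, S3 prefixes, and the exact log strings matched by the regular expressions are illustrative assumptions, not values from the lesson.

from sagemaker.inputs import TrainingInput

bucket = 'my-bucket'  # hypothetical bucket name; replace with your own

# One input channel per data split, pointing at the uploaded S3 prefixes.
data_channels = {
    'train': TrainingInput(s3_data='s3://{}/data/train'.format(bucket)),
    'validation': TrainingInput(s3_data='s3://{}/data/validation'.format(bucket)),
    'test': TrainingInput(s3_data='s3://{}/data/test'.format(bucket)),
}

# Regular expressions SageMaker applies to the CloudWatch logs to capture
# the metrics your training script prints (assumed log format).
metric_definitions = [
    {'Name': 'validation:loss', 'Regex': 'val_loss: ([0-9.]+)'},
    {'Name': 'validation:accuracy', 'Regex': 'val_acc: ([0-9.]+)'},
]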
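And a sketch of the hyperparameters; the key names are assumptions and must match whatever your training script parses from its command-line arguments, though the maximum sequence length of 128 is the value used in this lesson.

hyperparameters = {
    'epochs': 3,                      # number of passes over the training data
    'learning_rate': 2e-5,            # illustrative fine-tuning learning rate
    'train_batch_size': 128,          # batch size for the training data
    'validation_batch_size': 128,     # batch size for the validation data
    'max_seq_length': 128,            # maximum number of input tokens per sample
}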
As a quick reminder, the maximum sequence length refers to the maximum number of input tokens you can pass to the BERT model per sample. I chose the value of 128 because the word distribution of the reviews showed that 100% of the reviews in the training dataset have 115 words or less.

Next, provide your custom training script. You can start from some example code and then customize it to your needs. In the code example below, you can see an extract from the Python model training script, called train.py. First, you import the Hugging Face transformers library. Remember, you can install the library with pip install transformers. Hugging Face provides a pre-trained RoBERTa model for sequence classification that already pre-configures RoBERTa for text classification tasks. Let's download the model config for this RoBERTa model. You can do this by calling RobertaConfig.from_pretrained and simply providing the model name, in this example, roberta-base. You can then customize the configuration by specifying the number of labels for the classifier. You can set num_labels to three, representing the three sentiment classes. The id2label and label2id parameters let you map the zero-based index to the actual class: the label of -1 for the negative class, the label of 0 for the neutral class, and the label of 1 for the positive class. You then download the pre-trained RoBERTa model from the Hugging Face library with the command RobertaForSequenceClassification.from_pretrained, providing the model name and the customized configuration.

With a pre-trained model at hand, you need to write the code to fine-tune the model, here called train_model. The code extract below shows how to fine-tune the model using PyTorch in the train_model function. You define the loss function and the optimizer to use; in this example, I'm using the CrossEntropyLoss function and the Adam optimizer for the model. Then you write the training code, looping through the number of epochs and training steps. You read in the training data from the PyTorch DataLoader, put the model into training mode, clear gradients from the previous step, pass the training sample through the model, retrieve the model prediction, calculate the loss, and compute the gradients via backpropagation. Finally, you update the parameters with the optimizer step and repeat the loop through all specified training steps and number of epochs. Make sure to also include code that runs a validation loop after each epoch (not shown here) that calculates and emits the evaluation metrics you want to capture. You can see a full example in this week's lab assignment.

With all configurations done and the model training code ready, you can now fit the model. Create a SageMaker PyTorch estimator, as shown in the sketch below. You specify the location of the model training script as the entry point, and set the source directory, choosing a local directory containing your model training script and any additional dependencies listed in a requirements.txt file. In the estimator, you can also specify the AWS instance type and instance count to run the model training. You can also define the PyTorch framework version you're using; this will instruct SageMaker to use the right framework training image to run your code. You pass the previously defined hyperparameters and the model evaluation metrics to capture. And then, finally, you call estimator.fit to start the fine-tuning of the model.
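Here is a condensed sketch along the lines of what train.py contains, covering the model configuration, the model download, and the fine-tuning loop. The function signature, the batch format (a dict with input_ids and labels), and the argument parsing and data loading that are omitted here are assumptions for illustration.

import torch
from transformers import RobertaConfig, RobertaForSequenceClassification

# Download the configuration for roberta-base and customize it for the
# three sentiment classes, mapping the zero-based index to the class labels.
config = RobertaConfig.from_pretrained(
    'roberta-base',
    num_labels=3,
    id2label={0: -1, 1: 0, 2: 1},   # index -> sentiment class (-1, 0, 1)
    label2id={-1: 0, 0: 1, 1: 2},
)

# Download the pre-trained model weights with the customized configuration.
model = RobertaForSequenceClassification.from_pretrained('roberta-base', config=config)

def train_model(model, train_data_loader, epochs, learning_rate, device='cpu'):
    # Define the loss function and the optimizer for fine-tuning.
    loss_function = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    model.to(device)
    for epoch in range(epochs):
        model.train()                      # put the model into training mode
        for batch in train_data_loader:    # read batches from the PyTorch DataLoader
            input_ids = batch['input_ids'].to(device)
            labels = batch['labels'].to(device)  # zero-based class indices

            optimizer.zero_grad()              # clear gradients from the previous step
            outputs = model(input_ids)         # pass the training sample through the model
            loss = loss_function(outputs.logits, labels)  # calculate the loss
            loss.backward()                    # compute gradients via backpropagation
            optimizer.step()                   # update the parameters
        # A validation loop would run here after each epoch and print lines
        # such as "val_loss: ..." and "val_acc: ..." for the metric regexes.
    return model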
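And a minimal sketch of the estimator setup and fit call; the instance type, framework version, and directory names are illustrative assumptions, and it reuses the hyperparameters, metric_definitions, and data_channels defined earlier.

import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()  # your SageMaker execution role

estimator = PyTorch(
    entry_point='train.py',            # your custom model training script
    source_dir='src',                  # local dir with train.py and requirements.txt
    role=role,
    instance_count=1,                  # number of ML compute instances
    instance_type='ml.c5.9xlarge',     # illustrative instance type
    framework_version='1.6.0',         # selects the matching PyTorch training image
    py_version='py3',
    hyperparameters=hyperparameters,
    metric_definitions=metric_definitions,
)

# Start the fine-tuning job, wiring up the train/validation/test channels.
estimator.fit(inputs=data_channels)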