In this section, you'll learn about production variants and how they can be used to implement advanced deployment options for your real-time endpoints hosted using Amazon SageMaker hosting. You'll have the chance to explore production variants further in this week's labs. Production variants can be used for A/B testing or canary testing, but for this week's labs, you will use them specifically for A/B testing. If you remember, A/B testing is essentially splitting traffic across larger groups for a period of time to measure and compare the performance of different model versions. To do this with SageMaker, you need to be familiar with the concept of production variants.
Up until now, the examples that I've shown have mostly had a single model behind an endpoint. That single model is one production variant. A production variant is a packaged SageMaker model combined with the configuration that defines how that model will be hosted. You may recall that the SageMaker model includes information such as the S3 location of the trained model artifact, the container image that can be used for inference with that model, the service runtime role, and the model's name. The hosting resources configuration includes information about how you want that model to be hosted. This includes things like the number and type of machine learning instances, a pointer to the SageMaker packaged model, as well as the variant name and variant weight.
A single SageMaker endpoint can actually include multiple production variants. Production variants can be used for both canary testing and A/B testing, as well as some of the other deployment strategies that I discussed earlier. Remember that a canary deployment includes canary groups, which are smaller subsets of users that are directed to a specific model version to gauge the performance of that model version on a specific group of users.
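To make the two halves of a production variant concrete, here is a minimal Python sketch. The class names `PackagedModel` and `ProductionVariant`, the helper `traffic_shares`, and every example value (bucket, image URI, role ARN, instance type) are hypothetical and are not part of the SageMaker API. The sketch assumes that when the endpoint itself routes requests, each variant's share of traffic is its weight divided by the sum of all variant weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackagedModel:
    """The SageMaker model: what to run and with which permissions."""
    name: str
    model_data_url: str  # S3 location of the trained model artifact
    image_uri: str       # container image used for inference
    role_arn: str        # service runtime role


@dataclass(frozen=True)
class ProductionVariant:
    """A packaged model plus its hosting resources configuration."""
    variant_name: str
    model: PackagedModel
    instance_type: str
    instance_count: int
    weight: float


def traffic_shares(variants):
    """Return each variant's fraction of traffic: weight / sum of weights."""
    total = sum(v.weight for v in variants)
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {v.variant_name: v.weight / total for v in variants}


# Placeholder values, for illustration only.
model_a = PackagedModel("model-a", "s3://example-bucket/a/model.tar.gz",
                        "<inference-image-uri>", "<execution-role-arn>")
model_b = PackagedModel("model-b", "s3://example-bucket/b/model.tar.gz",
                        "<inference-image-uri>", "<execution-role-arn>")

variant_a = ProductionVariant("VariantA", model_a, "ml.m5.large", 1, 50)
variant_b = ProductionVariant("VariantB", model_b, "ml.m5.large", 1, 50)

shares = traffic_shares([variant_a, variant_b])
print(shares)
```

Because the weights are relative, weights of 1 and 1 would give the same even split as 50 and 50.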
In this picture, you see a SageMaker endpoint that's configured with two variants, variant A and variant B. Variant A has been configured so that 95 percent of the traffic will continue to be served by that model version, and only five percent of traffic will be served by variant B. Here, the client application controls which users are exposed to which variant. This is done programmatically by specifying the target variant when the client application invokes the endpoint. This is one way to use production variants for canary roll-outs.
Let's now look at another way to use production variants, this time for A/B testing. This is also what you'll be doing in your lab for this week. In this case, you still have two variants, variant A and variant B, but you're splitting traffic equally: 50 percent of traffic is served by variant A, and 50 percent is served by variant B. The client application just invokes the SageMaker endpoint, and traffic is automatically routed based on the variant weights. However, you could also have the client application route specific users to each variant, as in the canary example. A/B testing is similar to what you would do for a canary roll-out, but you'd be doing it for a larger user group and typically running both model versions for a longer period of time while you compare the results of those different model versions.
I'll now cover how you can set up production variants for A/B testing between two model versions. In this case, you'll learn how to use production variants for the SageMaker option where you're using a pre-built container image. In the first step, you construct the URI for the pre-built Docker container image. For this, you use a SageMaker-provided function to generate the URI of the Amazon Elastic Container Registry image that'll be used for hosting.
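Before moving on to the remaining setup steps, here is a rough sketch of the two routing styles just described: the client pinning a request to a variant, and the endpoint routing by weight. This is a minimal sketch under stated assumptions, not the lab code. `TargetVariant` is the `invoke_endpoint` parameter of the boto3 `sagemaker-runtime` client that selects a variant; the endpoint name, variant names, content type, hashing scheme, and helper functions are all assumptions made for illustration. The runtime client is passed in as an argument, so nothing here calls AWS by itself.

```python
import hashlib

CANARY_PERCENT = 5  # share of users sent to the new model version


def choose_variant(user_id, canary_percent=CANARY_PERCENT):
    """Deterministically assign a user to a variant by hashing the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "VariantB" if bucket < canary_percent else "VariantA"


def build_invoke_args(endpoint_name, payload, user_id=None):
    """Build keyword arguments for sagemaker-runtime invoke_endpoint.

    With a user_id, the client pins the request to a variant (canary style).
    Without one, TargetVariant is omitted and the endpoint routes by weight.
    """
    args = {
        "EndpointName": endpoint_name,
        "ContentType": "text/csv",  # assumed payload format
        "Body": payload,
    }
    if user_id is not None:
        args["TargetVariant"] = choose_variant(user_id)
    return args


def predict(runtime_client, endpoint_name, payload, user_id=None):
    """Invoke the endpoint; runtime_client is assumed to be
    boto3.client("sagemaker-runtime") or an object with the same method."""
    return runtime_client.invoke_endpoint(
        **build_invoke_args(endpoint_name, payload, user_id)
    )


# Hypothetical endpoint name and payload.
canary_args = build_invoke_args("my-endpoint", "some,csv,row", user_id="user-42")
weighted_args = build_invoke_args("my-endpoint", "some,csv,row")
```

Hashing the user ID keeps each user on the same variant across requests, which makes it easier to compare results per group; a random draw per request would also split traffic but would not give that consistency.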
The next step is to create two model objects, which package each of your trained models for deployment to a SageMaker endpoint. To create the model packages, you'll use the URI information from the previous step and supply a few other items for packaging, such as the location where your trained model artifact is stored in Amazon S3 and the AWS Identity and Access Management (IAM) role that will be used by the inference code to access AWS resources. Next, you configure the production variants that will be used when you create your endpoint. Each variant points to one of the previously configured model packages, and it also includes the hosting resources configuration. You can see in this case that you're indicating that you want 50 percent of your traffic sent to model variant A and 50 percent of your traffic sent to model variant B. Recall that the model package, combined with the hosting resources configuration, makes up a single production variant.
Now that you've configured your production variants, you need to configure the endpoint to use these two variants. In this step, you create the endpoint configuration by specifying the name and pointing to the two production variants that you just configured. The endpoint configuration tells SageMaker how you want to host those models. Finally, you create the endpoint, which uses your endpoint configuration to create a new endpoint with two models, or two production variants.
In this section, I covered the concept of production variants and how you can use production variants to implement more advanced deployment strategies like A/B testing and canary roll-outs. In this week's lab, you'll get hands-on experience with creating production variants to perform A/B testing on your endpoints.
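The setup steps walked through above (model packages, production variants, endpoint configuration, endpoint) can be sketched roughly as follows. This is a sketch under assumptions, not necessarily the code used in the lab: it assumes the low-level boto3 `sagemaker` client and its `create_model`, `create_endpoint_config`, and `create_endpoint` calls, and it takes the container image URI from the first step as an input. The helper names, model names, variant names, endpoint name, and instance type are hypothetical. The client is passed in as an argument, so defining these functions creates no AWS resources.

```python
def build_model_request(model_name, image_uri, model_data_url, role_arn):
    """Package one trained model: image, S3 artifact, and IAM role."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,              # pre-built container image URI
            "ModelDataUrl": model_data_url,  # S3 location of the artifact
        },
        "ExecutionRoleArn": role_arn,
    }


def build_variant(variant_name, model_name, weight,
                  instance_type="ml.m5.large", instance_count=1):
    """Combine a packaged model with its hosting resources configuration."""
    return {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InstanceType": instance_type,  # assumed instance type
        "InitialInstanceCount": instance_count,
        "InitialVariantWeight": weight,
    }


def deploy_ab_test(sm_client, image_uri, role_arn, model_data_a, model_data_b,
                   endpoint_name="ab-test-endpoint"):
    """Create two models, an endpoint config with two variants, and the endpoint.

    sm_client is assumed to be boto3.client("sagemaker") or a compatible object.
    """
    # Create the two model packages.
    sm_client.create_model(
        **build_model_request("model-a", image_uri, model_data_a, role_arn))
    sm_client.create_model(
        **build_model_request("model-b", image_uri, model_data_b, role_arn))

    # Configure the production variants with an equal traffic split.
    variants = [
        build_variant("VariantA", "model-a", 50),
        build_variant("VariantB", "model-b", 50),
    ]

    # Create the endpoint configuration that points to both variants.
    config_name = endpoint_name + "-config"
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name, ProductionVariants=variants)

    # Create the endpoint from that configuration.
    sm_client.create_endpoint(
        EndpointName=endpoint_name, EndpointConfigName=config_name)
    return variants
```

With real credentials, you would pass `boto3.client("sagemaker")` as `sm_client` and an image URI generated by the SageMaker SDK, for example with `sagemaker.image_uris.retrieve`. Keeping the request builders separate from the calls makes it possible to inspect the configuration before anything is created.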