In this module, we're going to pick up where we left off in the last module and look at sampling algorithms. First, we're going to look at Gibbs sampling. In a Gibbs sampler, the proposal distribution matches the posterior conditional distribution, and as a result the proposals are always accepted. There is no reason to reject, unlike in the Metropolis algorithm, where an arbitrary proposal distribution is used. Gibbs sampling can be seen as a special case of the Metropolis algorithm. One of the features of the Gibbs sampler is that it allows us to perform inference on more than one parameter at a time. This is done by drawing one parameter at a time, conditional on the current values of the other parameters. The sampler cycles through the parameters in this fashion and continues until sufficient samples have been drawn for all of them.

Additionally, Gibbs sampling can draw proposals from an asymmetric distribution. In the example below, we will be drawing from a Gamma distribution, which is not symmetric. Not having to choose a proposal distribution is sometimes seen as an advantage. The disadvantage of this method, however, is that you're required to decompose the joint distribution into its conditional distributions in order to sample from them. On the other hand, if the conjugate solutions are known, the Gibbs sampler can be faster than the Metropolis-Hastings algorithm.

In the following example, we're going to try to infer the parameters of a normal distribution: the mean, given by Mu, and the precision, given by Tau. Keep in mind that we're using the Mu and Tau parameterization here, as opposed to the mean and standard deviation parameterization of the normal distribution; the precision is the reciprocal of the variance, that is, Tau = 1/Sigma^2. We use a normal distribution since it lets us illustrate how Gibbs sampling can be used to estimate multiple parameters, that is, Mu and Tau, at the same time.

Now, let us look at the problem setup we're going to use to illustrate Gibbs sampling. We're going to use Gibbs sampling to estimate the parameters of a model that represents some phenomenon. For the sake of this exercise, let us say that this is a normal distribution and that we have one data point as our observation. Since we're using a normal distribution parameterized by Mu and Tau, we will need the conjugate solutions for computing our posteriors from the priors. Here, as already mentioned, Mu is the mean and Tau is the precision of that normal distribution.

A word on notation as we proceed: taking Tau as an example, the draws are denoted by numbered subscripts such as Tau_0, Tau_1, and so on, while the hyperparameters for the prior and posterior distributions are denoted Tau_prior and Tau_posterior, respectively.

The parameter Mu of the normal distribution is drawn from a normal prior distribution parameterized by the hyperparameters Mu_prior and Tau_prior. Note that these are the parameters of the prior distribution. Here we select Mu_prior to be 12 and Tau_prior to be 0.065, which corresponds to a Sigma, or standard deviation, of roughly 4. The Tau, or precision, parameter is drawn from a Gamma prior distribution parameterized by the hyperparameters Alpha_prior and Beta_prior; for those we select the shape parameter Alpha_prior to be 25 and the rate parameter Beta_prior to be 0.5. In order to perform Gibbs sampling, we need to have the conjugate solutions for both parameters, Mu and Tau.
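To make the setup concrete, here is a minimal sketch in Python of the priors described above, assuming NumPy is available. The observed value x = 10.0 is a hypothetical placeholder, since the observation itself isn't stated here; note also that NumPy's gamma() expects a scale parameter, so we pass the reciprocal of the rate Beta_prior.

```python
import numpy as np

rng = np.random.default_rng(42)

# Observed data: a single data point (n = 1). The value 10.0 is a
# placeholder; the lecture does not state the actual observation.
x = np.array([10.0])
n = len(x)

# Hyperparameters of the normal prior on Mu (mean and precision).
mu_prior = 12.0
tau_prior = 0.065   # precision; roughly sigma = 4, since tau = 1/sigma**2

# Hyperparameters of the Gamma prior on Tau (shape and rate).
alpha_prior = 25.0
beta_prior = 0.5

# Initial draw Tau_0 from the Gamma prior. NumPy's gamma() takes a
# scale parameter, so scale = 1/rate.
tau_0 = rng.gamma(shape=alpha_prior, scale=1.0 / beta_prior)
print(f"Tau_0 = {tau_0:.3f}")
```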
We have a conjugate solution for the parameter Mu with a normal prior, as shown here, and a conjugate solution for the parameter Tau with a Gamma prior, as shown by these equations here. A couple of things to note: n is the total number of data points; in our example this is set to one, so n equals one, which means the summation reduces to the single scalar value x. Tau_0 and Mu_1 denote sampled values of Tau and Mu. We assume we start by drawing a value Tau_0 for Tau from its prior distribution, and then we proceed to compute the posterior distribution for Mu from it. Then, in our next iteration, we sample a value Mu_1 for Mu from that posterior distribution and use it to update the posterior for Tau. We'll look at this process in more detail below.
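Since the equations themselves are not reproduced in this text, the sketch below uses the standard semi-conjugate updates for a normal likelihood, which is what the description above corresponds to: given Tau, the conditional posterior for Mu is normal with precision Tau_prior + n*Tau and mean (Tau_prior*Mu_prior + Tau*sum(x)) / (Tau_prior + n*Tau); given Mu, the conditional posterior for Tau is Gamma with shape Alpha_prior + n/2 and rate Beta_prior + (1/2)*sum((x_i - Mu)^2). The observation x = 10.0 is again a placeholder.

```python
import numpy as np

def gibbs_sampler(x, n_samples=5000, seed=0):
    """Gibbs sampler for (Mu, Tau) of a normal likelihood with a
    normal prior on Mu and a Gamma prior on Tau, using the standard
    semi-conjugate updates and the lecture's hyperparameter values."""
    rng = np.random.default_rng(seed)
    n = len(x)

    mu_prior, tau_prior = 12.0, 0.065
    alpha_prior, beta_prior = 25.0, 0.5

    # Start by drawing Tau_0 from its Gamma prior, as described above.
    tau = rng.gamma(shape=alpha_prior, scale=1.0 / beta_prior)

    mu_samples = np.empty(n_samples)
    tau_samples = np.empty(n_samples)

    for i in range(n_samples):
        # Conditional posterior for Mu given the current Tau:
        #   tau_post = tau_prior + n * tau
        #   mu_post  = (tau_prior * mu_prior + tau * sum(x)) / tau_post
        tau_post = tau_prior + n * tau
        mu_post = (tau_prior * mu_prior + tau * x.sum()) / tau_post
        mu = rng.normal(loc=mu_post, scale=1.0 / np.sqrt(tau_post))

        # Conditional posterior for Tau given the current Mu:
        #   alpha_post = alpha_prior + n / 2
        #   beta_post  = beta_prior + 0.5 * sum((x_i - mu)^2)
        alpha_post = alpha_prior + n / 2.0
        beta_post = beta_prior + 0.5 * np.sum((x - mu) ** 2)
        tau = rng.gamma(shape=alpha_post, scale=1.0 / beta_post)

        mu_samples[i] = mu
        tau_samples[i] = tau

    return mu_samples, tau_samples

# One observed data point, as in the example (the value is a placeholder).
mu_s, tau_s = gibbs_sampler(np.array([10.0]))
print(f"posterior mean of Mu: {mu_s.mean():.3f}, of Tau: {tau_s.mean():.3f}")
```

Note how the loop alternates between the two conditional draws: each update uses the most recent value of the other parameter, which is exactly the iterative scheme described above, and no accept/reject step is needed.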