0:00

In this lesson we work on the same research question on the effectiveness of RU-486 as a morning-after pill that we introduced in the previous lesson. However, this time we answer the question using a Bayesian approach. Let's start with a quick reminder of the framework. We had decided on considering only the 20 total pregnancies, four of which occur in the treatment group. And the question we're asking is, how likely is it that four pregnancies occur in the treatment group? Also remember that we had decided that if the treatment and control are equally effective, and the sample sizes for the two groups are the same, then the probability that a pregnancy comes from the treatment group is 0.5.

Â 0:38

Within the Bayesian framework, we also start by setting our hypotheses, or we can think of these as the models that the data come from. We begin by delineating each of the models we consider plausible. We know that p, the probability that a pregnancy comes from the treatment group, can take on any value between zero and one. However, we'll start slow, and instead of considering a continuous parameter space for p, we will assume that it is plausible that the chance that a pregnancy comes from the treatment group is 10%, or 20%, or 30%, or 40%, all the way up to 90%. Hence, we're considering nine models, not just one model, as was the case in the classical frequentist paradigm. Let's pause for a second and think about what it means for p to be equal to 20%. This means that given a pregnancy occurs, there is a 2-to-8, or 1-to-4, chance that it occurs in the treatment group.

Â 1:35

Next, we need to specify the prior probabilities we want to assign to these hypotheses. The prior probabilities should reflect our state of belief prior to the current experiment. They should incorporate the information learned from all relevant research up to the current point in time; however, they should not incorporate information from the current experiment. Suppose my prior probabilities for each of the nine models are presented in this table.

Â 2:03

I placed a prior probability of 52% on p = 0.5 and divided the remaining probability equally among the other models. This equal distribution implies that the benefit of the treatment is symmetric, that is, the treatment is equally likely to be better or worse than the standard treatment. And the 52% prior at p = 0.5 implies that we believe there's a 52% chance that there is no difference between the treatments. One natural question that you might have at this point is, how did you come up with those priors? We will discuss prior specification in detail later in the course, so for now let's stick with the chosen priors and work through the mechanics of calculating the posterior probabilities and making a decision.

Now we're ready to calculate the probability of the observed data given each of the models that we're considering. This probability is called the likelihood. In this example, this is simply the probability of the data given the model, which can be written as the probability that k is equal to 4, given that n is equal to 20, for the various values of p we decided to consider as plausible models, 10% through 90%. As we did in the previous video, we can express the probability of a given number of successes in a given number of independent trials with a binomial distribution. We consider a sequence of probabilities of success from 10% to 90%, increasing by 10%; we assign a 52% prior probability to p equals 0.5, and 6% prior probabilities to all other models.

We won't actually use these prior probabilities in the calculation of the likelihood, but they will become relevant for the calculation of the posterior in the next slide. Finally, we can calculate the likelihood as a binomial with four successes and 20 trials, when p is equal to each of the values we're considering.

The results are summarized in this table. The header row lists the models that we're considering, and in the next row, the priors we discussed earlier are shown. The last row of the table lists the likelihoods calculated using the binomial distribution. The number of successes and the number of trials are the same for each of these likelihoods, four and 20 respectively. However, each likelihood uses a different probability of success, depending on which model it is based on.
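As a sketch of this setup, here is a hypothetical stand-in for the R code using only the Python standard library (`math.comb` plays the role of R's `dbinom`):

```python
from math import comb

# Candidate models: p = 0.1, 0.2, ..., 0.9
p_values = [i / 10 for i in range(1, 10)]

# Priors: 52% on p = 0.5, the remaining 48% split equally (6% each)
priors = [0.52 if p == 0.5 else 0.06 for p in p_values]

def binom_pmf(k, n, p):
    """P(k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Likelihood of the observed data (4 pregnancies out of 20) under each model
likelihoods = [binom_pmf(4, 20, p) for p in p_values]
```

Under these assumptions the likelihood is largest at p = 0.2, matching the observed proportion 4/20.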

Â 4:30

Once the models are delineated, the priors are expressed, and the data are collected, we can use Bayes' rule to calculate the posterior probability, in other words, the probability of the model given the data. So here's a re-expression of Bayes' rule for model and data: the probability of the model given the data is the probability of the model and the data, divided by the probability of the data.
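In symbols, the rule from the narration is:

$$
P(\text{model} \mid \text{data}) = \frac{P(\text{model and data})}{P(\text{data})} = \frac{P(\text{data} \mid \text{model})\, P(\text{model})}{P(\text{data})}
$$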

Â 5:08

We can once again do all of these calculations in R. The numerator is simply the product of the vector of prior probabilities we defined earlier and the likelihood of the data given each model that we calculated. The denominator is simply the sum of the probabilities for the various models in the numerator. This mimics the calculation based on probability trees that we've seen before, where the denominator sums up the probabilities of all the possible models the data might be coming from. We also check to make sure that the posterior probabilities add up to one, which they do.
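A self-contained sketch of this posterior calculation (again a hypothetical Python stand-in for the R code):

```python
from math import comb

p_values = [i / 10 for i in range(1, 10)]  # models 0.1, ..., 0.9
priors = [0.52 if p == 0.5 else 0.06 for p in p_values]
likelihoods = [comb(20, 4) * p**4 * (1 - p)**16 for p in p_values]

# Numerator of Bayes' rule: prior * likelihood for each model
numerators = [pr * lik for pr, lik in zip(priors, likelihoods)]

# Denominator: total probability of the data across all nine models
posteriors = [num / sum(numerators) for num in numerators]

# Sanity check: the posterior probabilities add up to one
assert abs(sum(posteriors) - 1) < 1e-9
```

Here `posteriors[1]` (the model p = 0.2) comes out to about 0.4248 and `posteriors[4]` (p = 0.5) to about 0.078, matching the values quoted in the lecture.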

The posterior probabilities are summarized in this table. We can see that the posterior probability is highest at p equal to 0.2, so this model is the most likely model based on the observed data. The posterior probability at p equal to 0.2 is 42.48%. Even though we had assigned a low prior to this model, the incorporation of the data gave this model a high probability. This shouldn't be surprising, since four successes in 20 trials is exactly 20%. So the calculation of the posterior incorporated the prior information and the likelihood of the observed data, and the concept of data at least as extreme as those observed played no part in the Bayesian paradigm. Finally, note that the probability that p is equal to 0.5 dropped from 52% in the prior to about 7.8% in the posterior. This demonstrates how we update our beliefs based on observed data.

Â 7:04

The Bayesian paradigm, unlike the frequentist approach, also allows us to make direct probability statements about our models. For example, we can calculate the probability that RU-486, the treatment, is more effective than the control as the sum of the posteriors of the models where p is less than 0.5.
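Continuing the hypothetical Python sketch, that statement is a one-line sum over the posterior:

```python
from math import comb

p_values = [i / 10 for i in range(1, 10)]
priors = [0.52 if p == 0.5 else 0.06 for p in p_values]
likelihoods = [comb(20, 4) * p**4 * (1 - p)**16 for p in p_values]
numerators = [pr * lik for pr, lik in zip(priors, likelihoods)]
posteriors = [num / sum(numerators) for num in numerators]

# P(treatment more effective than control) = sum of posteriors where p < 0.5
p_treatment_better = sum(post for p, post in zip(p_values, posteriors) if p < 0.5)
```

Under these priors and data, this sum evaluates to roughly 0.92.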
