In this lesson and the next, we will work on inference for a proportion using the same data set. We will use a frequentist approach in this lesson and a Bayesian approach to address the same question in the next lesson.
Our study addressed the question of whether the controversial drug RU-486 could be an effective morning-after contraceptive. The study participants were 40 women who came to a health clinic asking for emergency contraception. Investigators randomly assigned 20 women to receive RU-486, and the remaining 20 to receive standard therapy, consisting of high doses of the sex hormones estrogen and a synthetic version of progesterone. Of the women assigned to RU-486, the treatment group, 4 became pregnant. Of the women who received the standard therapy, the control group, 16 became pregnant. The question we want to answer is, how strongly do these data indicate that the treatment is more effective than the control?
Here's the framework. To simplify matters, let's turn this problem of comparing two proportions into a one-proportion problem. Consider the 20 total pregnancies, and ask how likely it is that 4 of them come from the treatment group. If the treatment and control are equally effective, and the sample sizes for the two groups are the same, then the probability that a given pregnancy came from the treatment group is simply 0.5.
In the frequentist approach, we first need to set our hypotheses. But before that, let's define the parameter of interest we will use in these hypotheses. Let p be the probability that a given pregnancy comes from the treatment group. Then the null hypothesis is that p is equal to 0.5, which says that there is no difference between the treatment and control groups, and a pregnancy is equally likely to come from either group. The alternative hypothesis is that p is less than 0.5, which says that the treatment is more effective, and a pregnancy is less likely to come from the treatment group.
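Written compactly, the two hypotheses look like this. The symbols H_0 and H_A are conventional shorthand for the null and alternative hypotheses and are assumed here for illustration; p is the probability that a given pregnancy comes from the treatment group, as defined above.

```latex
\[
  H_0 : p = 0.5
  \qquad \text{versus} \qquad
  H_A : p < 0.5
\]
```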
To make a decision within the frequentist paradigm, we need a p-value. Remember from earlier courses in the specialization that a p-value is the probability of the observed or a more extreme outcome, given that the null hypothesis is true. And when we say more extreme, we mean more extreme in the direction of the alternative hypothesis. The outcome in this experiment is 4 successes in 20 trials, where a success is a pregnancy that comes from the treatment group. Since the null hypothesis states that the probability of success is 0.5, we can calculate the p-value as the probability of obtaining 4 or fewer successes in 20 trials where the probability of success is 0.5.
This probability can be calculated exactly with a binomial distribution. Remember also from earlier courses in this specialization that the number of successes in a fixed number of independent trials, where each trial has two possible outcomes that can be defined as a success or a failure, follows a binomial distribution with two parameters, n and p. In this case n is 20 and p is 0.5. And we're looking for 4 or fewer successes, which can be written as the probability that k, the number of successes, is at most 4. Using R, we can calculate this probability as 0.0059.
This means that the chance of observing 4 or fewer pregnancies in the treatment group, given that a pregnancy was equally likely to come from either group, is approximately 0.0059. With such a small probability, we would reject the null hypothesis and conclude that the data provide convincing evidence for the treatment being more effective than the control.
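The p-value can be checked with a few lines of code. The lecture uses R, where a call such as `pbinom(4, size = 20, prob = 0.5)` returns this probability. The sketch below is a rough Python equivalent using only the standard library. The helper name `binom_cdf` and the variable names are illustrative assumptions, not something taken from the course.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term.

    Hypothetical helper for illustration; not part of the course materials.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_pregnancies = 20          # total pregnancies across both groups
observed_in_treatment = 4   # pregnancies that came from the treatment group
p_null = 0.5                # probability under the null hypothesis

# One-sided p-value: probability of 4 or fewer successes in 20 trials
p_value = binom_cdf(observed_in_treatment, n_pregnancies, p_null)
print(round(p_value, 4))  # 0.0059
```

Summing the binomial probabilities for k = 0 through 4 gives 6196 / 2^20, which rounds to the 0.0059 quoted in the lecture.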