0:42

You're being asked to make a decision, and there are associated payoffs and

losses that you should consider.

We'll summarize the payoff/loss information in a decision table.

Remember, the true state of the population can either be 10% yellow M&Ms or

20% yellow M&Ms.

Say that you decide that the percentage of yellow M&Ms is indeed 10% and

that is the true state of the population.

Then you made the correct choice and your boss gives you a bonus.

Say on the other hand, you decided that the true proportion of yellow M&Ms is 10%,

but the true state of the population is 20%.

You made the wrong choice, and you lose your job.

You might also decide that the percentage of yellow M&Ms is 20% and

if that is your decision and

the true percentage of yellow M&Ms is 10%, once again you've made a mistake.

So you'd lose your job.

1:38

Or if you decide that the true proportion is 20% and that is indeed the case,

you make the right decision and your boss gives you a bonus.

Obviously, you're going to be making this decision using data.

You can buy a random sample of M&Ms from the population.

Your data, as I said, will be your random sample of M&Ms from the population.

Each M&M is going to cost you $200 which is indeed pretty steep, but

remember that data collection can be pretty costly.

You pay $200 for each M&M and you must buy in $1,000 increments.

That is 5 M&Ms at a time.

You have a total of $4,000 to spend so

you may buy 5, 10, 15, or 20 M&Ms.
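The purchase options follow directly from the prices above. Here is a minimal Python sketch of that arithmetic; the variable names are illustrative and not part of the lecture.

```python
# Sketch of the purchase options described above; names are illustrative.
COST_PER_MM = 200   # dollars per M&M
INCREMENT = 1_000   # purchases come in $1,000 increments
BUDGET = 4_000      # total dollars available

mms_per_increment = INCREMENT // COST_PER_MM  # 5 M&Ms per increment

# Map each affordable sample size to what it costs.
options = {
    spend // COST_PER_MM: spend
    for spend in range(INCREMENT, BUDGET + 1, INCREMENT)
}

for n, cost in options.items():
    print(f"n = {n:2d} M&Ms costs ${cost:,}")
```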

What is the cost or the benefit of buying fewer or more M&Ms?

The benefit obviously is that as you increase your

sample size your decisions are going to be more reliable and remember that the cost

of making a wrong decision is pretty high, you could lose your job.

So you want to be fairly confident of your decision.

At the same time, though, the data collection is costly as well.

So you don't want to pay for a sample larger than you need.

If you believe that you could actually make a correct decision using a smaller

sample size, you might choose to do so and save money and resources.

3:13

Let's start with the frequentist method.

Our null hypothesis is that the proportion of yellow M&Ms is 10%.

Remember the two choices were 10% or 20%. Within the frequentist framework,

since we cannot set the parameter equal to a value in the alternative hypothesis,

we define that alternative as p is greater than 10%.

That's closer to the 20.

We also need to decide what our decision threshold,

in other words the significance level, should be.

A significance level of 5% is customary to use in literature, and in practice.

But there may be very good reasons for using a different significance level.

Remember from earlier courses in this specialization that the significance level

is the probability of a Type I error.

That is the probability of rejecting the null hypothesis

when the null is actually true.

So it makes sense to keep this rate as low as possible.

However, at the same time,

there may be benefits to using a slightly higher significance level,

as this would mean that we would be reducing our Type II error rate.

That is the probability of failing to reject the null hypothesis

when it is actually false.
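To see this trade-off in numbers, here is a rough Python sketch. It assumes a sample of 20 M&Ms and compares a 5% level with a 15% level; both of those choices are made up for illustration and are not from the lecture. The Type II error rate is computed assuming the true proportion is 20%.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(c, n + 1))

def error_rates(alpha, n, p_null=0.10, p_alt=0.20):
    # Smallest cutoff c such that rejecting when X >= c keeps the
    # Type I error rate at or below alpha.
    c = next(c for c in range(n + 2) if upper_tail(c, n, p_null) <= alpha)
    type1 = upper_tail(c, n, p_null)      # P(reject | null is true)
    type2 = 1 - upper_tail(c, n, p_alt)   # P(fail to reject | p = 0.20)
    return c, type1, type2

# n = 20 and the 15% level are illustrative assumptions.
strict = error_rates(0.05, 20)
relaxed = error_rates(0.15, 20)

for label, (c, t1, t2) in [("alpha = 0.05", strict), ("alpha = 0.15", relaxed)]:
    print(f"{label}: reject if X >= {c}, Type I = {t1:.3f}, Type II = {t2:.3f}")
```

Under these assumptions, the looser level lowers the rejection cutoff, so the Type I error rate goes up while the Type II error rate goes down.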

If the P value we calculate ends up being smaller than our significance level,

we reject our null hypothesis in favor of the alternative and

conclude that the data provide convincing evidence for the alternative hypothesis.

We mentioned that we would be working with a sample size of five, so n is five, and

since there's only one yellow M&M in the sample, k is equal to one.

Our test statistic is the number of yellow M&Ms in this sample.

Remember that the P value is the probability of the observed or

a more extreme outcome given that the null hypothesis is true.

In context, this is the probability of one or more yellow M&Ms in a random sample of

five M&Ms, assuming that the true proportion of yellow M&Ms is 0.10.

We can calculate this probability as the complement of no successes in five trials.

Let's pause for a moment and think about why this is the case.

In a sample space with five trials, you could have zero successes, one success,

two successes, three successes, four successes or five successes.

If you're interested in the number of successes being greater than or

equal to one that means that the only outcome that you're not interested

in is the number of successes being equal to zero.

Hence, the two probabilities, the probability of at least one, and

the probability of none are complements of each other.

The probability of no successes in five trials, where the probability of success for

each trial is 0.10, is 0.90 to the 5th power.

So the overall probability of at least one success comes out to be 0.41.
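The complement calculation can be checked with a few lines of Python. This is just a sketch of the arithmetic described above; the variable names are illustrative.

```python
p_null = 0.10   # proportion of yellow M&Ms under the null hypothesis
n = 5           # sample size

p_none = (1 - p_null) ** n   # P(no yellow M&Ms in 5 draws) = 0.90^5
p_value = 1 - p_none         # P(at least one yellow M&M)

print(round(p_none, 4), round(p_value, 4))  # prints 0.5905 0.4095
```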

With such a high P value, we would fail to reject the null hypothesis and

conclude that the data do not provide convincing evidence

that the proportion of yellow M&Ms is greater than 10%.

This means that if we had to pick between 10% and 20% for

the proportion of yellow M&Ms, even though this hypothesis testing procedure does not

actually confirm the null hypothesis, we would likely stick with 10% since we

couldn't find evidence that the proportion of yellow M&Ms is greater than 10%.

Next we'll try to answer the same question using a Bayesian approach.

Once again we start with our hypotheses.

The first hypothesis is that the proportion of yellow M&Ms is 10%.

And the second hypothesis is that the proportion is 20%.

Note that in the Bayesian method we can actually evaluate

the probabilities of both these models we're considering

as opposed to having to choose one as our null

and tailor our alternative hypothesis around that.

We also need to place prior probabilities on these hypotheses.

I really don't have a reason to believe one is more likely than the other so

I'm going to place a 0.5 probability on each one.

We're still working with the same data set of 5 M&Ms,

where one is yellow. The next step is to calculate the likelihood of this outcome,

one success in five trials, under the two models,

the two hypotheses that we're considering.

We can use the binomial distribution to calculate these probabilities.

The probability of one success in five trials,

where p is equal to 0.10, is roughly 0.33.
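Here is a short Python sketch of these binomial likelihoods, followed by the Bayes' rule update with the equal priors chosen above. Only the 0.33 figure for p = 0.10 is quoted in this transcript; everything else the code prints is computed by the sketch, and the function and variable names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k = 5, 1                       # 5 M&Ms, 1 of them yellow
priors = {0.10: 0.5, 0.20: 0.5}   # equal prior weight on each hypothesis

likelihoods = {p: binom_pmf(k, n, p) for p in priors}

# Bayes' rule: the posterior is proportional to prior times likelihood.
evidence = sum(priors[p] * likelihoods[p] for p in priors)
posteriors = {p: priors[p] * likelihoods[p] / evidence for p in priors}

for p in priors:
    print(f"p = {p:.2f}: likelihood = {likelihoods[p]:.4f}, "
          f"posterior = {posteriors[p]:.4f}")
```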

8:45

Here we summarize what the results would look like

if we had chosen larger sample sizes as well.

That is, if we had a sample size of 10 with 2 yellow, or

15 with 3 yellow, or 20 with 4 yellow M&Ms.

Under each of these scenarios,

the frequentist method yields a P value higher than our significance level,

so we would fail to reject the null hypothesis with any of these samples.

On the other hand, the Bayesian method always yields a higher posterior for

the second model where p is equal to 0.20.

So the decisions that we would make are contradictory to each other.
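The comparison across sample sizes can be reproduced with a short Python sketch. It assumes the 5% significance level and the equal priors mentioned earlier; the helper names are illustrative, and the printed values are computed here rather than copied from the lecture's summary.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def p_value_upper(k, n, p_null):
    """Frequentist P value: P(X >= k) when the null proportion is p_null."""
    return sum(binom_pmf(j, n, p_null) for j in range(k, n + 1))

def posterior_p20(k, n, prior_p20=0.5):
    """Posterior probability of the p = 0.20 model versus the p = 0.10 model."""
    like10 = binom_pmf(k, n, 0.10)
    like20 = binom_pmf(k, n, 0.20)
    return prior_p20 * like20 / ((1 - prior_p20) * like10 + prior_p20 * like20)

# (sample size, number of yellow M&Ms) for each scenario
samples = [(5, 1), (10, 2), (15, 3), (20, 4)]
results = [(n, k, p_value_upper(k, n, 0.10), posterior_p20(k, n))
           for n, k in samples]

for n, k, pval, post in results:
    print(f"n = {n:2d}, k = {k}: P value = {pval:.3f}, "
          f"P(p = 0.20 | data) = {post:.3f}")
```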

However, note that if we had set up our framework differently in the frequentist

method and set our null hypothesis to be p is equal to 0.20 and

our alternative to be p is less than 0.20, we would obtain different results.

This shows that the frequentist method is highly sensitive to the null hypothesis,

while in the Bayesian method,

our results would be the same regardless of the order in which we evaluate our models.
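One way to explore the reversed setup is the following Python sketch. It takes the null to be p = 0.20, with the alternative that p is less than 0.20, so the P value is the probability of the observed count or fewer yellow M&Ms. The numbers are computed by the sketch and are not quoted from the lecture; the helper names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def p_value_lower(k, n, p_null):
    """P value for a 'less than' alternative: P(X <= k) under the null."""
    return sum(binom_pmf(j, n, p_null) for j in range(0, k + 1))

# (sample size, number of yellow M&Ms) for each scenario
samples = [(5, 1), (10, 2), (15, 3), (20, 4)]
reversed_p_values = {n: p_value_lower(k, n, 0.20) for n, k in samples}

for n, pval in reversed_p_values.items():
    print(f"n = {n:2d}: P value under null p = 0.20 is {pval:.3f}")
```

With this framing the P values are again well above 5%, so we would fail to reject p = 0.20 and would likely stick with 20% instead of 10%.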