Okay. So in the previous slide our assumptions depended on having a large enough sample for the central limit theorem to apply. We can actually do an exact binomial test instead, where by "exact" I mean that the calculation uses the binomial distribution itself rather than the asymptotic distribution of the normalized sample proportion. In this case we observed 11 people with side effects in the sample, and we're testing whether the true proportion is greater than the null value. The probability of getting evidence as or more extreme than we obtained is the probability of observing 11 or more people with side effects. That is, it's the probability that X_A, the count of people with side effects for drug A, is greater than or equal to 11, which is the sum from 11 to 20 of the binomial probabilities. You might ask, where does the null hypothesis, that p equals p0 = 0.1, come in? Right here: the terms are 0.1 to the x times 0.9 (that is, 1 minus 0.1) to the 20 minus x. In other words, this calculation, the probability of getting 11 or more people with side effects out of 20, is done under the null hypothesis that p0 is 10%. So this is the probability of getting evidence as or more extreme in favor of the alternative, with the probability calculated under the null hypothesis. If you do this calculation, the result is around zero; the individual terms are all quite small. And you can do it in R very easily: pbinom(10, 20, 0.1, lower.tail = FALSE). Now, I just want to point out one small detail about pbinom.
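The calculation above can be sketched in R, both as the explicit sum from 11 to 20 and via pbinom; this just restates the lecture's numbers (11 of 20 with side effects, null p0 = 0.1) and assumes nothing beyond them:

```r
# Exact one-sided p-value: P(X >= 11) for X ~ Binomial(20, 0.1),
# computed as an explicit sum of binomial probabilities...
p_sum <- sum(dbinom(11:20, size = 20, prob = 0.1))

# ...and equivalently with pbinom(); lower.tail = FALSE gives P(X > 10),
# which is the same event as P(X >= 11).
p_tail <- pbinom(10, size = 20, prob = 0.1, lower.tail = FALSE)

p_sum   # essentially zero, as the lecture says
p_tail  # identical to p_sum
```

The two expressions agree exactly because P(X > 10) and P(X >= 11) are the same event for a count variable.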
If we didn't have lower.tail = FALSE, if we had lower.tail = TRUE or omitted it (the default is TRUE), then it would calculate the probability of 10 plus 9 plus 8 plus 7 plus 6 plus 5 plus 4 plus 3 plus 2 plus 1 plus 0. With lower.tail = FALSE, where you want the greater-than probability, it computes strictly greater than. So lower.tail = TRUE gives less than or equal to, which includes 10, while lower.tail = FALSE gives strictly greater than, starting at 11. If you run pbinom(10, 20, 0.1, lower.tail = FALSE), it calculates the probability of 11 plus 12 plus 13 plus 14 plus 15 plus 16 plus 17 plus 18 plus 19 plus 20. In other words, pbinom(10, 20, 0.1, lower.tail = TRUE) plus pbinom(10, 20, 0.1, lower.tail = FALSE) adds up to 1; the 10 is only included when lower.tail is TRUE, and when lower.tail is FALSE the sum starts at 11 and goes up. It's a small point, but you get the wrong answer if you don't handle it. If you want to avoid this discussion entirely, you can just use binom.test: we had 11 successes out of 20 trials, we want to test the hypothesis that the proportion is 0.1, and we want the alternative to be greater than; binom.test does it. binom.test is maybe a little nicer to use because it also gives you the exact confidence interval. One of the reasons this test is called exact is the following. With the asymptotic test, the alpha we use to get the normal quantile is only an approximate error rate, so if you perform a 5% level asymptotic test, the actual alpha level is not necessarily 5%.
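A minimal sketch of the tail-complementarity point and the binom.test shortcut, again using only the lecture's numbers:

```r
# The two tails of pbinom() partition the sample space:
#   lower.tail = TRUE  gives P(X <= 10), which includes 10;
#   lower.tail = FALSE gives P(X > 10), i.e. the sum starting at 11.
lower <- pbinom(10, 20, 0.1, lower.tail = TRUE)
upper <- pbinom(10, 20, 0.1, lower.tail = FALSE)
lower + upper  # exactly 1

# binom.test() wraps the same exact calculation and also reports
# an exact confidence interval.
bt <- binom.test(x = 11, n = 20, p = 0.1, alternative = "greater")
bt$p.value   # matches `upper`, the one-sided exact p-value
bt$conf.int  # exact confidence interval for the proportion
```

The one-sided p-value from binom.test is the same P(X >= 11) quantity computed by hand above, which is why the two approaches are interchangeable here.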
In fact, there's been work showing that in some cases it can be substantially higher than 5%. This exact test, on the other hand, guarantees that the alpha level is no larger than the one you pick: if you choose alpha equal to 0.05, it guarantees that the type I error rate is 5% or lower. The catch is that "or lower" part: the test is exact but conservative. In some cases with very small sample sizes, the actual level can be much less than the desired level. For a two-sided test, what I'm going to suggest is to calculate the two one-sided p-values (it should be obvious which one is smaller) and then double the smaller one. That follows the rule we've been using with normal tests, and it's good enough; there are slightly better procedures, but they change the numbers only a little. Given that we can do a two-sided test this way (or maybe by better ways), we could calculate every value of p0, say by a grid search, for which we would fail to reject the null hypothesis in our two-sided test, and that set of values yields a confidence interval. That confidence interval has guaranteed coverage: if you invert a 5% level test, it has coverage 95% or higher. All of these procedures are slightly conservative, so the coverage could be 95%, or maybe much higher, maybe 97%, if the sample size is very small. This interval has a name; it's called the Clopper-Pearson interval. The benefit is that it guarantees your coverage rate: you get coverage of 95% or higher. The problem in the "or higher" case is that you've potentially unnecessarily widened the interval, because if you want higher coverage you're generally going to get a wider interval. There's no such thing as a free lunch.
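The grid-search inversion just described can be sketched as follows. The 0.001 grid spacing is an illustrative choice, and the two-sided p-value uses the doubled-smaller-one-sided rule from the lecture; binom.test's Clopper-Pearson interval, shown for comparison, allocates alpha/2 to each tail separately, so its endpoints will differ slightly:

```r
# Invert the two-sided exact test by grid search: keep every null value p0
# for which we fail to reject at the 5% level.
x <- 11; n <- 20
grid <- seq(0.001, 0.999, by = 0.001)
two_sided_p <- sapply(grid, function(p0) {
  lo <- pbinom(x, n, p0)                           # P(X <= 11) under p0
  hi <- pbinom(x - 1, n, p0, lower.tail = FALSE)   # P(X >= 11) under p0
  min(1, 2 * min(lo, hi))                          # double the smaller tail
})
ci <- range(grid[two_sided_p > 0.05])
ci                         # grid-search exact interval for p
binom.test(x, n)$conf.int  # Clopper-Pearson interval, for comparison
```

Both intervals have coverage at least 95% by construction; the small discrepancy between them comes only from how the two tails are combined.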
Exact intervals fall under that category just like everything else: they do guarantee your error rates, but they have this tendency to be conservative. Wider intervals, and being a little less likely to reject a null hypothesis, are the consequences of using them. On the other hand, you get the assurance that the error rate is adhered to, given your assumptions. I just want to show a picture from an American Statistician paper I was involved in, based on earlier work by Agresti and Coull. What we did there was compare the coverage rate of the Wald interval against the approximate interval obtained by inverting the score test and simply plugging in 2 rather than 1.96. The lines you see are jagged because of the discreteness of the binomial distribution. Let's just look at the top row, for 95% nominal coverage. What you can see is that the solid line, the Agresti-Coull interval, hovers right around 95%: sometimes a little low, sometimes a little high, but right around it. The Wald interval can be quite a bit off. Now, this is a small sample size, so there's no reason to believe the asymptotics have kicked in. But even at a sample size of, say, 20, the closer the true value of p is to zero or one, the lower the Wald coverage can get. Here I think the plot was truncated at 70%; the coverage dips down and touches zero at p equal to zero and at p equal to one. The point is that switching away from the Wald interval, where you plug p hat into the standard error calculation, to the Agresti-Coull interval, which is a simple fix, really improves performance quite a bit, at no conceptual or computational cost.
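As a sketch of the kind of comparison shown in the figure, here is an exact coverage calculation (summing binomial probabilities over all possible outcomes, not simulating) for the Wald interval versus the Agresti-Coull "add 2 successes and 2 failures" interval at n = 20. The choice of p = 0.05 near the boundary is my own illustration, not a value taken from the paper:

```r
# Exact coverage probability of two 95% intervals for a binomial proportion.
coverage <- function(p, n = 20, z = qnorm(0.975)) {
  x <- 0:n
  phat <- x / n
  # Wald interval: plug phat into the standard error.
  wald_lo <- phat - z * sqrt(phat * (1 - phat) / n)
  wald_hi <- phat + z * sqrt(phat * (1 - phat) / n)
  # Agresti-Coull interval: add 2 successes and 2 failures, then use
  # the same Wald form with the adjusted estimate.
  ptil <- (x + 2) / (n + 4)
  ac_lo <- ptil - z * sqrt(ptil * (1 - ptil) / (n + 4))
  ac_hi <- ptil + z * sqrt(ptil * (1 - ptil) / (n + 4))
  # Sum P(X = x) over the outcomes whose interval contains the true p.
  probs <- dbinom(x, n, p)
  c(wald = sum(probs[wald_lo <= p & p <= wald_hi]),
    ac   = sum(probs[ac_lo   <= p & p <= ac_hi]))
}
coverage(0.05)  # Wald coverage collapses near the boundary; AC stays >= 95%
```

Sweeping p over a grid and plotting both coverage curves reproduces the jagged lines described above: jagged because of the discreteness of the binomial, with the Wald curve dipping badly near 0 and 1.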
So that's the point I'm trying to make here, and it relates to the earlier discussion: the score interval performs a lot better than the Wald interval in this case, and that tends to be a general rule.