Hi everyone. We're going to be talking about graphics for power and sample size in this lecture. We want to understand power curves, how to choose a graphic that tells the desired story, and look into the different inputs that affect power.
Here's a power curve, which we've seen before in this course. It's got power on the y-axis and mean difference on the x-axis. This is one of the reference lines, at 0.8. The reference line shows us that the power curve crosses that line right around a mean difference of 0.4. Power is often graphed against mean difference because we use mean difference to describe the null and alternative hypotheses. When the mean difference is not equal to zero, like 0.2 on our graph, an alternative hypothesis applies. The x-axis of the power curve shows the many possible alternative hypotheses, revealing our uncertainty as to what the correct alternative actually is. For example, we talked about 0.2 in the last slide. A mean difference of 0.8 is another example. When the null is true, power is equal to the Type 1 error rate. In this case, it's 0.05.
Just by looking at power curves, you can see that the curve tends to flatten out as the mean difference continues to increase. This is why we like to design for the flat part of the curve, shooting for high power goals. Remember, it flattens out at about 0.8 or 0.9. We suggest aiming for high values on the flat part of the curve so that if the true value turns out to be 0.8 and you'd mistakenly specified it as 0.6, you're still going to have adequate power.
Type 1 error rate and power have a direct relationship. This means that, with everything else held fixed, as one decreases, the other decreases as well. You can see this idea represented by the orange and blue colors here. As the orange area, the Type 1 error rate, decreases and moves closer to the end of the curve, the blue area, the power, decreases as well. A lower Type 1 error rate also requires a larger sample size if you want to achieve the same power.
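To make the shape of the power curve concrete, here is a minimal sketch that computes power against mean difference. It assumes a two-sided test and a normal approximation, which is not necessarily how the curve on the slide was produced. The function name `power_two_sided` and the standard error of 0.143 are hypothetical, chosen only so that the curve crosses 0.8 near a mean difference of 0.4.

```python
from statistics import NormalDist


def power_two_sided(mean_diff, se, alpha=0.05):
    """Normal-approximation power of a two-sided test that the mean difference is zero."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    shift = mean_diff / se           # true mean difference in standard-error units
    # Chance of rejecting in the upper tail plus chance of rejecting in the lower tail.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Hypothetical standard error of the estimated mean difference (an assumption).
SE = 0.143

# Each mean difference on the x-axis is one possible alternative hypothesis.
curve = {d: power_two_sided(d, SE) for d in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)}

for d, p in curve.items():
    print(f"mean difference {d:.1f} -> power {p:.3f}")

# The same alternative tested at a stricter Type 1 error rate has less power.
print(f"alpha 0.01 at mean difference 0.4 -> power {power_two_sided(0.4, SE, alpha=0.01):.3f}")
```

At a mean difference of zero the function returns the Type 1 error rate itself, and the gains between successive points shrink as the curve approaches one, which is the flat part we like to design for.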
Now, we will talk about using graphics to get information across. The idea we want to understand here is that power curves should have a purpose. There should be at least one idea conveyed by each graph. We'll show you some examples so you know exactly what we mean.
Our graphs will be based on a study we've talked about before. It's the group-randomized trial in which workplaces were either in a treatment group, where they took part in a training program that encouraged reducing alcohol consumption, or in a control group, where there was no training of any kind. The outcome that was measured for each worker was the drinking rate. Here's the flow chart of the study for your reference. The null hypothesis is that there's no difference in post-treatment drinking rates between workers in the two groups.
Here are more characteristics of the study, as you may remember. The workplace was our independent sampling unit and the drinking rate was the unit of observation. There was a between factor, which is treatment or intervention, and a within factor, cluster membership, inducing correlation within the workplaces. We looked at the post-treatment drinking frequency in the two groups, hoping that the group that took part in the drinking educational training program, which is represented by the second bar in this graph, showed reduced drinking.
Remember that the drinking rates of workers within each workplace are going to be correlated. This is due to their shared experiences in the same workplace. The researchers were able to make two important assumptions as a result of the group randomization. Remember that these assumptions are that the correlation between workers in one group is the same as the correlation between workers in another group, and that any pre-existing factors did not influence or bias the study.
So, the question we must ask ourselves is, what is the best way to present the power analysis? What idea or ideas are we trying to convey that can be represented through graphs? Let's explore a few questions and the graphs that would go along with them.
One question you can explore is how power and intracluster correlation, or the correlation within a cluster, are related. We can use a graph to show how power changes based on different intracluster correlation values. Take the example described here. We have 20 workplaces in each group with 10 workers in each workplace, and a fixed Type 1 error rate and standard deviation. Based on all of this, how will power be affected by intracluster correlation? We get our answer by looking at the graph. As you can see, we have separate power curves for different values of the intracluster correlation. This graph answers our question. The power decreases as intracluster correlation increases, meaning they have an inverse relationship. This is why we keep telling you to be aware of who's correlated with whom, because it makes a big difference in power. Hopefully, you can see that the graph conveys this message pretty strongly.
Let's look at another question. If we're given a cluster size, what is the effect of the intracluster correlation on power? In this situation, we have five workplaces in each group, a standard deviation of one, a set mean of 0.75 in a cluster of 10, and a Type 1 error rate of 0.05. We're just starting with different inputs so we can get different information. Here, we are looking at intracluster correlations from zero to 0.1. Also, this time, the power is graphed against cluster size, which is why we get a different-looking curve. You can see that as the cluster size goes up, the power goes up. That makes sense because the bigger the sample size, the more power we have.
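As a rough illustration of both questions, the sketch below computes power for a comparison of two groups of clusters using the usual design effect, 1 + (cluster size - 1) x intracluster correlation, together with a normal approximation. This is a swapped-in approximation, not the method behind the graphs in the lecture. The function name `cluster_power`, the mean difference of 0.3 in the first example, the default standard deviation of 1 and Type 1 error rate of 0.05, and the reading of the "set mean of 0.75" as a mean difference are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def cluster_power(mean_diff, icc, clusters_per_group, cluster_size, sd=1.0, alpha=0.05):
    """Approximate power for comparing two groups of equally sized clusters."""
    # Design effect: how much correlated workers inflate the variance.
    design_effect = 1 + (cluster_size - 1) * icc
    workers_per_group = clusters_per_group * cluster_size
    # Standard error of the difference between the two group means.
    se = sd * sqrt(2 * design_effect / workers_per_group)
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = mean_diff / se
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


# Question 1: 20 workplaces per group, 10 workers per workplace.
# The mean difference of 0.3 is hypothetical.
for icc in (0.0, 0.05, 0.1):
    print(f"ICC {icc:.2f} -> power {cluster_power(0.3, icc, 20, 10):.3f}")

# Question 2: 5 workplaces per group, power against cluster size,
# one row per intracluster correlation (0.75 treated as the mean difference).
sizes = (5, 10, 25, 50)
print("ICC   " + "  ".join(f"m={m:<3}" for m in sizes))
for icc in (0.0, 0.05, 0.1):
    row = "  ".join(f"{cluster_power(0.75, icc, 5, m):.3f}" for m in sizes)
    print(f"{icc:<5} {row}")
```

Reading down a column shows power dropping as the intracluster correlation rises, and reading across a row shows power rising with cluster size.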
Now look at what's happening with the different intracluster correlation coefficients on that graph. The highest power is associated with an intracluster correlation coefficient of zero, as you would expect. We expect this because independent people provide more information than clusters of correlated people. As the people in a cluster become more related to each other, they provide less information, and the power drops. So, the more highly correlated people are with one another, the less information they provide and the lower the power. Once again, an inverse relationship between power and intracluster correlation.
Next question. What about cluster size and power? Do you want to pick a larger cluster or a smaller cluster? Here, you can review the details, but let's move on to the graph to find out. Here, once again, we see power graphed against cluster size. Power increases as cluster size increases. This is a direct relationship. As the cluster size goes up, the power increases.
Let's move on to standard deviation. What does variance do to your power? Check out this graph. As variance increases, power decreases. There's an inverse relationship between power and variance. One way to think about this is that variance is essentially noise. The more noise there is, the less clear things are, hence less power.
This time we're going to look at which cluster size we should choose given the mean difference. Given the mean difference values, what cluster size should we choose? This graph tells us that the cluster size of 50, the largest cluster size shown, has the most power, which makes sense because the sample size has gone up. It's not doing that much good, though, because you can see that a cluster size of 25, and even 10, is also pretty close.
This time, we're looking at how the Type 1 error rate affects power given the mean difference. Here we see three Type 1 error rates.
The graph shows that power increases as the Type 1 error rate increases, telling us that the two have a direct, positive relationship.
The final question we will look at and try to answer with a graph is, is there a big difference in power between two designs with different Type 1 error rates? Let's take a look. Here we're looking at Type 1 error rates of 0.01 and 0.05, which are both reasonable. As you can see here, there's a horizontal line drawn at a power of 0.95, and you can see the vertical lines showing the mean difference where each curve reaches it. The lower curve is the Type 1 error rate of 0.01. So, with that curve we need a slightly larger mean difference of 0.6, versus the mean difference of 0.5 that we'd need if our Type 1 error rate were equal to 0.05. It really does not make much of a difference.
Let's go through what we learned from these graphs. Power and intracluster correlation have an inverse relationship. As one increases, the other decreases. Power and cluster size have a direct relationship; they both move in the same direction. Power and standard deviation have an inverse relationship. Power and Type 1 error rate have a direct relationship.
Let's do a quick review. Power curves are often graphed against mean difference values because that is how we describe the null and alternative hypotheses. We use these power curves to tell stories and convey ideas that we want to let others know about. Here at the bottom, you can see the relationships between the different design inputs and power. That's it for this lecture. Thank you for your time.
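For anyone who wants to check the four relationships from the recap numerically, here is a minimal sketch under a normal approximation. It is not the software used for the lecture graphs. Every value in the `base` design, and the names `power` and `detectable_difference`, are hypothetical. The variance of a cluster mean is written as sd^2 x (ICC + (1 - ICC) / cluster size), which is the design-effect formula rearranged.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def standard_error(sd, icc, clusters_per_group, cluster_size):
    """Standard error of the difference between two group means of cluster means."""
    # The icc term does not shrink as clusters grow, so bigger clusters help less and less.
    cluster_mean_var = sd ** 2 * (icc + (1 - icc) / cluster_size)
    return sqrt(2 * cluster_mean_var / clusters_per_group)


def power(mean_diff, sd, icc, clusters_per_group, cluster_size, alpha):
    """Two-sided power under a normal approximation."""
    shift = mean_diff / standard_error(sd, icc, clusters_per_group, cluster_size)
    crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(shift - crit) + Z.cdf(-shift - crit)


def detectable_difference(target_power, sd, icc, clusters_per_group, cluster_size, alpha):
    """Mean difference needed for target_power, ignoring the negligible far tail."""
    se = standard_error(sd, icc, clusters_per_group, cluster_size)
    return (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(target_power)) * se


# Hypothetical base design; change one input at a time and watch the power.
base = dict(mean_diff=0.5, sd=1.0, icc=0.05, clusters_per_group=10, cluster_size=10, alpha=0.05)
print(f"base design             -> power {power(**base):.3f}")
for name, value in (("icc", 0.10), ("cluster_size", 25), ("sd", 1.5), ("alpha", 0.01)):
    changed = {**base, name: value}
    print(f"{name:>12} = {value:<5} -> power {power(**changed):.3f}")

# Mean difference needed for power 0.95 at the two Type 1 error rates.
design = {k: v for k, v in base.items() if k not in ("mean_diff", "alpha")}
d05 = detectable_difference(0.95, alpha=0.05, **design)
d01 = detectable_difference(0.95, alpha=0.01, **design)
print(f"alpha 0.05 needs {d05:.3f}, alpha 0.01 needs {d01:.3f}, ratio {d01 / d05:.2f}")
```

Under this approximation the ratio of the two required mean differences depends only on the Type 1 error rates and the target power, not on the design. It comes out near 1.17, which is in line with the 0.5 versus 0.6 read off the graph.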