[MUSIC] An important feature of trial monitoring is the inclusion of interim analyses. The motivation behind an interim analysis is the question of whether it is ethical to continue a trial when we already know the answer, that is, when efficacy has been proven or disproven, or when we are unlikely ever to get an answer, which we call futility. A treatment may also be unsafe, either on a relative scale compared to other treatments or on an absolute scale compared to a maximum allowed threshold, in which case we would stop for safety. Interim analyses are designed to allow the trial to stop early. We have to balance the knowledge we could gain by continuing the trial against the risk to the participants of making them continue. This is really a question of the good of the population, in terms of learning more, versus the good of an individual who is actually in the trial.

When deciding whether to include interim analyses, there is a trade-off between the benefits and the disadvantages. The benefits are mainly in terms of efficacy, safety, and administrative gains. For efficacy, if you stop for futility, you don't waste resources that could be used on other therapies, and if you stop for efficacy, you can provide patients with access to the treatment more quickly. For safety, of course, you don't expose participants to unnecessary harm. We don't tend to think about the administrative side as much, but it is also very important. We could use this information to decide what steps we need to take to get ready to market the drug, for example, build a factory, train a sales force, or create a distribution infrastructure. Or we could prepare to stop research in that field, focus on other fields, and create grants and other projects to move ahead.

Now, of course, there are disadvantages. If you stop early, you are making a decision based on a small amount of data, less than you originally planned, so you increase your chance of making a mistake. There are two types of mistakes you can make. A type I error occurs when you reject the null hypothesis when it is actually true; for example, you could say that two treatments were different when they actually work equally well. A type II error occurs when you fail to reject the null hypothesis when it is false; in that case, we would miss a treatment that actually was superior and say that it wasn't.
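To make that first disadvantage concrete, here is a minimal simulation, my own illustration rather than anything from this lecture, of what happens if a trial takes repeated interim looks using the ordinary fixed-sample critical value with no adjustment: the chance of a false positive climbs far above the nominal 5%.

```python
# A minimal sketch (illustration only, not from the lecture): simulate trials
# in which the null hypothesis is true (no treatment difference) and test the
# accumulating data at five interim looks, naively reusing the fixed-sample
# two-sided 5% critical value of 1.96 at every look.
import numpy as np

rng = np.random.default_rng(42)
n_trials = 20_000                     # simulated trials under the null
n_per_arm = 500                       # final sample size per arm
looks = [100, 200, 300, 400, 500]     # participants per arm at each look

false_positives = 0
for _ in range(n_trials):
    treat = rng.normal(0.0, 1.0, n_per_arm)   # null: both arms identical
    ctrl = rng.normal(0.0, 1.0, n_per_arm)
    for n in looks:
        # Z statistic for the difference in means at this interim look
        z = (treat[:n].mean() - ctrl[:n].mean()) / np.sqrt(2.0 / n)
        if abs(z) > 1.96:
            false_positives += 1
            break                     # the trial "stops early" on a false signal

print(f"Type I error with 5 naive looks: {false_positives / n_trials:.3f}")
# Prints roughly 0.14, nearly triple the nominal 0.05.
```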
Stopping early also means a loss of knowledge and of the opportunity to achieve some of our goals. We may not have a complete side-effect profile. We may have a very small sample size, which would mean we couldn't perform subgroup analyses, looking for differences between, for example, males and females, or across other characteristics we think are important. The evidence may not be considered convincing by the medical community; if your results differ from what people expect, it will be harder to convince the skeptics with a smaller sample size. From a drug industry perspective, there may also be restrictions on the label from not having some of this information.

The third point I want to bring up is that we must assume the data we have actually collected are representative of the population as a whole. It is always possible that there are delayed effects, in terms of safety or efficacy, that you won't see with a shorter follow-up or fewer people. In addition, the characteristics of the participants you recruit may change over time, especially if your study competes with others for recruitment, since your population may shift as those studies open and close. For example, if a competing study is recruiting all the healthy patients at your clinic, then once that study is done, you will get a different distribution of participants than you did at the beginning of your study. So you have to make sure that the people in your trial remain representative.

Finally, it is more difficult to analyze and interpret the results when you stop early. If the interim analysis is based on an unadjusted statistical model, not taking into account factors like age, gender, or race that you plan to adjust for in the final model, you may get a different result from the interim analysis than from the final model. So it is important to make sure that your interim analysis matches your final analysis. In addition, there is the potential for bias or overestimation of your treatment effect.

I also want to discuss some of the barriers to performing an interim analysis. The first is the issue of enrollment versus endpoint maturity. You do not want to plan your interim analysis so late that all of the participants have already enrolled by the time you perform it; at that point, everyone has been treated. You may also be forced to use surrogate endpoints rather than the primary efficacy endpoint. These surrogates are substitutes, not what you will use in your final evaluation, so there may be differences: if the surrogate you use to decide whether to continue does not match your final outcome, you could get conflicting results. The second issue is data timeliness. It is a lot of work to clean, evaluate, and monitor the endpoints you collect. For example, you might have centralized adjudication of images; before you can do your interim analysis, you will need the data back from that adjudication board so that you can actually make your decision.

So how do we decide what to do? The key is balancing knowledge against patient care: do the advantages gained by stopping early outweigh the disadvantages and potential risks? The types of interim analyses you choose will depend on a number of factors. How likely is it that you would stop early? If it's unlikely, why do it? Is it feasible? Do you have the appropriate timeline, data collection mechanisms, and capabilities to perform the analysis? The stage of design also plays an important role: in an early trial, where you just want to learn about a new treatment or device, you probably don't want to stop early, whereas later on, when you are confirming that a device or medicine works, stopping early may be more important. It also depends on the availability and efficacy of other treatments. If no other treatments are available, you may be willing to continue with a drug or device that is not looking too promising, in the absence of any other options; by comparison, if there are many other options available, you may be more inclined to stop early. And finally, there is the question of the severity of the disease. In cancer, for example, we would be more likely to stop a trial early for futility, because there is urgency in getting the participants onto an effective treatment.
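When a trial does plan formal efficacy looks, the stopping rules are built so that the early looks do not inflate the type I error. As a rough sketch of the idea, my own illustration and not a method described in this lecture, a Pocock-style boundary applies one raised critical value at every look, calibrated so that the overall false positive rate stays at 5%. Here it is calibrated by simulation:

```python
# A sketch (illustration only) of calibrating a Pocock-style group sequential
# boundary: search for the single critical value c, applied at every interim
# look, that holds the overall type I error across all looks at 5%.
import numpy as np

N_TRIALS, N_PER_ARM = 10_000, 500
LOOKS = [100, 200, 300, 400, 500]     # participants per arm at each look

def overall_type1(c: float) -> float:
    """Simulated probability, under the null, of |Z| > c at any look."""
    rng = np.random.default_rng(0)    # fixed seed: same trials for every c
    hits = 0
    for _ in range(N_TRIALS):
        treat = rng.normal(0.0, 1.0, N_PER_ARM)
        ctrl = rng.normal(0.0, 1.0, N_PER_ARM)
        if any(abs((treat[:n].mean() - ctrl[:n].mean()) / np.sqrt(2.0 / n)) > c
               for n in LOOKS):
            hits += 1
    return hits / N_TRIALS

# Bisection on the critical value: raise c while the error is still above 5%.
lo, hi = 1.96, 3.0
for _ in range(12):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if overall_type1(mid) > 0.05 else (lo, mid)

print(f"Calibrated per-look critical value: {(lo + hi) / 2:.2f}")
# Lands close to Pocock's published constant of about 2.41 for 5 looks.
```

O'Brien-Fleming-style boundaries follow the same logic but set much stricter thresholds at the earliest looks, which is one reason well-designed trials rarely stop at the first interim analysis.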
I want to give you an example of a trial in which there was some controversy about the interim analyses: the Study of Acid Reflux in Children with Asthma, called SARCA, on which I worked as the biostatistician. The motivation for the trial was the fact that asthmatic patients were being prescribed proton pump inhibitors to treat asymptomatic gastroesophageal reflux, known as GERD, which is common in asthmatics. The question being raised, however, was whether this treatment actually helps control asthma; we should note that prescribing it was a common practice at the time. The design recruited children with poorly controlled asthma but without symptoms of GERD and randomized them to receive either a placebo or a proton pump inhibitor. The primary outcome was the Asthma Control Questionnaire, which measures the severity of their symptoms.

The original monitoring plan called for a single interim efficacy analysis, with the goal of stopping if the proton pump inhibitor were proven to be either better or worse than the placebo, that is, than no treatment. During the trial, the data and safety monitoring committee, which is responsible for independent oversight of the study, requested that we add an ad hoc futility analysis. Their motivation was that a recently completed trial in adults had shown no difference between the placebo and the proton pump inhibitor, and the accumulating data in our trial looked similar.

However, there were a number of arguments against stopping the trial for futility. First, there was no harm to the participants: we were not changing their asthma medication, and they did not have any symptoms of GERD and therefore did not require direct treatment. Second, since this was a commonly prescribed medication, we would need overwhelming evidence to change clinical practice, and if we stopped early, we might not gain that evidence. And finally, we were performing tests to determine whether or not these children had asymptomatic GERD, and there would never be another trial able to measure the proportion of children with that condition as precisely. In the end, the decision was to continue the trial as designed. Although we had set up the futility analysis, for the reasons described above its results were never examined and were not used to make the decision. This is an example of why it is so important to consider all factors before including or adding an interim analysis.
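As context for what a futility analysis typically computes, here is a general sketch of conditional power, the probability of reaching a significant final result given the data accumulated so far. To be clear, this illustrates the common technique, not SARCA's actual statistical plan, which is not detailed in this lecture.

```python
# A sketch of a conditional-power futility calculation (the common technique,
# NOT SARCA's actual plan): given the interim data, what is the chance that
# the final analysis will be statistically significant?
from scipy.stats import norm

def conditional_power(z_interim: float, info_frac: float,
                      z_final: float = 1.96) -> float:
    """P(final Z > z_final | interim Z), assuming the effect estimated from
    the interim data (the 'current trend') continues for the rest of the trial."""
    drift = z_interim / info_frac ** 0.5        # standardized effect estimate
    b_interim = z_interim * info_frac ** 0.5    # interim Z on the B-value scale
    remaining_mean = drift * (1 - info_frac)    # expected drift still to come
    remaining_sd = (1 - info_frac) ** 0.5
    return 1 - norm.cdf((z_final - b_interim - remaining_mean) / remaining_sd)

# Example: halfway through the trial (info_frac = 0.5) with a weak signal.
print(f"{conditional_power(z_interim=0.5, info_frac=0.5):.2f}")
# Prints about 0.04; many monitoring plans treat conditional power below
# 0.10 to 0.20 as futile, though those thresholds are judgment calls.
```

Whatever such a calculation shows, the SARCA experience is a reminder that the numbers are only one input to the decision.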
So what should you consider when planning an interim analysis? First, this is really a team effort; it is not just a job for the statistician. Many people think of it as a statistical decision, because the rules are generally built on statistics, but as you have seen, weighing the evidence you would gain against the risks to the participants also belongs in the hands of the clinicians and the other people involved in the trial. Second, ask yourself when you would actually want to stop, and consider this before you implement an interim analysis; as the example above showed, we initiated an interim analysis and then realized it was not appropriate, so be careful to make that consideration first. Third, do not just assume that whatever interim analyses or rules you used for one trial will automatically fit the next trial; it is not one size fits all. You should also consider what you are giving up. Are you giving up knowledge? Are you giving up some of your type I error, your chance of making the mistake of incorrectly saying that something works when it doesn't? Finally, consider the feasibility of implementing your stopping rules: the questions of endpoint maturity versus recruitment and of surrogates versus the final outcome should also play a role there. [MUSIC]