I'd like to highlight some of the differences between monitoring safety and efficacy data. We have different motivations for monitoring each. For efficacy data, our primary concern is drawing the wrong conclusion. This could be saying that treatments are different when they're actually the same, called a type I error, or saying they're the same when they're actually different, a type II error. We're also concerned about the loss of knowledge and of the opportunity to achieve all of our goals: we could end up with an incomplete side-effect profile, restricted subgroup analyses, evidence that isn't considered convincing, or label restrictions.

The motivations for safety data are different. Our primary concern is participant harm: is the participant being put at undue risk by being in the trial? This could be on a relative scale, one treatment versus another in the trial, or on an absolute scale, where there's a threshold for how much harm would be allowed. Our loss function is also different: we're more worried about missing a signal than about incorrectly saying there's a difference.

These different motivations lead to different monitoring techniques. For efficacy data, we have formal statistical monitoring at prespecified intervals, with prespecified criteria for stopping; a small sketch of one common way to set up such criteria appears after this discussion. Typically we also include summary statistics and figures examined throughout the study to ensure that we can accurately assess the risk-benefit profile. You can't weigh the risks against the benefits if you don't have both pieces of the puzzle. Ongoing review also supports data quality monitoring, catching outliers or unusual values that might not be expected and might not be accurate.

Safety data, by contrast, is typically monitored continuously, often without formal statistical testing. There may be prespecified rules for stopping if there are known risks, but there may also be ad hoc rules for stopping. As United States Supreme Court Justice Potter Stewart said, "I know it when I see it." He said that with respect to pornography; we say it with respect to safety. We'll recognize something concerning when we see it, and we want the freedom to stop and prevent harm.

There are different tiers of safety data and types of adverse events. Tier 1 events are generally prespecified in the hypotheses. These are known risks, for example, weight gain or systemic side effects with oral corticosteroids. Tier 2 events are safety issues noted during the trial. Although hypotheses are not pre-planned, if these events occur with high enough frequency during the trial, they could be a concern. An example that recently occurred in a trial I'm working on was the discovery that the needle point on the syringe delivering the medication can sometimes break off, so it became very important for us to monitor whether this had happened and whether there were any bad side effects. Tier 3 includes rare reported events. These are events that are not expected and do not occur frequently, but that may be serious enough that we want to keep track of them.

Different types of analyses are used depending on the tier. Formal statistical hypothesis testing is typically done for tiers 1 and 2. For a single event, this could include a Kaplan-Meier curve to look at the cumulative incidence of the event, or a Cox proportional hazards model. For multiple events that could occur repeatedly over time, we might use negative binomial or Poisson regression. Sketches of both kinds of analysis appear below, after the efficacy-boundary example.
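To make the efficacy-monitoring point concrete, here is a minimal sketch of an O'Brien-Fleming-type alpha-spending schedule (Lan-DeMets), one common way to prespecify interim stopping criteria. The four equally spaced looks, the two-sided alpha of 0.05, and the use of Python with scipy are my assumptions for illustration, not part of the original discussion; converting the spent alpha into actual stopping boundaries requires the joint distribution of the interim test statistics, which dedicated software (for example, the R packages gsDesign or rpact) handles.

```python
# Hedged sketch: O'Brien-Fleming-type alpha-spending (Lan-DeMets).
# Assumes four equally spaced looks and a two-sided alpha of 0.05.
from scipy.stats import norm

alpha = 0.05
z = norm.ppf(1 - alpha / 2)           # final-look critical value, ~1.96

for t in (0.25, 0.50, 0.75, 1.00):    # fraction of total information at each look
    spent = 2 * (1 - norm.cdf(z / t ** 0.5))
    print(f"look at t = {t:.2f}: cumulative alpha spent = {spent:.5f}")

# Early looks spend almost none of the alpha, so only an extreme result
# stops the trial early; nearly all of the 0.05 remains for the final look.
```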
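For the tier 1 and 2 analyses just described, here is a minimal sketch of the single-event case: a Kaplan-Meier cumulative incidence curve by arm plus a Cox proportional hazards model. The data, the choice of the lifelines library, and all variable names are hypothetical, purely for illustration.

```python
# Hedged sketch: time-to-first-event safety analysis with hypothetical data.
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

# Invented per-participant data: follow-up time in days, whether the
# event occurred, and treatment arm (1 = active, 0 = control).
df = pd.DataFrame({
    "days":  [30, 90, 180, 45, 365, 365, 120, 365, 60, 365],
    "event": [1,  0,  1,   1,  0,   0,   1,   0,   1,  0],
    "arm":   [1,  1,  1,   1,  1,   0,   0,   0,   0,  0],
})

# Kaplan-Meier fits by arm; the cumulative density (1 - survival) shows
# the cumulative incidence of the event over time.
kmf = KaplanMeierFitter()
for arm, group in df.groupby("arm"):
    kmf.fit(group["days"], group["event"], label=f"arm {arm}")
    kmf.plot_cumulative_density()

# A Cox proportional hazards model estimates the hazard ratio for treatment.
cph = CoxPHFitter()
cph.fit(df, duration_col="days", event_col="event")
cph.print_summary()  # hazard ratio for "arm" with a confidence interval
```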
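And for events that can occur repeatedly, a matching sketch of a Poisson regression with an exposure offset; swapping in a negative binomial family is the usual adjustment when the counts are more variable than a Poisson model allows. Again, the data and the choice of statsmodels are assumptions for illustration.

```python
# Hedged sketch: recurrent-event safety analysis with hypothetical counts.
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Invented per-participant data: number of events and years on study.
df = pd.DataFrame({
    "events": [0, 2, 1, 4, 0, 1, 0, 3, 2, 0],
    "years":  [1.0, 0.9, 1.0, 0.5, 1.0, 1.0, 0.8, 1.0, 1.0, 1.0],
    "arm":    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
})

X = sm.add_constant(df["arm"])
# The log-exposure offset makes the model estimate rates per person-year.
# Use family=sm.families.NegativeBinomial() instead for overdispersed counts.
model = sm.GLM(df["events"], X, family=sm.families.Poisson(),
               offset=np.log(df["years"])).fit()
print(model.summary())  # exp(coef on "arm") is the event-rate ratio
```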
Summary statistics and figures are typically used for tier 3, since not enough events occur to support formal analysis. These are also sometimes included for tiers 1 and 2 to get a better picture of what's going on.

I want to give some tips for monitoring safety data. First, there's the question of data collection. It's preferred that you ask all participants at every visit whether they've had an event. This avoids preferential reporting bias, which arises from asking only one group whether something has happened; that would make it more likely that that group reports a higher number of events just because you asked. If you ask everyone at each visit, all treatment arms receive the same scrutiny and you're less likely to have bias. An alternative is to rely on self-reported or incident findings. This may be necessary if an event is unexpected; of course, if you see an unexpected event occurring regularly, you can always switch to asking about it systematically.

The next point is that it's important to monitor both relative and absolute risk, because they show different things. In one scenario, the relative risk of two drugs may be very similar, but if the absolute risk, how frequently the event happens, is unacceptably large in both, that's still a problem, and you would miss it if you only looked at the relative risk. In a second scenario, the relative risk is different but the absolute risk is very small; in that case, you're not as worried, because not many participants are actually at risk. A toy calculation illustrating both scenarios follows.
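The two scenarios are easy to see with a small worked example; the counts here are invented purely for illustration.

```python
# Toy illustration of why both relative and absolute risk matter.
def risks(events_a, n_a, events_b, n_b):
    """Return both risk scales for two arms with the given event counts."""
    p_a, p_b = events_a / n_a, events_b / n_b
    return {"risk_a": p_a, "risk_b": p_b,
            "relative_risk": p_a / p_b, "risk_difference": p_a - p_b}

# Scenario 1: similar relative risk, but both absolute risks are high.
print(risks(300, 1000, 280, 1000))
# -> relative risk ~1.07, yet 30% vs 28% of participants harmed: a problem
#    you would miss by looking at the relative risk alone.

# Scenario 2: large relative risk, but tiny absolute risks.
print(risks(4, 1000, 1, 1000))
# -> relative risk 4.0, yet only 0.4% vs 0.1% affected: less alarming,
#    because few participants are actually at risk.
```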