Randomization is our key method to prevent confounding and, subsequently, bias. Let's jump right in and discuss randomization. Randomization is an explicit method that uses the play of chance to assign participants to comparison groups in a trial. It is a process where treatment assignment follows no deterministic pattern but rather a probabilistic distribution. Each new study participant has a 50/50 chance of receiving either of, for example, two interventions in a two-armed trial.

Before we can randomize, we really need to make sure that we've established clinical equipoise. Curtis Meinert defined clinical equipoise like this: "A randomized trial arises from a climate of clinical equipoise. Its ethical base rests on the presence of a collective state of doubt regarding the course of treatment, sufficiently balanced to justify randomization." There must be serious doubt as to which of the treatments is better. Without that doubt, there can be no randomization, because we cannot randomize people to a treatment that we know or suspect is inferior to the comparison group.

Why do we randomize? We want to minimize confounding by known and unknown factors that may cause the outcome, and we want to minimize selection bias. Selection bias can happen when participants are assigned to one group or another in a way that affects the outcome. For example, you could imagine giving the healthier or the sicker patients the treatment that you think is better. Randomization prevents this because neither you as the investigator nor the study participant decides on the treatment; the randomization does.

The effect of randomization is that it produces comparable treatment groups, but it also defines the entry into the trial. From the point of randomization on, you are in the trial: you are contributing to the data that will make up the trial results, and you are exposed to the study treatment. When I refer to the study or trial treatment, I always mean the active treatment and the comparison group, which might be another active drug, or it might be a placebo. Randomization also helps us with the assumptions for our statistical tests.

Does randomization eliminate all confounding or bias? Not all bias is minimized by randomization. You may imagine that there are baseline differences that are due to chance. Remember, this is a probabilistic distribution; just by chance, you may have differences between the groups at baseline. There might also be differences in dropout proportions between the groups. You may imagine that participants on a pharmaceutical treatment experience some side effects and drop out of the trial more frequently than participants on another drug or on a placebo. You could also introduce bias yourself when you measure outcomes differently between people or between the groups, so the outcome should always be measured in the same way.

One question that comes up often is: when we see baseline differences, is this a failure of randomization? No, it is not a failure of randomization, because we would expect differences between the groups, and these may be more pronounced when we randomize fewer people. Imagine you randomized 10 people; you would expect quite a number of differences between your two groups of five and five. When you randomize 2,000 people, those differences will probably be much less pronounced.
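To see how much imbalance pure chance can produce, here is a minimal simulation sketch; it is not from the lecture, and the function name, seed, and trial counts are illustrative choices. It repeats simple 1:1 randomization and reports the most lopsided split observed, showing that relative imbalance shrinks as the sample size grows:

```python
import random

def worst_split(n_participants, n_trials=50, seed=2024):
    """Repeat simple 1:1 randomization and return the most lopsided
    allocation observed, as the share of the larger group."""
    rng = random.Random(seed)  # seeded so the experiment is reproducible
    worst = 0.0
    for _ in range(n_trials):
        in_a = sum(rng.random() < 0.5 for _ in range(n_participants))
        worst = max(worst, max(in_a, n_participants - in_a) / n_participants)
    return worst

# Relative imbalance shrinks as the number randomized grows.
for n in (10, 100, 2000):
    print(f"n={n:4d}: largest group got {worst_split(n):.0%} of participants")
```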
We typically see those differences in Table 1 of a scientific paper, which shows baseline characteristics like weight, or height, or a certain biomarker, or something along those lines. It is current convention not to attach p-values to Table 1 of a clinical trial publication, and the reason is simple: we expect there to be a few differences between the groups. We know that these differences are due to chance, because we've randomized. There is no use in showing that these differences are truly due to chance; we know that already.

Let's look at a real-world example. A friend of mine sent me an email and explained a study evaluating a new procedure versus the old procedure in an intensive care unit setting. This is how he described the procedure: during the day, the new procedure was usually used. At night, the physician was often alone without support staff and did not use the new procedure very often. The result was a one-to-one allocation, new procedure versus old procedure. Is this randomization? Is this a random procedure? No, it is not. The time of day really determines which group you are going to end up in, so this is not a random process; it is predictable. It is not randomization.

There are three types of randomization that I want to discuss today. There is simple randomization, where every new assignment is completely random; there are restricted forms of randomization that have certain properties that are interesting for us as clinical trialists; and there is adaptive randomization. I will very briefly discuss two forms of adaptive randomization a little later on.

But let's start with simple randomization. Here, each assignment has the same probability of selection, and it is like flipping a coin. It is not truly flipping a coin, but the software does something like flipping a coin for us, based on random number tables. The advantages are that it is a simple way to randomize, it is easy to do, and it is truly unpredictable. There are disadvantages to simple randomization, though. It is not reproducible if you are truly flipping a coin; when you use random number generators in your software, you can make it reproducible by setting seeds. There is also a risk of imbalance regarding the allocation ratio and baseline characteristics. The allocation ratio may be uneven, so you may randomize a different number of people to one group versus the other just by chance. This might happen, as we'll see in an experiment in a minute. We already talked about the baseline differences that may happen when we randomize.

I did a little experiment here. I randomized people into two groups in a one-to-one fashion. I did 100 assignments using a software random number generator, and I did 50 tests. Out of those, only two ended at 50/50, where I had the same number of study participants in one group versus the other. Four ended with 60 or more people in one group, and the maximum was 68 people in one group and 32 in the other. This can happen with simple randomization because we use the play of chance.

There is a way to prevent this, and that is restricted randomization, and in particular, blocking. Let's discuss this now. Here, at least some assignments are determined by prior assignments, and the example here is a block of four. We define the block size, and how the treatment groups are distributed within the block is random. You could have AABB, or ABAB, or ABBA, or BBAA, and other combinations. This is random; a small sketch of a permuted-block generator follows below.
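Here is a minimal sketch of such a permuted-block generator, assuming a fixed block size of four and a 1:1 allocation ratio; the function name and seed are illustrative, not a reference implementation:

```python
import random

def permuted_blocks(n_blocks, block_size=4, seed=7):
    """Build a 1:1 assignment schedule from randomly permuted blocks.

    Every block holds equal numbers of 'A' and 'B' slots; shuffling each
    block makes the order of assignments random while guaranteeing
    balance at the end of every block.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = []
    for _ in range(n_blocks):
        block = ['A'] * (block_size // 2) + ['B'] * (block_size // 2)
        rng.shuffle(block)     # yields AABB, ABAB, ABBA, BBAA, ...
        schedule.extend(block)
    return schedule

print(''.join(permuted_blocks(3)))  # e.g. 'ABBABAABABBA'
```

Mixed block sizes, as described next, could be sketched the same way by drawing each block's size at random from, say, {2, 4} instead of fixing it.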
The advantage is that you have much better balance, because the allocation ratio is much more even. At the end of every block, the allocation ratio is one, meaning we have the same number of study participants in one group versus the other. But the disadvantage is that if you don't mask well, the assignments become more predictable. The last assignment of each block is always predictable if you don't mask people: once you know which assignments have already happened in a block, the final assignment is determined, because each block contains the same number of A assignments and B assignments.

In practice, our randomization tables are often composed of a mix of block sizes, for example a mix of blocks of size two and four, like here: three blocks of four and three blocks of two. We have AABB and BABA; those were two blocks of four. Then we have a block of two, AB. Then we have another block of four, BAAB, and two blocks of two. This is much more difficult to predict, and the blocking scheme, meaning the size and the number of blocks, should be kept confidential or changed randomly, just to make sure that people cannot guess what the next assignment is going to be.

The advantages of blocking are minimized imbalance, and we even achieve complete balance at times. This guards, for example, against temporal changes. Assume you are testing a vaccine and, during the period when the target disease is most prevalent, just by chance most people are assigned to the placebo vaccine. You don't want that. That can happen when you have pronounced seasonality, and our blocking mechanism can guard against it. The disadvantages are that these schemes are more complex to construct, and that some assignments may be predictable if you communicate your blocking scheme and if you don't mask properly. We will talk about masking a little later.

The smallest block size is the sum of the factors in the allocation ratio. In a one-to-one scheme, the smallest block is two. If you have a one-to-two-to-one scheme, the smallest block is four: 1 plus 2 plus 1. If you have a scheme that is one-to-two-to-three, so your second group includes two times more people relative to your first group and your third group three times more relative to the first group, your smallest block size is six, the sum of those factors. Larger blocks should be multiples of the smallest possible block size; again, in the one-to-two-to-three scheme, you could have block sizes of six, or 12, or perhaps even 18. I would always advise not going too high with the block sizes, so that you can really achieve the balance that you want.

There is another way to do restricted randomization, and that is stratification, where we subgroup assignments based on characteristics measured before the assignment. Usually, these characteristics are related to the outcome of interest. That could be gender, or age group, or a biomarker, for example. Typically, we would also stratify by study site in a multicenter trial. Stratification really means that we create a separate assignment schedule for each value of each stratification variable. An example would be a multicenter study with 10 sites that stratifies by site; we will have 10 separate assignment schedules, one for each site. If the same study also wants to stratify by age group, say an adult versus a pediatric population, it will need 20 separate assignment schedules: 10 times 2. A small sketch of such a stratified, blocked schedule follows below.
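Here is a small sketch of how such stratified, blocked schedules could be generated; it is not from the lecture, and the stratum names, seeds, and block counts are illustrative:

```python
import random
from itertools import product

def permuted_block(rng, block_size=4):
    """One randomly permuted 1:1 block of assignments."""
    block = ['A'] * (block_size // 2) + ['B'] * (block_size // 2)
    rng.shuffle(block)
    return block

# One independent schedule per combination of stratification variables:
# 10 sites x 2 age groups = 20 separate schedules.
sites = [f"site_{i:02d}" for i in range(1, 11)]
age_groups = ["adult", "pediatric"]

schedules = {}
for site, age in product(sites, age_groups):
    rng = random.Random(f"{site}-{age}")  # deterministic per-stratum seed
    # Five blocks of four leave room for 20 participants per stratum.
    schedules[(site, age)] = [a for _ in range(5) for a in permuted_block(rng)]

print(schedules[("site_01", "adult")][:8])  # first eight assignments in one stratum
```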
Each assignment schedule is created independently. Stratification, or stratified randomization, is our attempt to achieve comparability between the groups regarding the most important prognostic factors, which are called stratification variables here. Typically, we use this together with blocking, and it is advisable to minimize the number of strata, because strata might get small and imbalances might result. I would only use this technique for the most important factors, those variables that you really want to be evenly distributed between the groups.

A word on adaptive randomization; I have two examples here. The first one is response-adaptive randomization, where we use interim data to unbalance the allocation probabilities in favor of the better treatment. Sometimes we call this playing the winner. Imagine you have a four-arm trial to treat a rare disease, and one arm has a significantly worse result relative to the other arms at a pre-specified interim analysis. That arm could be closed down moving forward, based on rules that you pre-specify before you start your trial, and the trial might continue with three arms, for example. This is called response-adaptive because it uses outcome data to unbalance the allocation ratio.

This is different from covariate-adaptive randomization, where we balance the group composition with respect to baseline characteristics, for example age, or height, or weight, or biomarkers. In covariate-adaptive randomization, we use baseline data to balance our groups. Your software will do this for you. An example: you really want to make sure that your two groups are balanced with respect to age. In this case, as soon as one group gets older than the other, your software will shuffle the next younger study participant into that older group to bring the groups back into balance with respect to age. This particular technique or method is called minimization, because it minimizes the differences between the groups. A minimal sketch of this idea follows below.
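Here is a deliberately simplified, deterministic sketch of that minimization idea for a single covariate; real minimization algorithms, such as Pocock and Simon's, handle several covariates at once and usually keep a random element in the assignment. The function name and the ages are hypothetical:

```python
def minimize_assign(ages_a, ages_b, new_age):
    """Assign the next participant to whichever group keeps the mean
    ages of the two groups as close as possible (single-covariate
    minimization, simplified and fully deterministic)."""
    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    gap_if_a = abs(mean(ages_a + [new_age]) - mean(ages_b))
    gap_if_b = abs(mean(ages_a) - mean(ages_b + [new_age]))
    return 'A' if gap_if_a <= gap_if_b else 'B'

group_a, group_b = [34, 61, 47], [52, 58, 70]  # hypothetical ages so far
arm = minimize_assign(group_a, group_b, new_age=29)
print(arm)  # 'B': group B is currently older, so the younger participant goes there
```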