This lecture will first review multi-level and longitudinal models. Next, we'll explore how to choose an appropriate test for our data analysis. The learning objectives for this lesson include reviewing multi-level and longitudinal studies and describing which hypothesis test statistics are used for these models. Additionally, we'll explore criteria for comparing the various tests, and learn how to choose a test that aligns with power and sample size analysis.
Some common models that are used for multi-level and longitudinal data are the multivariate model and the mixed model. The mixed model pretty much changed the way we thought about handling clustered, multi-level, and longitudinal data. So, it was really a revolutionary model. Note that this lesson will not discuss how to fit the models for data analysis. A mixed model is one that has both random and fixed effects; that's what makes it mixed. Certain types of mixed models can be expressed as multivariate models. The mixed model is more general: all multivariate models are special cases of the mixed model. If a mixed model meets the conditions for being expressed as a multivariate model, it is referred to as reversible.
GLIMMPSE assumes that every independent sampling unit has the same number of units of observation. Further, these units of observation are measured at the same times, at the same locations, and for the same variables. So, in a longitudinal design, we could not measure at six months for one group and at eight months for another. Also, we are assuming that each predictor variable is measured only once in the model. For example, if we measure reading comprehension as a covariate in some educational research, we would only account for it one time in the model.
There are two different approaches to testing in multivariate models: the univariate approach to repeated measures and the multivariate approach to repeated measures.
The univariate approach assumes the same measure, using the same scale, at the same intervals. For instance, we could examine alcohol intake measured at three, six, and nine months. This is using a single measure on multiple occasions. The multivariate approach provides more flexibility, enabling the researcher to examine multiple different measures, such as height, weight, and circumference, at a single time point or at multiple time points, longitudinally as repeated measures.
Recall that the null hypothesis makes a claim that two events are not related, such as the claim that there is no difference between a treatment group and a control group. Small p-values lead researchers to reject the null hypothesis. A Type I error occurs when a test mistakenly rejects the null hypothesis when in fact it is true. We can set our alpha to different values, such as the conventional 0.05, or even 0.01. The p-value is the probability of seeing data at least as extreme as the current data if the null is true. For instance, with an F statistic from a one-way analysis of variance, we are testing whether there are differences among the groups in the model.
Before we can do our analysis, we have to set our nominal Type I error rate, which is the rate of false rejections the researcher is willing to accept. We can estimate the actual Type I error rate using simulations, as the number of rejections divided by the number of simulated replications. For example, you could generate normally distributed data for five groups with no true differences among them and test each replicate with the F statistic. Some testing paradigms have an accurate Type I error rate, whereas others do not. This is an important point, so keep it in mind. The problem is when the actual rate is bigger than the claimed rate. If we say that we are testing at 0.05, but we are really testing at 0.25, then we have a problem.
Lack of reproducibility is a major problem in science. If scientists are drawing the wrong conclusions, then we're not meeting our scientific goals and objectives. These are not just statistical problems either.
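As an aside, the simulation estimate of the Type I error rate mentioned in this part of the lecture (rejections divided by replications) can be sketched in Python. This is a minimal sketch and not part of the lecture materials: the function name `empirical_type1_error`, the 10 observations per group, the 2,000 replications, and the use of NumPy and SciPy are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def empirical_type1_error(n_reps=2000, n_groups=5, n_per_group=10,
                          alpha=0.05, seed=0):
    """Estimate the actual Type I error rate of the one-way ANOVA F test.

    Every group is drawn from the same normal distribution, so the null
    hypothesis is true and every rejection is a Type I error.
    (All default values here are illustrative assumptions.)
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        # One simulated experiment: rows are groups, columns are observations.
        data = rng.normal(loc=0.0, scale=1.0, size=(n_groups, n_per_group))
        # One-way ANOVA F test across the groups.
        _, p_value = stats.f_oneway(*data)
        if p_value < alpha:
            rejections += 1
    # Number of rejections divided by number of replications.
    return rejections / n_reps


print(empirical_type1_error())
```

For a test with an accurate Type I error rate, the estimate should land near the nominal 0.05, up to simulation noise. An estimate well above 0.05 would be the warning sign described here.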
These are problems that can directly affect people. For example, in health science, you might mistakenly approve a drug that is ineffective, resulting in all kinds of issues. Science is based on being able to replicate findings. Without this, there could be a loss of trust in science, which is not good for science or society. Here you can see an example of this. One study says cranberries cause cancer. Another says they do not. Then the public does not know whether cranberries are safe or not. This ability to replicate findings is part of the scientific process. Part of moving towards causation is seeing consistent results over time. So, the more consistent the results are, the more faith we have in those results. This is something researchers must accept. Type I error is a real thing and must be considered when conducting studies. This is why consideration of the claimed Type I error rate is imperative.
A uniformly most powerful test achieves the greatest power among all reasonable tests that have the same size. By size, we mean the Type I error rate. Such a test only exists in certain cases, such as the general linear hypothesis in a general linear model. Under certain conditions, different tests show what we call coincidence. This occurs when they give exactly the same p-values and lead to the same conclusions. They also produce the same power and sample size when you design a study. Coincidence is what really underlies GLIMMPSE and how it works. For example, coincidence between the tests used in a general linear multivariate model and those used in a reversible mixed model can really simplify the process of selecting a test.
Here are two criteria that we suggest for evaluating tests. First, we want to choose tests whose actual and claimed Type I error rates are close to each other. Second, among tests with an accurate Type I error rate, we want to choose the one that gives the highest power.
We don't want to settle for low power just because a test gives an accurate Type I error rate. Instead, we want to go for the maximum power that we can get.
Let's talk about choosing an appropriate test. In GLIMMPSE, there are several types of tests. There's the univariate approach to repeated measures and the multivariate approach to repeated measures. Also, the reversible mixed model hypothesis test relies on what we call a Wald test. If we use the Wald test with a particular modification, you can think of it as similar to a t-distribution, but with different degrees of freedom. That's the idea behind the Wald test with the Kenward-Roger degrees of freedom.
Most of your decisions about which test to use can be made using this simplified map. Your decisions come down to two options. One is the reversible mixed model with the Wald test and the Kenward-Roger approach, which modifies the degrees of freedom as well as the covariance matrix. We'll talk about correlation and covariance in detail in another lecture. Your other option is the multivariate model with the Hotelling-Lawley Trace test. Most of the time, that's going to be perfectly fine as an overall test. Just so you're aware, we're focusing on heuristics, rules of thumb that apply in most, but not all, situations. Here you can see that the Hotelling-Lawley Trace test can work in many different scenarios, which is why we often recommend using it. Again, these recommendations are only for reversible mixed models. So, if we use the Wald test with the Kenward-Roger degrees of freedom for analysis, it is going to approximately coincide with putting the data into a multivariate framework and applying the Hotelling-Lawley Trace test. These tests coincide when there are no missing data, every independent sampling unit has the same number of observations, and there are no repeated predictors in the model. Because the tests coincide, most study designs can use the Hotelling-Lawley Trace.
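To make the Hotelling-Lawley Trace a little more concrete, here is a minimal Python sketch of the statistic for a small balanced design. It is not taken from the lecture or from GLIMMPSE. It assumes the standard general linear multivariate model Y = XB + E with the hypothesis CBU = 0, and the function name `hotelling_lawley_trace`, the simulated data, the cell-means coding of X, and the particular contrast matrices C and U are all hypothetical choices for illustration. The conversion to an F statistic used at the end is exact only in the special case where C has a single row, as it does here.

```python
import numpy as np
from scipy import stats


def hotelling_lawley_trace(Y, X, C, U):
    """Hotelling-Lawley Trace for H0: C @ B @ U = 0 in Y = X @ B + E."""
    # Least-squares estimate of the coefficient matrix B.
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B_hat
    theta_hat = C @ B_hat @ U
    # Hypothesis sum-of-squares matrix H.
    M = C @ np.linalg.inv(X.T @ X) @ C.T
    H = theta_hat.T @ np.linalg.solve(M, theta_hat)
    # Error sum-of-squares matrix E for the transformed responses.
    E = U.T @ resid.T @ resid @ U
    return float(np.trace(H @ np.linalg.inv(E)))


# Hypothetical complete, balanced data: 2 groups of 10, 3 repeated measures.
rng = np.random.default_rng(42)
n_per_group, n_times = 10, 3
Y = rng.normal(size=(2 * n_per_group, n_times))
Y[n_per_group:, 2] += 1.0  # assumed effect: group 2 shifts at the last time

# Cell-means coding: one indicator column per group.
X = np.kron(np.eye(2), np.ones((n_per_group, 1)))
C = np.array([[1.0, -1.0]])                             # between-group contrast
U = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])    # time contrasts

hlt = hotelling_lawley_trace(Y, X, C, U)

# With one row in C, the trace converts to an exact F statistic.
nu_e = Y.shape[0] - np.linalg.matrix_rank(X)  # error degrees of freedom
b = U.shape[1]
f_stat = hlt * (nu_e - b + 1) / b
p_value = stats.f.sf(f_stat, b, nu_e - b + 1)
print(hlt, f_stat, p_value)
```

The data here meet the conditions listed in the lecture: no missing data, the same number of observations for every independent sampling unit, and no repeated predictors. The C and U matrices together express a group-by-time interaction hypothesis.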
Let's quickly summarize what we went over in this lecture. Often, we use a multivariate model or a mixed model for the types of studies we're discussing in this course. When comparing tests, look at the actual Type I error rate and how it compares to the claimed Type I error rate. We want these to be as close as possible, so that we are able to produce replicable findings. Among tests that have accurate Type I error rates, you want to choose the one with the highest power. We often recommend the Hotelling-Lawley Trace test, as it is applicable to many different scenarios. That's it for this lecture. Thank you for your time.