Okay. So, now we're going to get a chance to look at a web application and illustrate some of the techniques and concepts that we've been referring to regarding multilevel models. What we have up here right now is a web application; you can find this website together with the materials for this week, so that you can play around yourself. This is a web application that was developed by the Cal Poly Statistics Department; specifically, we want to acknowledge Jimmy Wong, who designed this app. This is an app that allows you to play around and practice a little bit with some of these key concepts about multilevel models. So, we're just going to walk through this app and see what happens here. You can see that Jimmy, in this app, refers to this as hierarchical modeling. Recall that's another name for multilevel modeling; you might also hear these referred to as hierarchical linear models. A key point that's illustrated in this app is the notion of a pooled method of analyzing data versus an unpooled method. In a pooled method, we just analyze all the data together and we don't explicitly account for the random effects of clusters; we just assume that everybody follows the same general effects. In the unpooled method, we include an explicit fixed effect for every single cluster. So, not random effects, but rather a unique analysis for each of the different clusters. That's what's referred to as the unpooled method. Multilevel models, as noted here, are a compromise between those two approaches: completely ignoring the effects of the higher-level units versus doing a separate analysis for each of those higher-level units. Multilevel models with random effects represent a compromise between the two.
One of the points they emphasize in this application is the notion of borrowing strength from the information across clusters, and shrinkage, where estimates for smaller clusters are pulled more strongly toward the overall model estimates that we're computing. They again refer to the multilevel idea of having level one units, the units observed at the lowest level, and then level two units, the groups at the higher level in which the level one units are nested. In this application, after this goal page, you can go to the tab that says "Select Data," and they have a data set that's built in. This is a sample data set from a music application; it's actually from a peer-reviewed article, where they collected data on 37 undergraduate music majors who filled out performance diaries over the whole academic year. Before each of the performances, each musician completed the Positive Affect Negative Affect Schedule, or the PANAS, in which two different variables were measured: negative affect, which is a measure of anxiety before the performance, and positive affect, which is a measure of happiness. So, in total we have these variables measured on the musicians and each of their performances. This is the multilevel structure of the data. You can see in the table here that you have multiple observations on the same ID. So, ID, or the musician, is the level two unit, and then you have multiple diaries collected from each ID, so the observations and measurements on the diaries represent the level one observations. That's our multilevel structure. In the app, if you use this data set, the analysis is going to focus on negative affect as the dependent variable. You'd like to develop a multilevel model for negative affect, where you're predicting negative affect with characteristics of the musicians and characteristics of the performances, like the type of performance, the audience, and so forth.
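To make the shrinkage idea concrete, here's a small sketch in Python (not part of the app) of how a partially pooled estimate of a cluster mean pulls the cluster's raw mean toward the grand mean, with more pull for smaller clusters. All numbers here are illustrative values, not results from the music data:

```python
# Shrinkage sketch: the multilevel (partially pooled) estimate of a
# cluster mean is a weighted average of that cluster's raw mean and
# the grand mean. Smaller clusters get less weight on their own mean.
# All numbers below are illustrative, not taken from the app's data.
tau2 = 4.95          # assumed between-cluster variance
sigma2 = 22.46       # assumed within-cluster (residual) variance
grand_mean = 16.2    # assumed overall mean
cluster_mean = 22.0  # raw mean for one hypothetical cluster

results = {}
for n_j in (2, 10, 50):  # cluster sizes
    w = tau2 / (tau2 + sigma2 / n_j)  # weight on the cluster's own mean
    shrunk = w * cluster_mean + (1 - w) * grand_mean
    results[n_j] = (w, shrunk)
    print(f"n={n_j}: weight={w:.2f}, shrunken estimate={shrunk:.2f}")
```

Notice that as the cluster size grows, the weight on the cluster's own mean approaches one, so big clusters are barely shrunk at all, while a cluster with only two observations is pulled most of the way back toward the grand mean.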
So, these are potential predictor variables that you see in this particular table. Now, in this app, as an alternative, you can also upload your own data. It's a really cool feature of this app: you can upload your own data sets, which might be comma-separated or tab-separated, bring them in, and then fit multilevel models to those data sets. When you bring in those data sets, you can customize the models that you're fitting. You can pick your dependent variable, the level two cluster ID, the level one predictors, and so forth. For this illustration, we're just going to focus on the musician data and the example models that they already have built in for the data. Okay. So, then we can go to the case studies tab of the application, and you can see that they have three different examples here. One where you have random intercepts, so you have a random effect that allows each higher-level cluster to have a unique intercept. One where you have varying intercepts and varying slopes. Those are the kinds of random coefficient models that we've talked about in the lectures, especially for the ESS data, for example, where you have random effects allowing the intercept to vary and random effects allowing the coefficients, or slopes, of particular predictors to vary. Then third, they have a case where you have a varying intercept and a varying slope, and they try to explain variability in those random intercepts and slopes with a higher-level predictor, like a feature of the musician in this application. So, just to have a simple example, and building on the NHANES smoking example that we just looked at, we're going to look at a case study with a varying intercept model. We just have one random effect that allows the intercept to vary for the different higher-level clusters, in this case, musicians. When you choose one of these case studies, you see some nice visuals in the app.
The first tab of these visuals shows what you would expect to see under the pooled method. Again, here we're not accounting for any of those random musician effects, or random cluster effects more generally. This is just a histogram of the dependent variable, negative affect, with the pooled mean superimposed. You see that vertical dashed line; that's the pooled mean of the dependent variable, negative affect, and you see the general distribution here. This pooled method, where we're ignoring the random musician effects, completely pools all those level one observations and ignores the nesting of diary measurements within the higher-level units, or the musicians. So, the pooled mean is simply the overall mean of the dependent variable, not accounting for the variance among the level two observational units. You see the overall mean is about 16.2. Now, if you click on the unpooled tab, you see a visualization with a separate box plot for each of the different musicians. We're doing a different analysis of the dependent variable for each of the musicians, separately from each other. You can see from these box plots that there's a ton of variability among the musicians in terms of negative affect before performances. Some tend to have much larger values on average; some tend to have much lower values; some have a lot more variability than others. This is a nice visualization that shows the between-musician variance in these data. So, it seems like a random effects approach to account for this variability would make a lot of sense here. Down below, in the table, you can see the mean and the variance, in addition to the sample size, for each of the different musicians. A very nice descriptive feature of the app, in that sense. Then we can click on the hierarchical linear model tab, and now you have all different kinds of tabs to look at different aspects of this multilevel model.
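If you wanted to reproduce these pooled and unpooled descriptives outside the app, a quick pandas sketch would look like the following. The diary values here are made up for illustration; only the structure (musician `id` at level two, negative affect `na` at level one) mirrors the app's data:

```python
import pandas as pd

# Hypothetical diary data: "id" is the musician (level two unit),
# "na" is negative affect for one performance (level one observation).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3, 3, 3],
    "na": [18, 20, 17, 12, 14, 25, 23, 24],
})

# Pooled method: one overall mean, ignoring the musician clusters.
pooled_mean = df["na"].mean()

# Unpooled descriptives: mean, variance, and sample size per musician,
# like the table shown below the box plots in the app.
by_musician = df.groupby("id")["na"].agg(["mean", "var", "size"])
print(pooled_mean)
print(by_musician)
```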
You can look at the model equation, for example. Remember that we wanted to fit a random intercept model, so you see that the mean is defined by a random intercept exclusively, in addition to a fixed effect. The random coefficient is defined by the overall mean, which in this case is mu alpha, and then the random effects are denoted by eta here, eta j. Slightly different notation than we've been introducing, but the same general concept. This is the multilevel specification; you can click on this box to show the single-level equation, so you can see the equivalent specification in just a single equation, relative to the multilevel specification. You also see the clear definition of the distributions for the random errors and the random effects here. These are the two variance components that we would want to estimate. Then you can click on these other tabs, like a caterpillar plot. This shows, for example, what the distributions of the random effects look like. When we calculate the random effects, you see the distributions, the sampling variance, for each of the random effects. Again, this plot illustrates the variability among the groups, or the musicians in this case. There are various other calculations that you can perform in these tabs. For example, you can look at the distribution of the estimates; in this case, this is a histogram of all of the different means for the different musicians, along with the overall mean. You can calculate the estimated correlation within individuals. This is called the intraclass correlation coefficient, and it's a function of the different variance components. If we show that calculation, we see that the estimated correlation of negative affect observations within the same musician, which is being captured by those random musician effects, is about 0.181. So, there's a nontrivial correlation between any two observations nested within the same musician.
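The intraclass correlation the app reports can be written out directly. Using the variance component estimates that appear in the HLM output (about 4.95 for the musician effects and 22.46 for the residuals), the arithmetic is:

```python
# ICC = between-cluster variance / total variance, using the
# variance component estimates shown in the app's HLM output.
var_musician = 4.95    # estimated variance of the random musician effects
var_residual = 22.46   # estimated residual (level one) variance

icc = var_musician / (var_musician + var_residual)
print(round(icc, 3))  # → 0.181
```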
So, all of these are different descriptive aspects of the model that we're fitting here, to get a sense of that variability. Then, in this other tab, you can actually look at the output from fitting this model; that's called the HLM output here. You see some computer output automatically generated by the app, which shows the estimate of the fixed intercept; we haven't included any predictors yet. The overall mean is about 16.2, and you see the estimated variance of the random musician effects, which is about 4.95, and the estimated variance of the residuals, which is about 22.46. So, you also see output from fitting that model and can look at the different estimates. Then we can click on the tab that says comparison of methods, if we want to see how these different methods, varying intercepts versus varying intercepts and slopes, compare to each other. So far, we've only fitted one model, but we can try the other case studies to see what would happen. Then, in the analyze data tab, here is where you can customize the model that you're fitting. For example, we could add a level one predictor variable, some kind of performance-level predictor. Maybe we want to account for performance type as a level one predictor. Then we can tell the app to fit the model with that level one predictor added, and the output you see is going to be updated to account for that predictor. Now we see an estimated fixed effect of performance type, and we can see how our different estimates changed once we added performance type to the model as a predictor. Notice, in this case, that we actually get a message about convergence. We can't really trust these estimates, because when adding that predictor, the maximum likelihood estimation didn't converge to a solution. You see some suggestions for what you might try there.
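The app fits these models in R behind the scenes; if you wanted to fit the same kind of varying-intercept model yourself, here's a sketch in Python with statsmodels. The music data aren't bundled here, so this simulates data with a similar structure (37 musicians, several diary entries each), using the app's estimates as the true values:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_musicians, n_diaries = 37, 8

rows = []
for j in range(n_musicians):
    # Random musician intercept, sd = sqrt(4.95) as in the app's output.
    eta_j = rng.normal(0, np.sqrt(4.95))
    for _ in range(n_diaries):
        # Grand mean 16.2 plus musician effect plus residual noise.
        na = 16.2 + eta_j + rng.normal(0, np.sqrt(22.46))
        rows.append({"id": j, "na": na})
df = pd.DataFrame(rows)

# Varying-intercept model: a fixed intercept plus a random
# intercept for each musician (the "groups" argument).
model = smf.mixedlm("na ~ 1", df, groups=df["id"])
result = model.fit()
print(result.params["Intercept"])       # estimated grand mean
print(float(result.cov_re.iloc[0, 0]))  # estimated musician variance
```

Because the data are simulated, the estimates will land near, but not exactly on, 16.2 and 4.95.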
So, you can change all of these different options depending on whether you're working with your own data set or analyzing the musician data set; you can try these different combinations of fixed effects and random effects in the analyze data piece. Just as a final illustration, suppose we wanted to add random slopes as well. I can click on this case study, and now, if you look at the data under the pooled method, here's the relationship of the variable previous in the data set with negative affect; it doesn't seem like there's a strong relationship in this scatter plot. That's the pooled method, where we pool all the data. In the unpooled method, we have a separate regression line for every single musician, so you see those different regression lines for each of the individual musicians. This plot shows that there's evidence of variability among the musicians in terms of this relationship. So, this is visual motivation for including a random effect that allows the relationship of previous with negative affect to change depending on the musician; again, good visual motivation for adding a second random effect allowing that slope to randomly vary for different musicians. Again, we can look at the hierarchical linear model tab, look at the graph, and look at the model equation that we're thinking about. Now that we're including that predictor variable x, you see two level two equations for the random coefficients, one for the intercept alpha j and one for the slope beta j, the assumed distribution for those random effects, the two variance components that we're estimating, and then the variance of the errors associated with the level one observations. Again, you can look at the HLM output from fitting that model and make sense of the relationship. You see that the fixed effect of the predictor here is estimated to be about negative 0.11, not quite significant at the 0.05 level.
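As with the varying-intercept case, this varying-intercept, varying-slope model can be fitted outside the app. Here's a statsmodels sketch, again on simulated data since the music data aren't bundled here; the variable names "previous" and "na" follow the app's data set, and the true parameter values are made up for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for j in range(37):
    a_j = 16.0 + rng.normal(0, 2.0)   # random intercept for musician j
    b_j = -0.1 + rng.normal(0, 0.5)   # random slope on "previous"
    for _ in range(8):
        prev = float(rng.integers(0, 10))
        rows.append({"id": j, "previous": prev,
                     "na": a_j + b_j * prev + rng.normal(0, 4.5)})
df = pd.DataFrame(rows)

# re_formula="~previous" adds a random slope for "previous"
# alongside the random intercept, one pair per musician.
model = smf.mixedlm("na ~ previous", df, groups=df["id"],
                    re_formula="~previous")
result = model.fit()
print(result.fe_params)  # fixed intercept and slope
print(result.cov_re)     # 2x2 covariance of random intercept and slope
```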
So overall, it doesn't seem like there's, on average, a strong effect of that predictor, but we're really interested in the variance of those random slopes, estimated at 0.07816, and whether or not that variance is significant. We could perform a likelihood ratio test to get a sense of whether that variance is important. So, this app is very cool because it illustrates the underlying equations, it allows you to visualize the data, it automatically fits models for you and generates the output, and the app is freely available. This is a very nice contribution, and again, I thank the Cal Poly folks for putting this together. There's a web link this week where you can play around in this app, try different models, try different predictors, try different dependent variables, and learn a little bit more about how multilevel models work using this free tool.
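That likelihood ratio test can be carried out by fitting the model with and without the random slope and comparing log-likelihoods. Here's a hedged sketch on simulated data; note that because a variance is being tested at the boundary of its parameter space (a variance cannot be negative), the naive chi-square reference distribution used here is known to be conservative:

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
rows = []
for j in range(37):
    a_j = 16.0 + rng.normal(0, 2.0)   # random intercept
    b_j = -0.1 + rng.normal(0, 0.5)   # random slope (true variance 0.25)
    for _ in range(8):
        prev = float(rng.integers(0, 10))
        rows.append({"id": j, "previous": prev,
                     "na": a_j + b_j * prev + rng.normal(0, 4.5)})
df = pd.DataFrame(rows)

# Fit both models with ML (reml=False) so log-likelihoods are comparable.
reduced = smf.mixedlm("na ~ previous", df, groups=df["id"]).fit(reml=False)
full = smf.mixedlm("na ~ previous", df, groups=df["id"],
                   re_formula="~previous").fit(reml=False)

# LR statistic: twice the gain in log-likelihood from the two extra
# covariance parameters (slope variance and intercept-slope covariance).
lr_stat = 2 * (full.llf - reduced.llf)
p_value = stats.chi2.sf(lr_stat, df=2)  # conservative at the boundary
print(lr_stat, p_value)
```

A small p-value would suggest the slope really does vary across musicians, which is what the unpooled regression lines hinted at visually.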