0:04

Now, we've talked about how to simulate random numbers from simple probability distributions. But what if we want to simulate data from a model, for example a linear model? I've got a fairly simple linear model here. It has a single predictor, x, and random noise, which I call epsilon, that has a normal distribution with mean zero and standard deviation two. The outcome y is generated using two regression coefficients, an intercept beta nought and a slope beta one, so y equals beta nought plus beta one times x plus epsilon. I'm going to assume that beta nought is equal to 0.5 and beta one is equal to 2. So the question is, how do I simulate from this model now that I've specified what it is?

So here, first I set the seed. It's always very important to set the seed; here I set it to 20. I generate x, the predictor, which has a standard normal distribution. I generate epsilon, which has a normal distribution with mean zero and standard deviation two. And then, after multiplying x by its regression coefficient, I add everything together to generate my y.
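The steps just described can be sketched in R roughly as follows. The seed (20), the coefficients, and the noise standard deviation come from the lecture; the sample size of 100 and the variable names `n`, `beta0`, `beta1`, and `e` are assumptions made for this illustration.

```r
# Simulate from y = beta0 + beta1 * x + e, with e ~ Normal(mean 0, sd 2).
set.seed(20)                      # seed value mentioned in the lecture

n     <- 100                      # ASSUMED sample size (not stated for this example)
beta0 <- 0.5                      # intercept
beta1 <- 2                        # slope

x <- rnorm(n)                     # predictor: standard normal
e <- rnorm(n, mean = 0, sd = 2)   # random noise (epsilon)
y <- beta0 + beta1 * x + e        # outcome from the linear model

summary(y)                        # five-number summary plus the mean
```

Calling `plot(x, y)` afterwards would draw the scatterplot, and `coef(lm(y ~ x))` should return estimates in the neighborhood of 0.5 and 2, up to sampling noise.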

And so, from the summary here, you see that y has a mean of roughly 0.68, and it ranges from about minus 6 to plus 6. And then I can plot the data to see what they look like. Here they are on the next slide.

1:20

So this is the plot of the x that I simulated against the y that I simulated from the linear model. You can see that they very clearly have a linear relationship that follows the model we specified.

1:35

So here's just a slight variation of the previous example. Instead of x being a normal random variable, what if x is a binary random variable? Maybe it represents gender, or maybe it's treatment versus control, or something like that.

This is very simple: I can generate binary data from the binomial distribution using the rbinom function. So I set the seed again, and I generate 100 binomial random variables. These come from the binomial distribution with n equal to 1 and p equal to one half, so the probability of a one is 0.5. Then I generate my normal error term, which has mean zero and standard deviation two. And then I add them all together, as before, to produce my y.
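A minimal sketch of this binary-predictor variant is below. The seed value is an assumption (the lecture only says the seed is set again), and the sketch assumes the same coefficients as the previous example (0.5 and 2), since this is described as a slight variation of it.

```r
# Same linear model as before, but with a binary (0/1) predictor.
set.seed(10)                           # ASSUMED seed value

n <- 100
x <- rbinom(n, size = 1, prob = 0.5)   # 100 Bernoulli(0.5) draws: each is 0 or 1
e <- rnorm(n, mean = 0, sd = 2)        # normal error term
y <- 0.5 + 2 * x + e                   # ASSUMED coefficients, reused from the first example

summary(y)
```

As a quick check, `tapply(y, x, mean)` gives the average of y in each group; under these assumptions the two group means should differ by roughly the slope, 2.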

So now I look at the summary of y. I see the mean is about 1.4, and the range is from about minus 3 to six or seven. Now when I plot the data, of course they look very different, because the x variable is binary. But the y variable is still continuous; it's normal. So here you can see that there appears to be, again, a pretty clear linear trend going from x equal to 0 to x equal to 1.

2:50

Now suppose you want to simulate from a slightly more complicated model, perhaps a generalized linear model with a Poisson distribution. For example, you might want to simulate outcome data that are count variables instead of continuous variables. We have to use a slightly more complicated approach to do this, in particular because the error distribution is not going to be normal. It's going to be a Poisson distribution. So let's assume that the outcome y has a Poisson distribution with mean mu, and that the log of mu follows a linear model with an intercept beta nought and a slope beta one, where x is our predictor. Let's assume that beta nought is 0.5 and beta one is 0.3.
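Written out in standard notation, the model just described is the following. The subscript i indexing observations is added here for clarity and is not part of the lecture.

```latex
\begin{align*}
Y_i &\sim \mathrm{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \beta_1 x_i, \qquad \beta_0 = 0.5,\quad \beta_1 = 0.3, \\
\mu_i &= \exp(\beta_0 + \beta_1 x_i).
\end{align*}
```

The last line is simply the second line exponentiated; it gives the mean that the Poisson draws are generated with.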

So how do we simulate from this model to get our Poisson data? We need to use the rpois function for this. We first set the seed, as always, and we generate our predictor variable x, which has a standard normal distribution. Then we generate our linear predictor, the log of mu, which is just the intercept plus the slope coefficient times x. That linear predictor is the log of the mean, so in order to get the mean for our Poisson random variable, we need to exponentiate it. So we simulate 100 of these Poisson random variables using the rpois function, and we give it the exponential of our log mean.
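Those steps might look like this in R. The coefficients and the count of 100 draws come from the lecture; the seed value and the variable name `log_mu` are assumptions for this sketch.

```r
# Simulate counts from a Poisson model with log(mu) = 0.5 + 0.3 * x.
set.seed(1)                           # ASSUMED seed value

n <- 100
x <- rnorm(n)                         # predictor: standard normal
log_mu <- 0.5 + 0.3 * x               # linear predictor, i.e. the log of the mean
y <- rpois(n, lambda = exp(log_mu))   # exponentiate to get each Poisson mean

summary(y)
```

Note that `rpois` accepts a vector for `lambda`, so each of the 100 counts is drawn with its own mean.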

4:08

So when we summarize this, you'll see that the mean is about 1.5 and the range is between zero and six. When I plot these data, you'll see that they look like Poisson data, and that there's clearly an increasing relationship between x and y: as x increases, the count for y generally gets larger. But the data are still count variables here.
