Greetings and welcome back.

In this section, we'll expand our toolbox to allow

for binary outcomes when doing regression models.

We're going to turn our attention to something called logistic regression.

As I noted in the overview

in the last lecture set,

the scale on which we do this is a little different.

We start with a binary outcome.

Did the person have disease? Yes or no.

Did the person finish their program successfully? Yes or no.

Ordinarily, if we were summarizing it the way we did in the first term,

we'd estimate the proportion who

had the outcome for the different groups we're comparing.

This will be what we're doing here as well,

but we're not going to model that proportion as a linear function of our predictor.

Instead, for reasons that we'll get into about halfway through the lecture set,

we're going to have to transform it twice to a new scale.

It's actually a scale we've worked on before in the first term,

so it shouldn't be too crazy to see it like this.

But we're going to transform the proportion to the odds,

and then to the log odds.

What we'll be estimating as a linear function of our predictor is the log odds.
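
As a rough illustration of those two steps, here is a minimal Python sketch. The proportion of 0.20 and the function names are made up for this example and are not from the lecture.

```python
import math

def odds_from_proportion(p):
    """Odds = p / (1 - p), for a proportion strictly between 0 and 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1 - p)

def log_odds_from_proportion(p):
    """Log odds: the natural log of the odds."""
    return math.log(odds_from_proportion(p))

# Hypothetical proportion with the outcome in one group (illustrative only).
p = 0.20
odds = odds_from_proportion(p)
log_odds = log_odds_from_proportion(p)
print(f"proportion={p:.2f}, odds={odds:.2f}, log odds={log_odds:.3f}")
```

Note that a proportion below 0.5 gives odds below 1 and therefore a negative log odds, which is part of why this scale is not an intuitive one to think on.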

You might say, "Well, why?

Why is that useful at all?

Do we think of things on a log scale?" The answer is no.

We don't think of things on the log odds scale,

but we have an equation to estimate the log odds for any group given their x value,

that's a linear function of some intercept and slope.

So, we will again see that this slope compares

the same two groups that would be compared in any other type of regression:

two groups who differ by one unit in x value.

But now, it's going to be comparing or taking

the difference in log odds between those two groups.

Well, you may recall that a difference in log odds is

mathematically equivalent to a log odds ratio.

If we exponentiate the results for the slope,

we get an estimated odds ratio,

something that we've worked with before.
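
To see why exponentiating the slope gives an odds ratio, here is a small Python sketch. The intercept of -2.0 and slope of 0.5 are invented values, assumed only for illustration, and the helper name is hypothetical.

```python
import math

# Hypothetical fitted values, invented for illustration only.
intercept = -2.0
slope = 0.5

def estimated_log_odds(x):
    """Estimated log odds for a group with predictor value x."""
    return intercept + slope * x

x = 3.0
# Difference in log odds between two groups one unit apart in x.
diff_in_log_odds = estimated_log_odds(x + 1) - estimated_log_odds(x)

# Odds ratio two ways: exponentiate the slope, or divide the two odds.
odds_ratio_from_slope = math.exp(slope)
odds_ratio_direct = math.exp(estimated_log_odds(x + 1)) / math.exp(estimated_log_odds(x))

print(f"difference in log odds: {diff_in_log_odds:.3f}")
print(f"odds ratio from slope:  {odds_ratio_from_slope:.3f}")
print(f"odds ratio from odds:   {odds_ratio_direct:.3f}")
```

The two odds ratios agree for any starting value of x, since the difference in log odds between groups one unit apart is always the slope.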

Additionally, these equations can be used to estimate

the log odds of the outcome for any single group given their x value.

You just plug x into the equation: intercept plus slope times x value.

We'll get the estimated log odds of the outcome for

any group defined by their x value, and we'll show that.

Well, if we have the log odds,

we can transform that back to an estimated odds.

If we have the odds, we can transform that back to an estimated probability.
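
The path back from log odds to a probability can be sketched in Python as follows. The intercept, slope, and x value are assumed numbers chosen only for illustration, and the function names are hypothetical.

```python
import math

def odds_from_log_odds(log_odds):
    """Undo the log: odds = exp(log odds)."""
    return math.exp(log_odds)

def probability_from_odds(odds):
    """Undo the odds: probability = odds / (1 + odds)."""
    return odds / (1 + odds)

# Hypothetical fitted values and predictor value (illustrative only).
intercept = -2.0
slope = 0.5
x = 3.0

log_odds = intercept + slope * x          # estimated log odds for this group
odds = odds_from_log_odds(log_odds)       # back to the odds scale
probability = probability_from_odds(odds) # back to the probability scale

print(f"log odds={log_odds:.3f}, odds={odds:.3f}, probability={probability:.3f}")
```

Whatever the log odds turn out to be, the result of this back-transformation always lies strictly between 0 and 1.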

So, even though everything is done initially on the log odds and odds ratio scale,

we're not beholden to that scale.

We can get estimated

associations in terms of odds ratios,

but we can also get predicted probabilities or

proportions of those having the outcome for different groups given their x values.

So, hopefully, you will see the connections between this and linear regression,

and not be too put off by the fact that initially,

things are done on the log scale.