So, let's look at some additional examples to illustrate the concepts we discussed regarding Cox proportional hazards regression. Let's go back to an example we used in the first term and put it in a Cox regression context. In fact, the authors used Cox regression to get their final hazard ratio, or incidence rate ratio, estimate. This was a study done on serodiscordant couples: sexual partnerships in which one partner was HIV positive and the other was negative. The researchers were hoping to see whether accelerated antiretroviral therapy for the HIV-positive partner could reduce partner-to-partner transmissions, as compared to the standard approach, which was to wait to treat the infected partner until their CD4 count had fallen below a threshold.
The results were pretty amazing. There were a total of 39 HIV-1 transmissions observed, and of these, 28 were virologically linked to the infected partner. Of the 28 linked partner-to-partner transmissions, only one occurred in the early therapy group, the group that got the accelerated treatment for the HIV-positive partner; the remaining 27 occurred in the standard treatment group. Taking into account follow-up time and censoring in the two groups being compared, the estimated hazard ratio of transmission for couples where the HIV-positive partner got accelerated antiretroviral therapy versus the standard group was 0.04, with a 95 percent confidence interval ranging from 0.01 to 0.27. So, that's a 96 percent reduction in the estimated risk when treating the HIV-positive partner with the accelerated approach, and this could be anywhere from a 99 percent reduction in the best-case scenario down to a 73 percent reduction in the worst-case scenario. So, even in the worst-case scenario, we're talking about quite a reduction in risk. So, these are the Kaplan-Meier curves.
These show the proportion of couples in which the partner had seroconverted by a given time in the follow-up period. As opposed to starting at one and potentially going down to zero, they start at zero and work their way up. So, it's the complementary version of the survival curve, showing the proportion who have had the event by a given time. You can see that even in this blown-up version of the graphic, it's almost impossible to see the curve for the early therapy group because there was only one actual event in that group.
So, the underlying Cox model for the outcome of linked transmissions was this; this is what the results look like when the authors used the computer. There was an intercept component, which traces the log of the baseline hazard for the reference group over the entire follow-up period as a function of time, plus a slope of negative 3.21 times x_1. In this model, x_1 was equal to one for couples where the HIV-positive partner was on accelerated treatment and zero for those on the traditional therapy. The estimated standard error for this slope is 0.95. So, the estimated hazard ratio for transmission for couples in the accelerated treatment group compared to couples with an HIV-positive partner on the traditional therapy is e raised to the negative 3.21 power, or roughly 0.04. Next is the 95 percent confidence interval for the true population hazard ratio. I'm going to do this all in one fell swoop instead of in steps.
What we do first is get the confidence interval endpoints on the slope scale: take the slope of negative 3.21 minus two estimated standard errors, two times 0.95, for the lower bound, and the slope of negative 3.21 plus two standard errors for the upper bound. That gives a confidence interval for the slope, or log hazard ratio, of negative 5.11 to negative 1.31. Then we exponentiate these endpoints to get the endpoints on the hazard ratio scale. If you do this, it turns out to be a confidence interval of 0.006 to 0.27, which, when rounded to two decimal places, is the 0.01 to 0.27 reported by the study authors. So, again, this hazard ratio, which we estimated here as 0.04, compares the hazard of transmission for couples in which the HIV-infected partner got accelerated treatment to couples in which the HIV-infected partner got the standard treatment.
What is the intercept? The intercept is the log hazard of linked transmissions as a function of time t, over the entire follow-up period, for the reference group, which was the traditional therapy group. So, at any time t, if I wanted to get the log hazard for the accelerated therapy group, I would take the log hazard for the reference group, the intercept at that time, and then add the slope of negative 3.21. So, regardless of where we are in the follow-up period, the log hazard for the accelerated treatment group is always 3.21 less than the corresponding log hazard for the reference group. That means that if I have the hazard for the reference group at a given time, the exponentiated intercept, the way I get the hazard for the accelerated treatment group is to take that reference hazard and multiply it by the hazard ratio of 0.04. So, that's just another way to talk about this relationship once we've exponentiated things.
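If you want to check this arithmetic yourself, here is a minimal Python sketch of the computation just described. The function name `hazard_ratio_ci` is made up for illustration, and it uses the same rough multiplier of two standard errors as the lecture rather than 1.96.

```python
import math

def hazard_ratio_ci(slope, se, z=2.0):
    """Exponentiate the slope and slope +/- z*se to get the hazard ratio and its CI.

    z=2.0 mirrors the "two standard errors" shortcut used in the lecture (assumption).
    """
    lower = math.exp(slope - z * se)   # lower endpoint on the hazard ratio scale
    upper = math.exp(slope + z * se)   # upper endpoint on the hazard ratio scale
    return math.exp(slope), lower, upper

# Slope and standard error quoted in the lecture for the accelerated therapy indicator
hr, lo, hi = hazard_ratio_ci(-3.21, 0.95)
print(f"HR = {hr:.2f}, 95% CI = ({lo:.3f}, {hi:.2f})")
```

Run as written, this should reproduce the 0.04 estimate and the 0.006 to 0.27 interval quoted above.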
The hazard ratio at any given time t, comparing the estimated hazard when x_1 equals one to the hazard when x_1 equals zero, the reference or standard therapy group, looks like this, and if you multiply that out, the hazard at any given time for the accelerated therapy group is 0.04 times the corresponding hazard for the reference group. So, that's again just a way to think about what a hazard ratio is measuring.
The proportional hazards assumption has to do with the slope of negative 3.21 on the log scale. The assumption is easier to draw out on the log hazard scale, and then we'll remind ourselves what it means on the hazard scale. On the log hazard scale, there's some shape to the baseline risk as estimated via the reference group, the couples where the HIV-positive partner got the standard therapy, and their risk of transmission will vary over time in some fashion. I'm just making it up: it could be crazy like this, or it could be much more straightforward and linear, et cetera, but there is some shape to it. So, the log of the baseline hazard has some shape, maybe not as fancy as what I've drawn here. But the idea of proportional hazards is that, on the log scale, the difference at any given point in time between the log hazard for the group with x_1 equals one, the accelerated therapy group, and the log hazard for the reference group is constant. Their difference is always negative 3.21, going from here to here, and so these curves will be perfectly parallel, even if I can't draw them as such and I'm not doing a very good job of drawing the scale. The vertical gap between the two curves will always be 3.21. That's the assumption we made when we estimated that constant difference of negative 3.21.
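To make the parallel-curves picture concrete, here is a small Python sketch. The baseline shape in `log_baseline_hazard` is entirely invented, just as the drawn curve is; only the slope of negative 3.21 comes from the study. Whatever shape you pick, the gap on the log scale stays the same at every time point.

```python
import math

BETA = -3.21  # estimated slope (log hazard ratio) quoted in the lecture

def log_baseline_hazard(t):
    """Hypothetical log baseline hazard; the shape is made up for illustration."""
    return -4.0 + 0.5 * math.sin(t) + 0.1 * t

def log_hazard(t, x1):
    """Log hazard under the Cox model: log baseline hazard plus slope times x1."""
    return log_baseline_hazard(t) + BETA * x1

for t in [0.5, 1.0, 2.0, 4.0]:
    gap = log_hazard(t, 1) - log_hazard(t, 0)                         # log scale
    ratio = math.exp(log_hazard(t, 1)) / math.exp(log_hazard(t, 0))   # hazard scale
    print(f"t = {t}: log-hazard gap = {gap:.2f}, hazard ratio = {ratio:.2f}")
```

Swapping in a different made-up baseline function changes the individual hazards but not the gap or the ratio, which is the whole content of the assumption.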
If you were to exponentiate this and put it on the hazard scale, these curves would no longer be parallel, but they'd be in constant proportion: the ratio of the hazard for the accelerated therapy group to the hazard for the baseline group would always be 0.04 at any given time in the follow-up period.
Let's look at another example we've seen, the PBC trial, where we looked at mortality as a function of the subject's baseline bilirubin at the time they were randomized to get either the drug or a placebo. The log hazard looks something like this. There's the baseline hazard, evaluated when x_1 equals zero, as a function of time; the log of that is our intercept. Then the slope for bilirubin, measured in milligrams per deciliter, was 0.15, with a standard error of 0.013. We've already seen these estimates in a previous lecture set, and we, of course, already went through and computed the 95 percent confidence interval for the slope. But just to refresh your memory of the operations here, the slope plus or minus two standard errors gives us an interval of 0.124 to 0.176 on the slope scale. This does not include zero, the null value for slopes, i.e., log hazard ratios, so the result is statistically significant. Then we exponentiated the endpoints to get the confidence interval on the hazard ratio scale. But you may recall that in the lecture, we went ahead and computed a hazard ratio comparing two groups whose bilirubins differed not by one unit but by 2.7 units: a group with 3.5 milligrams per deciliter compared to a group with 0.8 milligrams per deciliter. To get that estimated hazard ratio, what we did on the slope scale was compute the slope of 0.15 times the difference in bilirubin values of 2.7.
So, when we did this, when all the dust settled, we were left with a hazard ratio of 1.5 comparing these two groups, 3.5 milligrams per deciliter to 0.8 milligrams per deciliter. I said you could have started with the estimated hazard ratio for a one-unit difference and raised it to the 2.7 power instead of taking things back to the log scale. We can do the same thing with the confidence interval endpoints. We don't necessarily have to multiply the endpoints on the slope scale by the difference, 3.5 minus 0.8, or 2.7, and then exponentiate. We could just start with the confidence interval endpoints on the hazard ratio scale and raise each to the 2.7 power to get a confidence interval for this hazard ratio of 1.39 to 1.6.
So, let's write this out in more detail and talk about why that is. If I went back to the slope scale, the original slope was Beta_one_hat equals 0.15. If I wanted the difference on the log hazard scale, 2.7 times Beta_one_hat, it would be 2.7 times 0.15, which equals 0.405, and I could then raise e to the 0.405 to get my estimated hazard ratio of 1.5. But what am I doing here if I write it out in pieces? This number, 0.405, is equal to 0.15 times 2.7. By properties of exponents, I can rewrite e to the 0.405 as e to the 0.15 power, raised to the 2.7 power, and e to the 0.15 power is just our hazard ratio of 1.16 for a one-unit difference in bilirubin. So, it's the original hazard ratio for a one-unit difference raised to the 2.7 power, the difference in x values for the two groups we're comparing. The same thing applies to the confidence interval endpoints. The confidence interval endpoints for the true value of Beta one (no hat, so the true log hazard ratio) were 0.124 and 0.176.
So, to get the confidence interval endpoints for the true value of 2.7 times Beta one, we multiply each of these endpoints by 2.7, and these are our confidence limits for the log hazard ratio of mortality comparing persons with a bilirubin of 3.5 to persons with a bilirubin of 0.8. Then, to get the confidence interval for the hazard ratio comparing 3.5 versus 0.8, we exponentiate these endpoints. But e to the 0.124 times 2.7 (I'm writing the product in reverse order here) and e to the 0.176 times 2.7 can be re-expressed as e to the 0.124, raised to the 2.7 power, and e to the 0.176, raised to the 2.7 power. The first base, e to the 0.124, is just the lower endpoint of the confidence interval for the original one-unit hazard ratio, and the second, e to the 0.176, is just the upper endpoint. So, we can take those endpoints on the hazard ratio scale and simply raise them to the power given by the difference in x values. This is just to show you that you don't necessarily have to go back to the log scale if you already have the hazard ratio itself, but you are always welcome to do so if you feel more comfortable with that.
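Here is a short Python sketch showing that the two routes agree, using the slope of 0.15 and standard error of 0.013 from the PBC example. The variable names are just for illustration, and the interval again uses two standard errors rather than 1.96.

```python
import math

slope, se = 0.15, 0.013   # bilirubin slope and standard error from the lecture
diff = 3.5 - 0.8          # difference in bilirubin between the two groups (2.7 mg/dL)

# Confidence interval endpoints on the slope (log hazard ratio) scale
lo_b, hi_b = slope - 2 * se, slope + 2 * se

# Route 1: scale on the log scale first, then exponentiate
hr_via_log = math.exp(diff * slope)
ci_via_log = (math.exp(diff * lo_b), math.exp(diff * hi_b))

# Route 2: exponentiate first, then raise the one-unit results to the 2.7 power
hr_unit = math.exp(slope)
ci_unit = (math.exp(lo_b), math.exp(hi_b))
hr_via_power = hr_unit ** diff
ci_via_power = (ci_unit[0] ** diff, ci_unit[1] ** diff)

print(f"log-scale route: HR = {hr_via_log:.2f}, CI = ({ci_via_log[0]:.2f}, {ci_via_log[1]:.2f})")
print(f"power route:     HR = {hr_via_power:.2f}, CI = ({ci_via_power[0]:.2f}, {ci_via_power[1]:.2f})")
```

With unrounded inputs, both routes land on the same numbers, roughly 1.40 to 1.61 around an estimate of 1.5. The slightly different 1.39 to 1.6 quoted earlier is what you would get by raising endpoints already rounded to two decimals (about 1.13 and 1.19) to the 2.7 power; that explanation is an assumption about rounding, not something stated in the lecture.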