0:09

Hi. In this video we're going to talk about an alternative criterion,

the disjunctive cause criterion.

So the objective is to understand what the criterion is and given a DAG,

how to use it to identify a set of variables to control for.

So an alternative criterion to

the back door path criterion is what's known as the disjunctive cause criterion.

And this was developed by Tyler VanderWeele in 2011.

So it's simpler than the back door path criterion in many ways. So here's what it is.

All you have to do is control for all direct causes of the exposure,

the outcome or both.

And that's really all there is to it.

So we actually would not have to write down the whole DAG,

the whole causal graph;

we would just have to list the variables that affect the exposure,

the outcome, or both.
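
The rule can be sketched in a few lines of code. This is a minimal illustration, not from the lecture; the function name and the dictionary-of-parents representation are my own assumptions.

```python
# Minimal sketch of the disjunctive cause criterion (hypothetical helper,
# not from the lecture). We represent the graph only by each node's
# direct causes (parents) - the whole DAG is not needed.

def disjunctive_cause_set(parents, exposure, outcome):
    """Control for every direct cause of the exposure, the outcome,
    or both, excluding the exposure and outcome themselves."""
    causes = set(parents.get(exposure, [])) | set(parents.get(outcome, []))
    return causes - {exposure, outcome}

# Toy usage: V is the only direct cause of A; A and W are the direct
# causes of Y.
parents = {"A": ["V"], "Y": ["A", "W"]}
print(sorted(disjunctive_cause_set(parents, "A", "Y")))  # ['V', 'W']
```

Note that the function never needs the full edge list, only the parent sets of the exposure and the outcome.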

So recall the back door path criterion:

there were a lot of rules that we had to

know, in terms of when a back door path is blocked,

how to deal with colliders and those kinds of things.

So that criterion

required quite a bit of knowledge about causal graphs,

whereas this one is much simpler to explain

and to understand, and it also does not require knowledge of the entire graph.

So here's an example.

So again, we're interested in the exposure A and

the outcome Y, and we want to control for confounding.

In this example, there's one direct cause of

A - so there's one direct cause of treatment - and that's V. So remember,

a direct cause really means a parent in this case.

So A has one parent,

which is V. So there's one node that's

directly causing treatment or directly affecting treatment.

In this case, there's also one direct cause of Y,

which is W. Well, Y actually has two parents,

but one of them is treatment;

we're interested in all direct causes of Y that are not treatment,

because those are what we want to block.

There are no direct causes of both; so in this case,

we would have to control for V and W based on the disjunctive cause criterion.

So the criterion has a unique solution, and in this case,

it's just: control for both V and W.

So here's a second example.

And remember in this one,

there's one back door path and there's a collider at M,

so we actually wouldn't have to control for anything.

But if you're using the disjunctive cause criterion,

what you would control for is the following.

So first, you would look for direct causes of treatment.

So there's one direct cause of A,

which is V. There's one direct cause of Y,

which is W. And we should therefore control for both V and W.

So you'll notice that in the first example,

V causes A and W causes Y.

In this example, again, V causes A

and W causes Y - but the graphs are different.

Yet in either case,

it's sufficient to control for V and W. So if all

we knew was that V is the only thing that causes A and W is the only thing that causes Y,

we would get the right answer in terms of what to control for, even if we didn't know

whether the graph in example one was correct or the graph in example two was correct.

So either of those graphs could be correct, and as long as

we get right which variables directly cause A and Y,

then we'll control for the right variables.

So that's an important feature of

this criterion: you don't need to know the whole graph,

just the direct causes of A and Y.
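
This point can be sketched directly. The exact edges of example two aren't spelled out in the transcript, so the second graph below is an assumed version with the collider at M; the helper function is my own illustration. Two different DAGs that agree on the parents of A and Y give the same adjustment set, because the criterion never consults the rest of the graph.

```python
# Sketch: two different DAGs, same parents of A and Y, same answer.
# graph2's edges (V -> M <- W) are an assumption consistent with the
# collider at M described in the second example.

def disjunctive_cause_set(parents, exposure, outcome):
    causes = set(parents.get(exposure, [])) | set(parents.get(outcome, []))
    return causes - {exposure, outcome}

graph1 = {"A": ["V"], "Y": ["A", "W"]}                   # example one
graph2 = {"A": ["V"], "M": ["V", "W"], "Y": ["A", "W"]}  # assumed example two
set1 = disjunctive_cause_set(graph1, "A", "Y")
set2 = disjunctive_cause_set(graph2, "A", "Y")
print(set1 == set2, sorted(set1))  # True ['V', 'W']
```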

So here's another example.

So here there are actually two direct causes of A;

so W directly affects A, as does Z.

So Z and W are direct causes of A.

As far as direct causes of Y,

it's only V. So based on the disjunctive cause criterion,

we should control for Z,

W and V. And this example was one we looked at with

the back door path criterion, where we saw that one of the valid sets to control for was Z,

W and V. It happens to be the largest set in this case,

but the disjunctive cause criterion

will always give you a correct answer if there is one.

And it doesn't require knowledge of the whole graph.

But it does not necessarily give you the smallest set.
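
Running the third example through the same hypothetical helper as before shows this: the criterion returns the largest valid set here, even though a smaller set would also suffice.

```python
# Sketch of the third example: Z and W are the direct causes of A,
# and V (besides A) is the direct cause of Y.

def disjunctive_cause_set(parents, exposure, outcome):
    causes = set(parents.get(exposure, [])) | set(parents.get(outcome, []))
    return causes - {exposure, outcome}

parents = {"A": ["Z", "W"], "Y": ["A", "V"]}
print(sorted(disjunctive_cause_set(parents, "A", "Y")))  # ['V', 'W', 'Z']
```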

Okay. So we'll look at another example.

In this case there are two direct causes of A - Z and W. Now,

there are two direct causes of Y - V and M. And therefore,

we should control for Z, W,

V and M. So hopefully by now

you're beginning to understand

the criterion and that it's actually very easy to apply in practice.

And in fact, if you were going to use this criterion,

you wouldn't even need to write down a graph.

If you're a biostatistician or a statistician and

you're collaborating with a researcher who is an expert on the subject,

you would ask them:

what do you believe are the direct causes of A?

And what that would really mean is if you're talking about a treatment,

what variables affect the treatment decision?

And then when it comes to direct causes of Y,

well what are the risk factors for Y?

A lot of times there's a pretty big literature

about risk factors for a particular outcome.

So you would put that information together and that set of

variables would be what you would want to control for.

I'll look at one more example. Compared with the previous example,

all I did was add some more nodes to the graph.

And the reason I wanted to do that was to show you that even if

you make the graph much more complicated,

really, the thing you need to home in on is just the direct causes of A and Y.

So even though this looks very complicated,

all we really have to do is look to see which nodes are the parents of A; in this case,

it's just Z and W. And we also want to look for the parents of Y besides treatment.

So there's two direct causes of Y,

which are V and M and therefore,

we would just control for Z, W,

V and M. So even though this graph looks complicated,

deciding which variables to control for is relatively straightforward.
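
This last point can be checked with the same hypothetical helper. The extra node names (U1, U2) below are made up, since the transcript doesn't name the added nodes; the point is that nodes outside the parent sets of A and Y never affect the answer.

```python
# Extra nodes (hypothetical names U1, U2) don't affect the result;
# only the parents of A and Y matter.

def disjunctive_cause_set(parents, exposure, outcome):
    causes = set(parents.get(exposure, [])) | set(parents.get(outcome, []))
    return causes - {exposure, outcome}

parents = {
    "A": ["Z", "W"],
    "Y": ["A", "V", "M"],
    "Z": ["U1"],  # assumed extra upstream node
    "W": ["U2"],  # assumed extra upstream node
}
print(sorted(disjunctive_cause_set(parents, "A", "Y")))  # ['M', 'V', 'W', 'Z']
```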

So as an overview of this criterion,

the main drawback is that it does not typically select

the smallest set of variables that you could control for,

because you're typically going to end up

controlling for more variables with this approach.

However, it's conceptually much simpler,

as I've talked about a fair amount.

So it's simple to explain

and relatively simple to implement, but the drawback is

you might end up needing to control for a larger set of variables.

So where we're going next, then,

has to do with what to do with these variables.

We've talked about blocking these back door paths,

so we basically know how to

identify a set of variables that we need to control for,

but how do we actually control for them?

So there are many approaches that you could use to control for variables.

One of the most common methods is regression modeling.

So that's something we're not going to spend time on,

but it is one approach that,

under certain assumptions, can lead

to valid estimates of causal effects.

We're going to focus on a couple of methods that are more

directly geared towards causal inference.

So one is matching and another is inverse probability of treatment weighting.

So these are two different methods that you can use to control for

confounding so that you can ultimately estimate causal effects.