Confounding can lead to biased estimates and produce misleading results.

Therefore, it is something that we should know about when designing,

conducting, or critiquing a study.

But how can we know if there is confounding?

There's no statistical test for confounding.

There are, of course, statistical methods that can help us make an informed decision,

but it depends largely on our judgement.

In this lecture, I will share with you

four commonly used ways to identify

potential confounding factors in an epidemiological study.

Let's consider an example of a study which aims to investigate

the association between dog ownership and mortality among the elderly.

Some previous studies have found that owning

a dog can be associated with higher life expectancy.

One straightforward way to identify factors that could confound this association

is to explore the literature.

Knowledge of the subject matter can

heavily influence our decisions regarding confounding.

For example, if other studies have shown evidence that the size of the city

where people reside is a confounder in

the association between dog ownership and mortality,

we have every reason to consider it as a confounder in our study.

Knowledge of plausible biological pathways can similarly help us identify confounders.

However, this is not always possible,

especially when we explore novel associations for which prior research is scarce.

In such cases, we can examine whether

the variable of interest satisfies the following three conditions.

It is associated with the exposure in the source population,

it is associated with the outcome in the absence of the exposure,

and it is not a consequence of the exposure.

In other words, it is not in the causal path between the exposure and the outcome.

If we stick to the same example, with dog ownership as

our exposure and mortality as

our outcome, and we would like to explore whether age may be a confounder,

we would need to answer the following questions.

Is age associated with dog ownership among the elderly?

Is age associated with mortality among those who do not own a dog?

Is age in the causal path between dog ownership and mortality?

We can only answer the first two questions once we analyze data from the study.

But let's assume that age is associated with both the exposure and the outcome.

The answer to the last question is obvious here:

owning a dog cannot change your age.

So, age is not in the causal path.

Age satisfies all three conditions.

Therefore, we identify it as a confounder in this study.
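
To make the first two questions concrete, here is a minimal sketch in Python. The counts are made up for illustration and do not come from any real study, and the names `counts`, `group_size` and `odds_ratio` are hypothetical helpers. The sample stands in for the source population, and the third condition is not computed at all because it rests on judgement rather than data.

```python
# Hypothetical counts, invented for illustration only.
# Each entry: (deaths, survivors) for one age group and ownership status.
counts = {
    ("under_80", "owner"): (60, 400),
    ("under_80", "non_owner"): (200, 800),
    ("80_plus", "owner"): (186, 300),
    ("80_plus", "non_owner"): (150, 150),
}


def group_size(age, ownership):
    """Number of participants in one age group and ownership status."""
    deaths, survivors = counts[(age, ownership)]
    return deaths + survivors


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a / b) / (c / d)."""
    return (a * d) / (b * c)


# Condition 1: is age associated with dog ownership?
# Odds of owning a dog among those 80 and above versus those below 80.
or_age_ownership = odds_ratio(
    group_size("80_plus", "owner"), group_size("80_plus", "non_owner"),
    group_size("under_80", "owner"), group_size("under_80", "non_owner"),
)

# Condition 2: is age associated with mortality among non-owners?
# Odds of death among those 80 and above versus those below 80, non-owners only.
old_deaths, old_survivors = counts[("80_plus", "non_owner")]
young_deaths, young_survivors = counts[("under_80", "non_owner")]
or_age_mortality_unexposed = odds_ratio(
    old_deaths, old_survivors, young_deaths, young_survivors
)

print(f"Age and dog ownership: OR = {or_age_ownership:.2f}")
print(f"Age and mortality among non-owners: OR = {or_age_mortality_unexposed:.2f}")
```

In this sketch, an odds ratio far from 1 for both questions would suggest that age meets the first two conditions. How far is far enough remains a judgement call.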

A different way to think about this is to stratify data by the variable of interest,

which is age in our example,

and compare the stratum-specific estimates with

the estimate that we get when we analyze the entire set of data from the study.

In our study, we will need to split our sample by age,

below 80 versus 80 and above, for example,

and calculate the odds ratio in each subgroup.

We might find that owning a dog is associated with 40 percent lower odds of death among

those below 80 years old and 38 percent lower odds among those at least 80 years old.

But when we analyze the entire sample together,

we could find that owning a dog is associated with only five percent lower odds of death,

which, of course, doesn't make sense when you consider the stratum-specific numbers.

When the pooled estimate is considerably

different from what you would expect based on stratum-specific estimates,

it is very reasonable to think that there is confounding.
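
The stratified comparison can be sketched in a few lines of Python. The counts below are hypothetical, chosen only so that the odds ratios roughly match the numbers in the example; the names `strata` and `odds_ratio` are illustrative, not part of any library.

```python
# Hypothetical 2x2 counts per age stratum, invented for illustration only.
# Order: (owner deaths, owner survivors, non-owner deaths, non-owner survivors)
strata = {
    "under_80": (60, 400, 200, 800),
    "80_plus": (186, 300, 150, 150),
}


def odds_ratio(a, b, c, d):
    """Odds of death in owners (a / b) over odds of death in non-owners (c / d)."""
    return (a * d) / (b * c)


# Stratum-specific odds ratios.
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}

# Crude (pooled) odds ratio: collapse the strata by summing cell by cell.
pooled_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*pooled_table)

for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.2f}")
print(f"pooled: OR = {crude_or:.2f}")
```

With these made-up counts, the stratum-specific odds ratios are 0.60 and 0.62, while the pooled odds ratio is about 0.95. The pooled estimate sits well outside the range of the stratum-specific ones, which is the pattern that points to confounding by age.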

Lastly, there is a fourth way to detect confounding that you might come across.

Let's say we use a simple logistic regression model to estimate

the crude odds ratio that expresses the strength of

the association between dog ownership and mortality in our study.

When we include age in the regression model,

we estimate the adjusted odds ratio,

adjusted for age in this case.

If the adjusted odds ratio differs from the crude odds ratio by 15 percent or more,

this may indicate confounding by age.

This threshold is arbitrary, and a change of this size may not always reflect true confounding.

It could be that we introduce confounding by adjusting for an additional variable.

This is not the optimal method to identify confounding but can

sometimes flag occasions where further investigation is required.
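
Here is a rough sketch of the change-in-estimate check in Python. Two things in it are assumptions rather than part of the method as described. First, to keep the example free of dependencies, it uses the Mantel-Haenszel pooled odds ratio as the age-adjusted estimate instead of fitting a logistic regression model. Second, it measures the change relative to the crude odds ratio, which is one of several conventions. The counts are hypothetical and the function names are illustrative.

```python
# Hypothetical 2x2 counts per age stratum, invented for illustration only.
# Order: (owner deaths, owner survivors, non-owner deaths, non-owner survivors)
strata = [
    (60, 400, 200, 800),   # below 80
    (186, 300, 150, 150),  # 80 and above
]

THRESHOLD = 0.15  # the 15 percent cut-off, which is arbitrary


def crude_odds_ratio(tables):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a, b, c, d = (sum(cells) for cells in zip(*tables))
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel pooled odds ratio, used here as the adjusted estimate.

    This is a stand-in for the adjusted odds ratio from a logistic regression
    model; the two are usually close but not identical.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


def relative_change(crude, adjusted):
    """Absolute change in the odds ratio, relative to the crude estimate (assumed convention)."""
    return abs(adjusted - crude) / crude


crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
change = relative_change(crude, adjusted)
flag = change >= THRESHOLD

print(f"crude OR = {crude:.2f}, adjusted OR = {adjusted:.2f}")
print(f"relative change = {change:.0%}, flag for further investigation: {flag}")
```

With these made-up counts, the crude odds ratio is about 0.95 and the adjusted estimate about 0.61, a relative change of roughly 36 percent, so the check would flag age for further investigation.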

My experience is that people often assume that they need to use all these methods.

So, let me clarify this.

You only need one of the above methods to identify confounding.

If you can make a decision based on your knowledge of the subject matter,

you don't need to stratify or explore whether the three conditions are satisfied.

In conclusion, there are multiple ways to think about confounding.

But at least to some extent,

we need to use our judgement to decide which factors may cause confounding.

As you will see, this is a critical decision

because it will inform the design and data analysis of our study.