Random assignment is another way we can control for these other factors.

The idea is that if every observation in the sample has an equal probability

of being in each of the groups, and truly, randomly end up in one group or

another, then the groups end up balanced in terms of the other factors.

So if age is a factor, then the groups should have the same age variability, and

this equal variability essentially controls for that factor.

And this should be the case for

any other factor. However, randomization doesn't always work the way we want it to.

In fact, randomization works best as your sample size approaches infinity.

Unfortunately, we work with finite samples, which can often be pretty small.

The smaller the sample, the greater the risk that the groups will be unbalanced on

factors that could affect how the treatment affects the response variable.
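To make the sample-size point concrete, here's a rough simulation sketch. The ages, the even split into two groups, and the function names are all assumptions made up for illustration. It randomly assigns a simulated sample to two groups many times and reports the average gap in mean age between them.

```python
import random
import statistics


def mean_age_gap(sample_size, rng):
    """Randomly split a simulated sample in half and return the absolute
    difference in mean age between the two groups."""
    # Hypothetical ages, drawn uniformly between 18 and 80.
    ages = [rng.uniform(18, 80) for _ in range(sample_size)]
    rng.shuffle(ages)  # random assignment: shuffled order decides the group
    half = sample_size // 2
    treatment, control = ages[:half], ages[half:]
    return abs(statistics.mean(treatment) - statistics.mean(control))


def average_gap(sample_size, trials=500, seed=0):
    """Average the age gap over many repeated random assignments."""
    rng = random.Random(seed)
    return statistics.mean(mean_age_gap(sample_size, rng) for _ in range(trials))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(average_gap(n), 2))
```

With these made-up settings, the average gap should shrink as the sample grows, which is the sense in which small samples carry more risk of imbalance.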

If part of your job as a data analyst is to evaluate data from studies with

random assignment, one of the first things you'll want to do is to check for

any imbalances between your treatment and control groups

on key variables that could change how the treatment affects the response variable.
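One way such a balance check might look is sketched below. It uses the standardized mean difference, which is one common choice and not something this lecture prescribes. The ages are invented, and the 0.1 cutoff is a rule of thumb assumed here for illustration.

```python
import statistics


def standardized_mean_difference(treatment, control):
    """Difference in group means divided by the pooled standard deviation."""
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    pooled_sd = (
        (statistics.variance(treatment) + statistics.variance(control)) / 2
    ) ** 0.5
    return mean_diff / pooled_sd


# Hypothetical ages for two small groups after random assignment.
treatment_ages = [34, 45, 52, 61, 58, 49]
control_ages = [29, 31, 40, 38, 35, 44]

smd = standardized_mean_difference(treatment_ages, control_ages)

# Assumed rule of thumb: flag the variable if |SMD| exceeds 0.1.
imbalanced = abs(smd) > 0.1
print(f"SMD for age: {smd:.2f}, imbalanced: {imbalanced}")
```

In this made-up example the treatment group is noticeably older, so age would be flagged as a variable to look at more closely.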

If imbalances are identified, then those variables can be included in

the statistical model to predict the response variable, so

that they can be statistically controlled.

Statistical control is another commonly used strategy.

If we include additional explanatory variables that could affect

the association between the treatment and the response, then we can examine that

association after adjusting for the other explanatory variables.
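As a minimal sketch of what statistical control can look like, the example below fits a least-squares model with and without age as an additional explanatory variable. Everything in it is an assumption for illustration: the data are invented and noise-free (response = 2 × treatment + 0.5 × age), and the helper functions are hypothetical stand-ins for whatever regression tool you normally use.

```python
import statistics


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: the treatment group happens to be older.
treatment = [1, 1, 1, 1, 0, 0, 0, 0]
age = [50, 55, 60, 65, 30, 35, 40, 60]
response = [2.0 * t + 0.5 * a for t, a in zip(treatment, age)]

# Unadjusted: simple difference in mean response between the groups.
treated = [y for t, y in zip(treatment, response) if t == 1]
untreated = [y for t, y in zip(treatment, response) if t == 0]
unadjusted = statistics.mean(treated) - statistics.mean(untreated)

# Adjusted: coefficient on treatment with age included in the model.
design = [[1.0, t, a] for t, a in zip(treatment, age)]
intercept, adjusted, age_coefficient = ols(design, response)

print(f"unadjusted difference: {unadjusted:.3f}")
print(f"adjusted for age:      {adjusted:.3f}")
```

Because the treated group is older in this invented data, the unadjusted difference is much larger than the built-in treatment effect of 2, while the age-adjusted coefficient comes out close to 2.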

While these are all good strategies

for imposing as much control on a study as possible,

they're not perfect.

Nor can we possibly control for everything that could affect the association between

the treatment and response variable.

For that reason, unlike a true experiment in which we are able to hold

every other possible variable constant, we cannot determine causality.

We can only determine whether the treatment is associated with

the response variable.

Sometimes, we can't randomly assign people to a treatment or control group.

In many cases, it would be unethical to do so.

For example,

if we're conducting a study to examine the association between cocaine use and

memory processing, there's no way we could assign some participants to use cocaine.

This would be completely unethical, and

we would put our participants at significantly greater risk of harm.

That risk would certainly not be outweighed by the benefit of any knowledge that would be

gained by the study.

Instead, we would have to identify people who either test positive for, or

self-report, cocaine use, and

then test for memory processing differences between users and non-users.

The manipulation of the explanatory variable is based on the fact

that our treatment and control groups are pre-selected.

In this study, cocaine users would be in our treatment group and

non-users would be in our control group.

So while it looks like an experimental design, it is missing the random

assignment piece, and we call this a quasi-experimental design.

We can increase the rigor of a quasi-experimental design

by measuring as many confounding variables as possible,

having a control group, and using a pre- and post-test design whenever possible.
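As a loose illustration of a pre- and post-test design with a control group, the sketch below compares the average change in each group. The scores, group labels, and function name are all hypothetical, and comparing change scores this way is just one simple option.

```python
import statistics


def mean_change(pre, post):
    """Average post-minus-pre change across paired measurements."""
    return statistics.mean(after - before for before, after in zip(pre, post))


# Hypothetical test scores measured before and after for each group.
treatment_pre = [52, 48, 50, 47]
treatment_post = [49, 44, 47, 44]
control_pre = [55, 53, 56, 54]
control_post = [54, 53, 55, 54]

treatment_change = mean_change(treatment_pre, treatment_post)
control_change = mean_change(control_pre, control_post)

# How much more the treatment group changed than the control group did.
difference = treatment_change - control_change
print(f"treatment change: {treatment_change}, control change: {control_change}")
print(f"difference in change: {difference}")
```

The control group's change gives a baseline for how much scores drift on their own, so the comparison is a bit more rigorous than looking at the treatment group alone. It still describes an association only.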

A quasi-experimental design will not allow us to infer causality

between an explanatory variable and our response variable.