Analysis of Variance involved examining the relationship between a categorical explanatory variable and a quantitative response variable. Next, we're going to consider inferences about the relationship between two categorical variables. The statistical test that will answer this question is called the Chi-Square Test of Independence. Chi is a Greek letter that looks like a large X, so sometimes you'll see this test denoted with an X squared. For this statistical tool, let's start with a new example. In the early 1970s, a young man challenged an Oklahoma state law that prohibited the sale of 3.2% beer to males under the age of 21 but allowed its sale to females in the same age group. The case was ultimately heard by the US Supreme Court. The main justification provided by Oklahoma for the law was traffic safety. One of the three main pieces of data presented to the court was the result of a random roadside survey that recorded information on gender and on whether or not the driver had been drinking alcohol in the previous two hours. There were a total of 619 drivers under 20 years of age included in the survey. On the left, you can see what the collected data look like. And on the right is a two-way table summarizing the observed counts in the roadside survey. Our task is to address whether these results provide evidence of a significant, or statistically meaningful, relationship between gender and drunk driving. Both variables are two-valued categorical variables, and therefore our two-way table of observed counts is a two by two. The chi-square procedure is not limited to two-by-two situations. It can also be used for a larger number of explanatory categories. The key to reporting appropriate summaries for a two-way table is deciding which of the two categorical variables plays the role of the explanatory variable, and then calculating the conditional percentages separately.
That is, the percentages of the response variable for each value of the explanatory variable. In this case, since the explanatory variable is gender, we would calculate the percentage of drivers who did and did not drink alcohol for males and for females separately. Here's the table of the conditional percentages. >> For the 619 sampled drivers, a larger percentage of males than females were found to be drunk, 16% versus 11.6%. Our data, in other words, provide some evidence that drunk driving is related to gender. However, this in itself is not enough to conclude that such a relationship exists in the larger population of drivers under 20. >> We need to further investigate the data and decide between the following two points of view. That there's no difference in the drunk driving rate between males and females under 20, our null hypothesis. Or that there is a difference in the drunk driving rate between males and females under 20, our alternate hypothesis. In other words, is the evidence provided by the roadside survey, 16% versus 11.6%, strong enough to conclude beyond a reasonable doubt that it must be due to a relationship between drunk driving and gender in the population of drivers under 20? Or is the evidence provided by the roadside survey not strong enough to make that conclusion? Could this have happened just by chance, that is, due to sampling variability and not necessarily because a relationship exists in the population? These are the null and alternate hypotheses for the chi-square test of independence. Here are other ways that the null and alternate hypotheses can be stated for a chi-square test of independence. There is no relationship between the two categorical variables. They are independent. Or, there is a relationship between the two categorical variables. They are not independent. Algebraically, independence between gender and driving drunk is equivalent to having equal proportions of drivers who drank or did not drink among males and among females.
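The conditional percentages quoted above can be reproduced from the observed counts in the two-way table. The following is a minimal sketch; the dictionary layout and the helper name `conditional_percentages` are illustrative assumptions, not something shown in the lecture.

```python
# Observed counts from the roadside survey, grouped by the explanatory
# variable (gender). The layout and names here are hypothetical.
observed = {
    "male": {"drank": 77, "did_not_drink": 404},
    "female": {"drank": 16, "did_not_drink": 122},
}


def conditional_percentages(table):
    """Return the percentage of each response value within each group."""
    result = {}
    for group, counts in table.items():
        group_total = sum(counts.values())  # e.g. all male drivers
        result[group] = {
            response: 100.0 * count / group_total
            for response, count in counts.items()
        }
    return result


percentages = conditional_percentages(observed)
for group, row in percentages.items():
    print(group, {response: round(value, 1) for response, value in row.items()})
```

Each percentage is computed within a gender group, so the values for males sum to 100 and the values for females sum to 100.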
In fact, the null and alternate hypotheses could be reformulated as: the proportion of male drunk drivers is equal to the proportion of female drunk drivers. Or, the proportion of male drunk drivers is not equal to the proportion of female drunk drivers. The idea behind the chi-square test of independence, much like the analysis of variance, is to measure how far the data are from what is claimed in the null hypothesis. The further the data are from the null hypothesis, the more evidence the data present against it. Here, the gender and drunk driving data are represented by the observed counts. To represent the null hypothesis, we're going to calculate another set of counts: the counts that we would expect to see, instead of the observed ones, if drunk driving and gender were really independent. That is, if the null hypothesis were true. For example, we actually observed 77 males who drove drunk. If drunk driving and gender were really independent, if the null hypothesis were true, how many male drunk drivers would we expect to see instead of 77? We'll also ask the same kind of question about the other three cells in our table. If the null hypothesis were true, how many female drunk drivers would we expect to see instead of 16? How many non-drunk driving males would we expect to see instead of 404? How many non-drunk driving females would we expect to see instead of 122? >> In other words, we will have two sets of counts. The observed counts, that is, the data. And the expected counts, if the null hypothesis were true. We will measure how far the observed counts are from the expected ones. We will base our decision on the size of the discrepancy between what we observe and what we would expect to observe if the null hypothesis were true. How are the expected counts calculated? >> If events A and B are independent, then the probability of A and B equals the probability of A times the probability of B. We use this rule for calculating expected counts one cell at a time.
Applying the rule to the first, top-left cell: if driving drunk and gender were independent, then the probability of being drunk and male is equal to the probability of being drunk times the probability of being male. By dividing the counts in our table, we see that the probability of being drunk is equal to 93 divided by 619. And the probability of being male is 481 divided by 619. So the probability of being drunk and male is 93 divided by 619 times 481 divided by 619. Therefore, since there are a total of 619 drivers, if drunk driving and gender were independent, the count of drunk male drivers that we would expect to see is 619 times that probability, which simplifies to 93 times 481 divided by 619, or about 72.3, circled here in red. So the formula for calculating expected counts is column total times row total divided by table total. Following this formula, here are the complete tables of expected and observed counts. Importantly, the single number that summarizes the overall difference between observed and expected counts is the chi-square statistic, denoted as chi squared or X squared. It tells us, in a standardized way, how far what we observe, that is, the data, is from what we would expect to observe if the null hypothesis were true. Here's the formula. For each cell, we take the observed count, subtract the expected count, and square that value. This value is then divided by the expected count, and then this number is summed over all of the cells in the table. Once the chi-square statistic has been calculated, we can get a feel for its size. Is there a relatively large difference between what we observe and what the null hypothesis claims? Or a relatively small one? It turns out that for a two-by-two case like ours, we're inclined to call the chi-square statistic large if it's larger than 3.84. For our data, the chi-square statistic works out to about 1.64. Therefore, our test statistic is not large, indicating that the data are not different enough from the null hypothesis for us to reject it.
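The expected-count rule and the chi-square formula described above can be sketched in a few lines. This is a minimal illustration that uses only the survey counts from the lecture; the function names `expected_counts` and `chi_square_statistic` are hypothetical choices, and the table is assumed to be laid out with rows for drank and did not drink and columns for male and female.

```python
# Observed counts (assumed layout):
# rows = drank, did not drink; columns = male, female.
observed = [
    [77, 16],
    [404, 122],
]


def expected_counts(table):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [
        [row_total * col_total / table_total for col_total in col_totals]
        for row_total in row_totals
    ]


def chi_square_statistic(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
statistic = chi_square_statistic(observed)

for row in expected:
    print([round(value, 2) for value in row])
print(round(statistic, 2))
```

The same two functions work for tables larger than two by two, although the 3.84 cutoff mentioned in the lecture applies only to the two-by-two case.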
For cases other than a two by two, there are different cutoffs for what's considered large, which are determined by the null distribution in that case. Thus, we're going to rely only on the p value for conclusions. Even though we will not use the chi-square statistic directly, it was important to learn about it, since it encompasses the idea behind the test. The p value for the chi-square test of independence is the probability of getting counts like those observed, assuming that the two variables are not related, which is what is claimed by the null hypothesis. The smaller the p value, the more surprising it would be to get counts like we did if the null hypothesis were true. Technically, the p value is the probability, if the null hypothesis were true, of observing a chi-square statistic at least as large as the one observed. Using our statistical software, we'll find that the p value for this test is 0.201. The p value of 0.201 is not small at all. There's no compelling statistical evidence to reject the null hypothesis. And so we'll continue to assume it may be true. Gender and drunk driving may be independent. And so the data suggest that a law that forbids the sale of 3.2% beer to males and permits it to females is unwarranted. In fact, the Supreme Court, by a seven-to-two majority, struck down the Oklahoma law as discriminatory and unjustified.
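The lecture obtains the p value from statistical software without naming it. As a hedged sketch of what such software does in the two-by-two case, the upper-tail probability of a chi-square distribution with 1 degree of freedom can be written with the complementary error function from Python's standard library. The function names below are hypothetical, and the shortcut applies only to 1 degree of freedom, that is, to two-by-two tables.

```python
import math

# Observed counts (assumed layout):
# rows = drank, did not drink; columns = male, female.
observed = [
    [77, 16],
    [404, 122],
]


def chi_square_statistic(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / table_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


def p_value_one_df(statistic):
    """P(chi-square with 1 df >= statistic), valid for 2x2 tables only.

    With 1 degree of freedom the statistic is distributed as a squared
    standard normal, so the tail probability is erfc(sqrt(statistic / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


statistic = chi_square_statistic(observed)
p_value = p_value_one_df(statistic)
cutoff_p = p_value_one_df(3.84)  # tail probability at the lecture's cutoff

print(round(p_value, 3))
print(round(cutoff_p, 3))
```

The result should agree with the 0.201 quoted in the lecture, and the tail probability at 3.84 comes out close to 0.05, which is where that cutoff comes from. If SciPy is available, `scipy.stats.chi2_contingency(observed, correction=False)` is expected to report the same statistic and p value; the `correction=False` argument matters because SciPy applies a continuity correction to two-by-two tables by default.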