Make friends with your data. For me, this quote from Rosenthal means looking at and playing with your data to understand it. And looking at your data means making graphs to visualize it. Therefore, in this video I want to show you how to visualize two variables and when a certain method is suitable to use.
Let's start with a short overview, and for that we take a look at this tree diagram. You first ask yourself, is my Y variable numerical or categorical? Next, ask yourself, is my X variable numerical or categorical? Let's start with the case that both variables are numerical. The appropriate technique for visualizing the data is a Scatterplot. When the Y variable is numerical and the X variable is categorical, you make a Boxplot with groups or an Individual Value Plot. When the Y variable is categorical and the X variable is numerical, you can make a transposed Boxplot with groups. And finally, when both variables are categorical, you should make a cross tabulation or a Stacked bar chart.
Now I will explain each of these four graphs, starting with the Scatterplot. For that, let's use the example of our coffee production again. We were wondering whether the caffeine content is related to the extraction time. The data that we gathered for that looks like this. We have a numerical Y and a numerical X variable. Now pause the video and load this data into Minitab before you continue.
I have copy-pasted my data into Minitab, and this is what your data in the worksheet should look like, with Measurement in the first column, Caffeine content, or percentage, in the second column, and Extraction time in the third column. Let's make a Scatterplot. For this we go to Graph, and you find Scatterplot as the first option here. You ask for a Simple Scatterplot, OK, and then you need to fill in what is my Y variable and what is my X variable. Let's start with the Y variable. Well, that is of course caffeine content, as that is your CTQ or your dependent variable.
And then for your X variable, it is extraction time. That's it, OK, and Minitab makes this Scatterplot for you. The Scatterplot looks like this. You see that the dots are clustered together and more or less form a straight line that goes down. Thus, in the data, the caffeine content decreases as the extraction time increases.
Scatterplots can also look like this. You see that the first one has a linear relationship with an outlier. In the next plot, we see a slightly curved line of dots. This is indicative of a curved relationship. In the next graph, you see that the data points are clustered. And in the final Scatterplot, we show data that is scattered all around, and therefore the variables show no relationship.
If we have a Y variable that is numerical but an X variable that is categorical, we can make a Boxplot or an Individual Value Plot. Let's go back to our example of coffee. The caffeine content in the coffee is still our Y variable, the CTQ. However, now I'm studying a different influence factor. I study the extractor number, the number of the machine that the coffee is put into. This is a categorical influence factor or X variable. Now pause the video and load the data before you continue.
I have copy-pasted the data into the Minitab worksheet, with Caffeine% in the first column, Extractor number in the second column, and Batch number in the third column. OK, let's make a graph of Caffeine% and Extractor number. For that we go to the Graph menu, and you can make a Boxplot, which you can find here, or an Individual Value Plot, which you find here. Let's focus on the Boxplot for now. OK, you have the choice between Simple, with one Y variable, or With Groups. As we want to show it for each extractor number, we select With Groups, OK? Which is our graph variable? Well, that's of course Caffeine%. Now what's your categorical variable? Well, that's Extractor number. That's it, OK. And Minitab makes this Boxplot for you. This is the Boxplot.
The lines in the boxes show you the median caffeine percentage. And the whiskers show you the maximum and the minimum caffeine percentage for each machine, apart from any outliers. See the video on visualizing numerical data for a more detailed explanation of the Boxplot. If you want your caffeine content to always be below 0.1%, which machine produces coffee that does not meet this specification? Well, it's machine three of course.
If you have a categorical Y variable and a numerical X variable, we also need to make a Boxplot. However, you want your Y variable on your vertical axis, so we need to transpose the original Boxplot. Consider this next example of students. We would like to know if a student's math grade in high school affects the likelihood that they pass the first year at university. Math grade is therefore your numerical X variable. And whether or not you pass, yes or no, is the categorical Y variable. Now pause the video and load this data into Minitab.
This is your data in Minitab, with Student in the first column, Math grade in the second column, and Passing the first year of university in the third column. We want to make a graph of Pass and Math grade. So we go to Graph, and we select the Boxplot. Next, we want to have a Boxplot with groups, because we want the math grades for each group, yes or no. OK, the graph variable has to be numerical, so we select Math grade, and the categorical variable is passing, yes or no. Now if you go to Scale, you can select Transpose, which means it will transpose the output, OK. This is our transposed Boxplot, which means that we have passing on our vertical axis and math grade on our horizontal axis. We see that the students who passed the first year have higher grades than the students who didn't pass the first year.
The final method that I will show you is for when you have a categorical Y variable and a categorical X variable. If this is the case, you can ask Minitab to make a Stacked bar chart.
Here we see data with a categorical X and a categorical Y. Now pause the video and load the data into Minitab. This is the data that is pasted into Minitab, with Patient number in the first column, the Specialist who treated the patient in the second column, and the Department where the patient was staying in the third column. Now, we're interested to know if each specialist goes to the different departments equally frequently. For that, we make a graph.
Let's go to Graph, and you go to Bar Chart, as we have categorical variables. We can ask for a Stacked bar chart to show the different categories, OK? Now, what do we do first? We select Specialist and we select Department, and as it says here, it will stack the categories of the last categorical variable, OK? And this is the Stacked bar chart that Minitab makes for you, with Specialist on the horizontal axis and Count on the vertical one. On the left axis you see the count, which is the number of times a certain category occurred. Specialist P treated approximately 100 patients. Approximately 80 of them were in department A1, and the others were in department A4.
In summary, the appropriate graph to visualize two variables depends on the type of data you have. If both variables are numerical, you make a Scatterplot. If the Y variable is numerical and the X variable is categorical, you can make a Boxplot. If the Y is categorical and the X is numerical, you can make a transposed Boxplot. If both variables are categorical, you can make a Stacked bar chart. And of course, there are many other graphs that I didn't show you here, but those are up to you to discover.
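For those who prefer to see the tree diagram written down, here is a minimal sketch in Python rather than Minitab. It is only an illustration of the decision rule from this video: the function names `choose_graph` and `cross_tabulate`, the labels "numerical" and "categorical", and the four example records (including specialist "Q") are assumptions made up for this sketch, not real data and not part of Minitab.

```python
from collections import Counter


def choose_graph(y_kind, x_kind):
    """Return the graph suggested by the tree diagram for a (Y, X) type pair.

    Both arguments are expected to be "numerical" or "categorical"
    (hypothetical labels chosen for this sketch).
    """
    suggestions = {
        ("numerical", "numerical"): "scatterplot",
        ("numerical", "categorical"): "boxplot with groups or individual value plot",
        ("categorical", "numerical"): "transposed boxplot with groups",
        ("categorical", "categorical"): "cross tabulation or stacked bar chart",
    }
    key = (y_kind, x_kind)
    if key not in suggestions:
        raise ValueError(f"unknown variable types: {key!r}")
    return suggestions[key]


def cross_tabulate(pairs):
    """Count how often each (X category, Y category) combination occurs.

    These counts are what the segments of a stacked bar chart display.
    """
    return Counter(pairs)


# Made-up (Specialist, Department) records, for illustration only.
visits = [("P", "A1"), ("P", "A1"), ("P", "A4"), ("Q", "A1")]
counts = cross_tabulate(visits)

print(choose_graph("numerical", "categorical"))
print(counts[("P", "A1")])
```

A lookup table is used here instead of nested if-statements because it mirrors the four leaves of the tree diagram one to one, so each suggestion can be checked against the summary directly.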