Hi everyone and welcome to our lab, where we're actually going to make some scatter plots, and show you how to get these statistics R and R squared that we've been talking about in prior videos. So what I have in front of me is a nice spreadsheet here. Just to go through the columns, we have TV, radio, newspaper sales. This is money that is spent on advertising in TV, radio and newspapers. And then these are the sales that result from them. So you can assume these are in thousands of dollars, millions of dollars, doesn't quite matter. When you get a spreadsheet though, it's usually best to sort of go through see how large it is. This particular one has 200 rows. We're on row 201 cause I have a header row here. So nice big spreadsheet, not the largest one that I've ever seen, but certainly not the smallest one as well. So I am using a PC Microsoft Excel, whatever version I'm on, I actually don't know the number, but yours might look different. Maybe you're doing a different software package, maybe you're using something in the cloud, maybe you're doing just some different version of Excel with Excel though, and there's many tools like scatter plots is just one of them. It'll be similar where your stuff is. And then maybe as soon as I record this video they will update the look and feel, rendering this video sort of obsolete. But it is you can find the features, the functionality, the button, names won't change. So you can always find out how to do this. All right, so let's start with everything. So what you want to know is, does TV correlate the sales, does radio correlate the sales, does newspaper advertisement correlate to sales? And for this lesson we're going to do things one at a time. So we can do a couple of things. So you can highlight two separate rows, separately. The way you do that is always hold Ctrl+Down and highlight these things. And then you go to Insert, go to the Insert tab. And there's all these recommended graphs and stuff like that. I don't usually like looking directly into graphs cause who knows what they're going to recommend, but in this particular case, we want a scatter plot. So we're going to go and find our scatter plot, which is, the scatter button here. And we get a bunch of options appear on the top, and you can see we have a nice scatter here. So a couple things with these options, a lot of them are just formatting like font, and dot size, and if you want a black background, so now it looks like a nighttime sky. I don't recommend these, if you work for a company, and you're doing this for a client or somebody like that. They'll have a style guide, they'll have a preferred way to do it. They'll tell you what colors to use. So, all I care about, we'll just pick the first one just for our sake here. One thing that beginners tend to sort of not realize is that when I click on the chart I have all the chart menu options. When I click off of the chart, if you notice, the tool, the tab, where'd it go? It's gone. Because Excel assumes since you're clicking outside of the chart, you're done working on the chart. And so to try to make your life easier, it removes the tab with the options to work on the scatter plot. If you want to click and you want to see it, so there's the Chart Tools that appears, click off it, it goes away and click on the slide, it appears again. We're going to watch out for that. A lot of students will have the mouse over here and then say, but I don't know how to change the thing or whatever. So, when you are inside of it, the menu will appear. Okay, let's update the title. Double click on the title. Excel's pretty bad about giving the title so let's do TV for sales, right? So always have a good title. If you want to insert x and y axis label, you probably should, so we can add a chart element. That's again clicking on the chart >add chart element and you see many many options here. One of the options is axis title, primary horizontals is your x one. Let's do it again, we'll add a vertical one, so our title here. So our x axis is our first, this is our TV. And our second one here, you get to practice writing sideways, this is Sales. You'll know when it's graph, okay. So let's look at the graph. Just from a general visual standpoint, the data points seem to be going up into the right. So, it seems to have a positive association. It would make sense, just from common sense, the more money I spent on TV advertising, the more sales that I would get, you hope that that's the case. So you start seeing this sort of shift in trend up to the right. Now let's get some more statistics for this thing. How do I add my best line, my linear regression line? Click on the chart >chart element, and one of the options is trend line. You're going to add a linear trend line. And there it is. Now of course, they put it in blue and the dots are blue. So maybe we can play around with it and change the color. How do you adjust something in Excel? Click on the item and the menu tends to appear once you click on that item. You can see colors here, we can change it. Let's pick something kind of gross so it stands out. How about purple? So better, maybe, I don't know, pick your favorite color. The point is this one, you'd be able to see it. And maybe we could even make the width a little bigger here as well. Okay, and let's do a solid line. So we're going to follow this and then play around with these things, Dash type. Fancy. All right, maybe a little better to see. You can always resize the graph too, to make it as large or small as you want. So I've updated the title, and in this we have a line. If you notice, though, while it draws the line, it doesn't give us the equation. You have to go get the equation. Okay, so how do we do that for the equation? There's a few ways to grab it. And if you click outside the chart and add a chart elements, so we're adding something to the chart. So this is, going to come to the add chart element. Trend line is here. And then we're going to grab more trend line options, we're going to get not just the line, we want more. And once you're on this chart to the right, you have three options here. There's like paint bucket, which gives you some color options that we're playing around with line. This is even for more effects on shadow glow, soft edges. And then there's like a little histogram here that you can click as an icon, and you get some different options. One of the options down here at the bottom, at least for me, it says display equation on the chart. If we click that, it puts it there. Now of course for no good reason it puts in a spot that's a little difficult to see but it's there. So I like to move it up somewhere where it's a little more clear background. I don't know why, it always chooses like the most awkward place read this thing. But there it is. So now you haven't. So y equals 0.0475x plus 7.0326. That is good old mx+b. And that is the equation of this line. And you can use that line, you can use that line to now make predictions wherever you don't have a data point. So for example, if I were to spend $250, like it's on the list. Well I can plug it into this equation and get an output for sales that start making predictions. What have we spent this much? What should we expect? What do the sales back? Remember we want to stay within our range. Based on this scatter plot here, we have $0 all the way up to 300, so any picking number between 0 and 300 is perfectly fine. Anything outside of 300, I wouldn't really bet your career on those predictions. It's a risky bet, that's called extrapolation. We want to stay within a range of our data. One other thing that we talked about, sometimes we say, well, how strong is their way to measure how strong this correlation is? We can look at our linear correlation coefficient. So how do I add that in there? More so click on the graph >Chart Tools option, and we're going to add a trend line. We go back to our trend line options and go back to the histogram. And we're right underneath where it said display the equation on the chart, we can display the r squared value on the chart. That's another just checkbox. Of course, it puts it in a sort of awkward place. And we can move it to somewhere where it's a little more readable. So this is r squared = 0.6119. So remember how to interpret this number. This gives you a sense of how good the model is. This is saying that 61% ,usually talks about 2%, 61% of the variance and y Is explained by x. So 61% of sales is explained by TV ads. Which says that there's 40% of something else that is contributing to sales that we don't really know about. But the variance, the change in y, the spread of y is from r squared. So the model is pretty good, 0.6 If you want to look at that's r squared, you can also get R if you want, you can calculate Excel doesn't give this to you for free, you can calculate it. Let's just create a cell down here with R equals. And you can take the square root of this number, because you know in Excel to do with an equation you start off with equals. The square root is SqrT, squirt. Very cute. So equals SqrT and then you can just type in. So of course, you can also grab a calculator and just find the square root of this thing but 0.6119. And this is just acting like calculator if you do square root of 0.6619. So our R value gives you more decimals than you'd ever need. We can shrink it using one decimal. It's about twenty. Remember that R falls between one and negative one. The closer that we are to one, the better the linear correlation is, the stronger you can feel about your prediction. So this is saying that yeah, if you increase TV, then it's a pretty good predictor that sales will also rise. They're strongly correlated pointing, it's pretty darn good. And you can tell from the picture, these points tend to follow a pattern, they're close to the line. And obviously if you spend a little more at some point, you going to get as much bank for your buck. But some people in marketing would use this information and decide. And they could monitor when the marketing campaign is sort of reaching its peak, or ready to start a new campaign. Let's go through the same process, let's do one more. So I'll put this guy aside, move this over here, and let's do another one for newspaper and sales. Newspaper, what's a newspaper? So you advertise in newspaper, they're still out there, they exist. What's the correlation of sales. So let's do it again. These two are next to each other. So I'll just highlight the two. And let's go a little faster. This time, I will insert a scatter plot. Well, look at these crazy charts. So a scatter plot is here, and I'll produce it right on the screen. And this one here you can tell it has a little more random pattern. It sounds a little more random. It's not as nice a pattern as we saw before. Whenever you get a chart, always update your title so you know what you're looking at. So this is newspapers, marketing or sales. We can also update our x & y axis. So we go to axis title, primary horizontal. Let's add both the horizontal and the vertical, the x and the y. So your x axis is Newspaper's. Your y axis will be your Sales. Now remember one of the properties of R and R square is that you could switch these, It doesn't matter. Whenever you're highlighting a couple things, if you're not sure which one is which, one of the options at the top that you're going to select, right click and get it, just select the data. It tells you, you can see which is your row and which is your column. You can also probably look at your graph and see, pick up what it's doing. But when you highlight the leftmost, it becomes your x axis and the rightmost becomes your y axis. So we have our scatter plot. Again, compare the difference here. You can see the difference in pattern or lack of regular pattern. And perhaps that's not surprising. If you put a tonne of money in newspaper sales given today's day and age, it might not be as predictable, who knows? So let's go through and let's sort of see that from the data. Let's add first our chart element of a trend line. So we'll go down to add trend line >more trend line options >click on little histogram and we're going to display the equation chart. So let's go back to add a chart element >trend line >more trend line options. We'll add our square. Your Excel, your whatever using the look and feel a little different. But hopefully you can find all these things and you may just have to search online for how to find these particular, like where the buttons are the check. This is the hardest part. Every program will give you the equation. Every program will give you R squared. But how does it work, where the buttons are, that on the left, on the right, might be a little depending on the version and the OS you're using. All right, so here's our line. So like, computer stupid, it does whatever you tell it to. I wanted it to find for me an equation of a line that fits the data. Sure, no problem. Let's do the same thing. Let's not make the color match the dots just so we can see it. And we can make it a little thicker. And let's change the line to solid 7 option there, okay. So you can see the line. So it does have a small positive correlation, right? There is a positive slope to this line. The line is telling you that goes up a little bit. We have some outliers here that are probably affecting the data, but it is positively correlated. So if you put some money towards a newspaper ad, you will get a positive increase in sales. However, clearly, from the data showing not as much. You can see the slope of this line is 0.0547, and the slope for this line is 0.0475. So we have these different values for our slopes. The intercepts we don't really care about, don't quite care about as much. The r value is also similar but it's less, right, so this is 0.05, this is the difference here. And r squared for TV sales was 0.6. So we had 61% of the variance of the y value, the sales was explained by the TV marketing. And here, look at how small this is 0.05, 5%, so newspaper, the explanation of what's his contribution the variances is 5%. Very small. And if you want to find what r is equal to, so what is r equal to? Let's just give a little label here. So r equals the square root, I'll use excels a calculator 0.0521. And that's 0.2, more decimals than you'll ever need. So let's just say 0.2, remember the closer you are to zero, the worse your linear relationship is. And so TV vs Sales, Newspaper vs Sales. So you can see that TV, and perhaps, not surprising, this gives you a way to measure this. This gives you a way to actually put some numbers behind your intuition to do some comparison. So it's not an opinion anymore. But I think TV will be better than newspaper. It's not like here's what it is, here's how much. Here's the R value 0.8 versus 0.2. Clearly, there's a stronger linear correlation between TV and sales versus newspaper sales. And it's not just sort of guessing. Sometimes the data, it matches what your intuition would say and sometimes it doesn't, so it's best to go through these. I won't close this spreadsheet so you can play around with it. And you can play around radio as well, or you can start mixing, matching and making some different graphs. But I just want to show you how to do scatter plot, how to get the equation, how to get r squared and how to find r using Excel. There's one other software that I want to show you. And it's web based software. So if you don't have Excel or something like that, don't panic, you can still go through. So let me take the data here and I'll highlight this. There's a faster way to do this and just drag it but for now, so if you open desmos.com, desmos.com is one of my favorite other graphing tools. You can paste, so literally highlight all your data in Excel and paste it in. Just literally, Copy >Paste, use the keyboard shortcuts. It doesn't carry over the header, so it doesn't call sales and TV but so you have to remember what they were. So a couple things it does, when you paste in here and it only gives you like a preview of the first seven and last four rows, it hides all the rows, it sets the safe space on your screen. If you want to show all the rows, you put the show all and it also uses variables for your names. So it only knows variables as letters. It doesn't, if you give the word sales, so if you come in here and type sales, it'll break, just like I'll show you. If you come in and type sales, it's like, that's nice, well, sales. It's like I don't know what that means. But if I just had s or something, it's happy with that, we can live with that. So you can rename these variables, from the defaults, whatever you want them to be. x is just a real number, that's not very useful. But y, z, u, so you get a scatter plot. You can also click on the colors, just to see a couple things. If you want to see one at a time, one of the data series at a time that helps you visualize the data. You can zoom in, you can zoom out, you can resize the graph. So pick the right window, play the role of the photographer and capture the right picture. And so let me just show you how to. So let's do the same thing we did before. So I'm going to turn off radio and newspaper and we're just going to work with the first and last column. This is our TV and sales. So let's graph. Let's do this. So here we go. So I want to do a linear regression on sales. So, the way you do this, Desmos is you do s, s is all these variables. Use a Tilda. So the Tilda as the LME that was that already to the right ,but Tilda is what you use when you want to say I want to get a linear regression. So you can call it a variable you want m. We're going to do y1, so y1. So,you have to type all this in plus so, you're creating the variables mx + b here. And once you start typing these in, it'll find it's immediate right? As soon as you type it. If I stub for you 0.0473566 it finds B Your intercept. And it does you a favor for you, it computes r squared and it computes r for you. And it gives it to you for free. And now you can see that this matches the other two but you have to type this out. And the key here is if you put equals it doesn't know what's going on. It thinks this is an equation. So you really have to put the Tilda, when you try to find regressions. It graphs it for you as well, you can see the line down here. And if you want to see that line, you can turn the line on and off, you can resume the picture. So it gives you all the same information and sometimes just depending on the tool you're working with, you might want one or the other one. But you can get all this for free. If you want to do sales in terms of newspaper that was a third column Which decimals tells you, I'll change it to you. And you can see r squared and r and all the parameters that go with it. This is just two of many, many software that gets you these numbers. The formulas behind these are pretty rough, I would never want you to do it for 189 more rows, for 200 rows. But you should know the formula we had it before but let the computer do all of it. Your job is to interpret the numbers and make sure that they make sense and then report back and understand the relationship. All right, so that was Excel. This is Desmos. Have fun, make some pretty charts and interpret your numbers correctly. Great job on this video, I'll see you next time.