Regression analysis is a very popular technique within statistics. I studied econometrics at the University of Amsterdam in the Netherlands, and during my bachelor's, about six of my roughly 25 to 30 courses were on regression. That is how popular regression is.

In this series of videos, I will introduce the basics of regression. I will show you four basic steps that guide you through a regression analysis and, probably more important, help you interpret it correctly. In this video, I will introduce regression analysis, and in the next three videos, I will explain the four steps.

When is regression analysis the appropriate tool to use? Our tree diagram shows that when the Y variable is numerical and the X variable is also numerical, regression analysis is the appropriate technique.

Let's start with an example. Imagine you work at a factory that produces tea, and you work on the packaging line where the tea is put into tea bags. Sometimes the tea bags break during the packaging process and become defects. You wish to reduce the number of defective tea bags, so you brainstorm with your team about influence factors. You know that different flavors of tea are produced and that the production line is therefore stopped during the day for changeovers between types of tea. This makes you wonder: could there be a relationship between the number of production stops and the number of defective tea bags? Therefore, you collect data on both these variables.

To study the relationship, you first look at a graph of the data. Our Y variable is the number of defective bags, which is called bags in the data set, and our X variable is stops. Because both are numerical, you can make a scatter plot to visualize the relationship. Let's study the scatter plot. Do you see a relationship between stops and defects? At first sight, it looks like the number of broken tea bags increases as the number of production stops increases.
We can visualize this relationship by drawing a line through the data points. But which line is the best line? Is it this one? Or this one? Or maybe this one? Regression analysis is about finding the best-fitting straight line for your data. A line can be described mathematically by the formula Y = a + bX. Regression analysis means finding the a and the b that give you the best-fitting line.

Regression analysis consists of four steps. The first step is making a fitted line plot to visualize your data. The second step is to perform the main regression analysis. In this step, you check whether the relationship is statistically significant and, if so, how strong it is. In the third step, you perform a residual analysis. I will explain what residuals are, and you will learn that the reliability of the regression analysis depends on this step. In some cases, you perform regression analysis not only to study significance but also to make predictions. Therefore, we have added an optional fourth step: constructing a prediction interval. How to perform these four steps will be discussed in the next videos.

In summary, regression analysis is a statistical method to test whether two numerical variables are related by calculating the best-fitting line. Regression analysis consists of four steps. First, you make a fitted line plot to visualize the data. The actual calculations take place in the second step. The third step is about performing a residual analysis. And in the fourth step, you can calculate a prediction interval.
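
To make the idea of "finding the a and the b" a bit more concrete, here is a minimal Python sketch. It assumes that "best fitting" means ordinary least squares, which is the usual criterion but is not stated explicitly in this video. The data values and the helper names fit_line and sse are made up for illustration; they are not the data set or the software used in this series.

```python
# Hypothetical data, made up for illustration only (not the video's data set).
stops = [0, 1, 2, 3, 4]   # X: number of production stops
bags = [2, 3, 5, 6, 9]    # Y: number of defective tea bags


def fit_line(x, y):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Spread of x around its mean, and co-movement of x and y.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    return a, b


def sse(x, y, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = fit_line(stops, bags)
print(f"best-fitting line: bags = {a:.2f} + {b:.2f} * stops")

# Compare the fitted line with two arbitrary candidate lines.
# Under the least-squares assumption, the fitted line has the smallest SSE.
for cand_a, cand_b in [(a, b), (1.0, 2.0), (3.0, 1.0)]:
    print(cand_a, cand_b, sse(stops, bags, cand_a, cand_b))
```

The comparison at the end mirrors the question "is it this one, or this one?": each candidate line gets a score, and the line returned by fit_line is the one with the lowest score under this criterion.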