Hey folks, welcome back to module four: Seaborn, a package for data visualization. In this module, we're going to learn how to import Seaborn and how to plot different types of graphs using very simple Seaborn functions, so that we get a better understanding of how to visualize our data set. Okay, let's get started.

In this module, we're going to work with Seaborn and use one of its built-in data sets, tips. First we need to import the packages we need, which we have done multiple times: import numpy as np, import pandas as pd, and, from the last module, import matplotlib.pyplot as plt. This module also needs Seaborn, so we import seaborn as sns. The name sns is another naming convention: whenever you see sns, you can assume it stands for Seaborn. In addition to importing these packages, we're going to load the built-in tips data set from Seaborn into a variable called tips, which is the data we will play with.

Let's execute the imports and load tips. The result is a data frame, so calling tips.head() prints the first five rows, as we did in pandas. We can see that tips has several columns: total bill, tip, sex, smoker, day, time, and size. We can also print tips itself to see more details: there are seven columns, representing the seven attributes of each record, and 244 records. So it is a very small data set, but it is very useful for learning data visualization.

A little more explanation of the columns: the total bill is the amount the customer paid.
The tip is the extra money the customer paid for the service, sex and smoker are information about the customer, day and time tell us when the meal was ordered, and size is how many people were in the party. These are very simple attributes that a restaurant might record about its business.

Let's look at the big picture of Seaborn. Basically, we have five groups of plots. In matplotlib we used the default ones: scatter plots, bar charts, histograms, etc. In Seaborn, plots are grouped into relational, distributional, and categorical plots. We can also combine plots into a joint plot, or draw a pair plot for a pairwise comparison of the columns; those two are the advanced plots. The basic categories are relational, distributional, and categorical, and we're going to cover them one by one in this module.

The first one is relational. A relational plot is similar to what we drew with plt.plot: it shows how two columns are related. One of the reasons I like Seaborn is that it follows the same pattern everywhere for passing arguments to parameters. We consistently pass arguments by keyword, which reminds us what the function is doing and what each argument means. For the first example, rel stands for relational plot, so we call sns.relplot. It needs just a few arguments: data, which is our tips, then x, the column for the x-axis, and y, the column for the y-axis. Here we can simply give the name of the column, rather than indexing the data frame with brackets as we did in pandas. This draws a nice plot which, if you remember, is exactly a scatter plot in matplotlib.
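
The import, loading, and first relplot steps described above can be sketched roughly as follows. This is a minimal sketch, not the lecture's notebook: the helper names load_tips and plot_bill_vs_tip are made up for illustration, and it assumes the data set's column names are total_bill and tip.

```python
# Imports mirror the ones listed in the lecture; numpy and pandas are not
# strictly needed for this snippet.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def load_tips():
    """Load seaborn's built-in tips data set as a pandas DataFrame.

    Hypothetical helper name. The data is fetched online the first time
    it is requested and cached afterwards.
    """
    return sns.load_dataset("tips")


def plot_bill_vs_tip(tips):
    """Scatter total_bill against tip using keyword arguments only.

    Column names are given as plain strings; seaborn looks them up in
    the DataFrame passed as `data`.
    """
    return sns.relplot(data=tips, x="total_bill", y="tip")
```

In a notebook you would typically call tips = load_tips(), inspect it with tips.head() and tips.shape, then call plot_bill_vs_tip(tips) followed by plt.show().
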
However, the advantage of Seaborn is that when you call relplot and pass in the data, x, and y, the axis labels total bill and tip are created automatically from the column names. So if we have tabular data, Seaborn can save you a lot of time labeling each axis.

Another thing I like about Seaborn is that we can easily differentiate groups of data in the same plot. Here we have data equal to tips, x is the total bill, y is the tip, and then we add hue. Hue is how we separate the data points: hue equal to sex simply shows the two types of customer in different colors. As you can see, the total bill and tip labels are added automatically, and so is the legend: the blue dots represent male and the orange dots represent female. You only need to specify hue equal to sex; you don't have to build the legend, and you don't have to plot twice for the two types of customer. Everything is done by passing the data and one more keyword argument.

We can do this multiple times to try different separations. For sex we have only male and female, but for day we have multiple values, so how about we set hue to day? It turns out our data set has only four days: Thursday, Friday, Saturday, and Sunday. Each day gets its own color and a label in the legend, and we can see that the days show different relationships, which is quite informative. We can also set hue to something else, for example time, lunch versus dinner. We can see that dinner tends to have larger bills and larger tips, while the points for lunch and dinner sit close together when the bill and the tip are small or medium.
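
The hue examples above can be sketched with one small function. This is only an illustration: plot_with_hue is a hypothetical helper name, and the column names (total_bill, tip, sex, day, time) are assumed to match the tips data set.

```python
import seaborn as sns


def plot_with_hue(tips, hue):
    """Scatter total_bill against tip, coloring the points by one column.

    `hue` is the name of the column used to split the points into
    colored groups; seaborn adds the legend automatically.
    """
    return sns.relplot(data=tips, x="total_bill", y="tip", hue=hue)
```

With tips loaded, plot_with_hue(tips, "sex"), plot_with_hue(tips, "day"), and plot_with_hue(tips, "time") would correspond to the three separations tried above.
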
We can also use another keyword argument, col, which stands for column. If we want to separate the data into side-by-side plots for comparison, we simply set col equal to time. We keep x, y, and hue unchanged; the hue here is smoker, so the blue dots stand for smokers and the orange dots for non-smokers. Total bill and tip are now nicely separated into two plots, one for lunch and one for dinner. So with everything else staying the same, just adding col equal to time gives a very different output, and it can help you tell apart a lot of information.

We can also try size as the criterion for differentiation. We know size takes several values: one, two, three, four, five, and six. Seaborn is smart enough to detect that these values are numeric and ordered, increasing from one to six, so it automatically switches to a sequential palette, similar in style to a heat map: one is the lightest and six is the darkest. This is similar to the color maps we learned last time in matplotlib, but we don't have to specify nearly as much. We simply set hue to size, because size is an ordinal value.

In addition to hue, we can pass the size argument. If size is specified, the size of each dot scales with the value of the column, here the party size. With color still controlled by hue, one gets a smaller dot and six gets a larger dot. So the data points appear as bubbles with different sizes and colors, and everything comes with a legend, which even shows the dot size associated with each value.
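
As a rough sketch of the col, hue, and size arguments just described (the function names are invented for illustration, and the column names smoker, time, and size are assumed to match the tips data set):

```python
import seaborn as sns


def plot_by_time(tips):
    """Draw one scatter plot per value of `time`, side by side.

    `col` splits the figure into columns of subplots; `hue` still
    colors the points within each subplot.
    """
    return sns.relplot(
        data=tips, x="total_bill", y="tip", hue="smoker", col="time"
    )


def plot_party_size(tips):
    """Encode the numeric `size` column as both color and dot size.

    Because the column is numeric, seaborn picks a sequential palette
    for `hue` and scales the markers for `size`.
    """
    return sns.relplot(
        data=tips, x="total_bill", y="tip", hue="size", size="size"
    )
```

Calling plot_by_time(tips) should give two panels, lunch and dinner, while plot_party_size(tips) should give the bubble-style plot with a legend for both color and dot size.
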
You can even specify how different you want the sizes to be. For example, if you specify the smallest as 15 and the largest as 20, you control the range of dot sizes, and a wider range makes the differences much more pronounced. That is how you can make a graph both beautiful and informative.

In this plot, you can also see that some of the dots overlap. You may want some transparency so that you can see all the dots at the same time. To do that, we use alpha, which we saw in matplotlib, to specify the transparency of the plot. If we execute this, the graph changes: the transparency is now 50% for all the data points, so you can see two overlapping data points right here, and some over here as well.

So these are all the relationships we can play with just for tip and total bill. We can use other attributes for differentiation, we can separate by columns, by colors, and by sizes, and we can do a lot of things with it.

Let's try another relational plot. This time we're going to see how day and total bill are related. For Thursday, we have an ordinary range of bills, but for Saturday we see a larger range between the minimum and the maximum. You can also show this relationship with lines, day against total bill, so that we can see the trend of the total bill from Thursday to Sunday. If these days were quarters, you could see how your business is doing from one quarter to the next. The line is the mean of your data, and the light blue band around it shows the uncertainty of that mean, by default a 95% confidence interval. So you get an informative summary of your data in one picture.

Another plot we can draw is day against total bill with the data set separated using hue, as we did before.
Here we can see that male customers follow one trend and female customers follow another, so they behave differently. We can add col as well, to see how lunch and dinner are recorded and how they differ for male and female customers.

There are many other ways to draw relational plots; relplot is just the default entry point for the basic kinds. Seaborn also has more specific plotting functions, and you can add more advanced features to these relational plots. I will leave that for you to explore, and I will give you that resource as a reading document.

Okay, we're going to stop here. In the next part, we're going to introduce the distribution plots. I'll see you later.