Now let me explain three kinds of graphs. First, let me show you how to draw a scatter plot. By the way, when do you use a scatter plot? We are scattering data points on a two-dimensional space. Surely you can scatter many data points on a three-dimensional space, but at this time I'm introducing a scatter plot drawn on a two-dimensional space. So you are scattering many data points on a two-dimensional space in order to find a pattern existing in the data. You are now familiar with the first line: a canvas is created, fig and ax. There's only one ax, which means you are using the whole canvas for drawing one figure. markersize is 80; I'm going to explain where this is used soon. We are adding scatter plots using the scatter function here. We are adding three scatter graphs. The first one comes from setosa. Because we are placing the setosa data points on a two-dimensional space, we need to choose two variables: petal length and petal width. The setosa data set is a DataFrame, so you can take a variable from it by using this syntax: setosa, then petal length in brackets. This is what I already explained when I explained the pandas DataFrame. The marker is a star; it will be used for this setosa kind. And here is markersize: the marker size parameter is s, and s is markersize, which is 80. The color, c, is red, and the label is setosa. The next line is the same as before except for the marker, color and label. In the third one, the marker and color are different again. I'm using R, G and B because there are three kinds. For edgecolor, at this time you are adding a black edge to this marker, and the label is virginica, right? And the other command lines are familiar to you: set the x label, set the y label, and place the legend at the upper right or upper left. Let me run this one, and then we see this plot. Setosa is grouped down here, versicolor is located here, and virginica is up here. If you are classifying irises into one of the three groups, you can easily identify them.
If you use petal length and width information, you can perfectly separate setosa from the other two groups. In the case of versicolor and virginica, there is some overlapping area, but at this time the overlapping area is small, so it is relatively easy to separate one group from the other. So for classification it is better to use petal information rather than sepal information. Later, I will show you how much the sepal information overlaps. And if you look carefully, we used edgecolor only for virginica. That's why there are black edge colors here. If you do not use edgecolor, there's no color at the border of each marker. Surely you can use other marker types. If you use the greater-than sign, the triangle shape is different: it is a 90-degree rotated triangle, right? Surely, if you use the less-than sign, the triangle is different again. And instead of plus, you can use a dot in order to get a dot mark. So depending on your preference, you can change the color and marker shape, and you can add edgecolor or omit edgecolor.

The other case is using a for loop. In the previous case, we added three scatter functions. Surely we can use fig, ax and a for loop in order to draw the scatter graphs sequentially. So here are name and kind, from the lists that I already created. We are taking one item at a time from each of the two lists, and it is plugged in here: kind with petal length, kind with petal width, and the marker. But at this time, if you look at the outcome, what is the difference between this one and the one above? The marker is specified just one time, right? So the star marker is used for every kind, and the marker edge color is also used: black, so they are all wrapped by black lines. The label information is taken from the name list, and the legend is also added. So if you use a for loop you can reduce the coding space. That is the benefit, but there's a cost: you cannot use a different marker for each kind. So there's a little bit of constraint. So this is one way of creating a scatter plot using a for loop.
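The two approaches described above, three separate scatter calls versus one for loop, can be sketched roughly as follows. This is only an illustration under assumptions: the real iris DataFrame is replaced by random stand-in values stored in plain dicts (indexed like a DataFrame column), the column names petal_length and petal_width are assumed, and the markers and colors for versicolor and virginica are guesses, since the lecture does not name them.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption): random petal measurements, not the real iris values.
rng = np.random.default_rng(0)

def fake_species(length_mean, width_mean, n=50):
    """Return a dict that is indexed like a DataFrame: data['petal_length']."""
    return {
        "petal_length": rng.normal(length_mean, 0.3, n),
        "petal_width": rng.normal(width_mean, 0.15, n),
    }

setosa = fake_species(1.5, 0.25)
versicolor = fake_species(4.3, 1.3)
virginica = fake_species(5.5, 2.0)
markersize = 80

# Variant 1: one scatter call per kind, so marker, color and edge can differ.
fig1, ax1 = plt.subplots()
ax1.scatter(setosa["petal_length"], setosa["petal_width"],
            marker="*", s=markersize, c="r", label="setosa")
ax1.scatter(versicolor["petal_length"], versicolor["petal_width"],
            marker=">", s=markersize, c="g", label="versicolor")  # marker is a guess
ax1.scatter(virginica["petal_length"], virginica["petal_width"],
            marker="o", s=markersize, c="b", edgecolors="k",      # marker is a guess
            label="virginica")
ax1.set_xlabel("petal length")
ax1.set_ylabel("petal width")
ax1.legend(loc="upper left")

# Variant 2: a for loop over two paired lists; shorter, but all kinds share one marker.
names = ["setosa", "versicolor", "virginica"]
kinds = [setosa, versicolor, virginica]
fig2, ax2 = plt.subplots()
for name, kind in zip(names, kinds):
    # No color is given, so Matplotlib picks the next color from its cycle.
    ax2.scatter(kind["petal_length"], kind["petal_width"],
                marker="*", s=markersize, edgecolors="k", label=name)
ax2.set_xlabel("petal length")
ax2.set_ylabel("petal width")
ax2.legend(loc="upper left")
```

In an interactive session you would drop the Agg line and call plt.show() to see both figures.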
And now, let me explain the box plot. We are also using the iris data set. The first line, I don't need to explain. At this time you're using the boxplot function, and we are taking four variables from the iris data set: sepal length, sepal width, petal length and petal width. After taking the four variables, they are converted into an np array. Sometimes you don't need to convert this DataFrame to an np array. Sometimes it works without converting to an np array, but in other cases it does not work; that's why I added np array here. sym is blue and dot: blue color and dot, and this one is used for presenting outliers. showmeans means the mean values will be presented. Now let me briefly explain what a box plot is. A box plot shows the distribution of variables. Take the distribution of sepal length: the orange or yellow line in the middle is the median value, the median value of each variable. And the green triangle is the mean, from showmeans. So in the case of sepal length, the median value is almost the same as the mean value. In the case of petal length, the mean value is quite different from the median value. The median value is the second quartile value; the second quartile is the 50% point. Counting from the smallest, the lower cap is the minimum and the upper cap is the maximum. Actually it is not the maximum of the variable, but a value determined by the box plot definition. So the lowest bar here is the minimum value. And this one is the first quartile value, the lower part of the box. Here is a box, and the base of the rectangle represents the first quartile. This orange or yellow bar is the second quartile, the median value. The upper line is the third quartile, which means the 75% point. And the length between the first quartile and the third quartile is called the interquartile range. This interquartile range is used to determine the maximum value and the minimum value. The maximum value is actually determined by the interquartile range multiplied by 1.5, starting from the third quartile.
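The box plot call and the 1.5 times interquartile range rule described above can be sketched as below. This is an illustration under assumptions: the four iris columns are replaced by random stand-in values, and whisker_limits is a hypothetical helper written here to reproduce the rule by hand (whiskers stop at the farthest data point that still lies within 1.5 times the IQR of the box).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the four iris measurements (assumption: random values, not real data).
rng = np.random.default_rng(0)
columns = ["sepal length", "sepal width", "petal length", "petal width"]
data = np.column_stack([
    rng.normal(5.8, 0.8, 150),
    rng.normal(3.0, 0.4, 150),
    rng.normal(3.8, 1.7, 150),
    rng.normal(1.2, 0.7, 150),
])

fig, ax = plt.subplots()
# sym="b." draws outliers as blue dots; showmeans=True adds the mean marker.
box = ax.boxplot(data, sym="b.", showmeans=True)
ax.set_xticklabels(columns)

def whisker_limits(x, whis=1.5):
    """Hypothetical helper: whisker ends under the 1.5 * IQR rule."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    # The whisker ends at an actual data point inside the fences, which is
    # why it can be much shorter than 1.5 * IQR.
    inside = x[(x >= low_fence) & (x <= high_fence)]
    return inside.min(), inside.max()

limits = [whisker_limits(data[:, i]) for i in range(data.shape[1])]
for name, (low, high) in zip(columns, limits):
    print(f"{name}: whiskers from {low:.2f} to {high:.2f}")
```

Points outside the two fences are the ones the plot draws as outliers.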
But you may think, no, no, no. The length of this line, which is called the whisker, should be longer than the interquartile range. That's the definition of this whisker. But at this time, it is far shorter than this one. Why does that happen? Because 1.5 times this IQR, the interquartile range, would probably be somewhere here. But within that range, the maximum value is here. That's why the limit is here. Look at the other case here: for sepal width, the interquartile range is short. So at this time the limits are the interquartile range times 1.5, up and down. And those data points, the blue data points beyond this whisker limit, are called outliers. In the other three cases there are no outliers as defined by this boxplot function. So only in the sepal width case are there a few outliers, and this sym parameter is used for presenting outliers. If you want to use a different marker, you can choose, for example, this one; then a triangle is used. So you can change the marker shape also. And you can also set the tick labels: sepal length, sepal width, petal length, petal width. These are used for the x tick labels. All the others are already explained.

Let me move on to the pie chart. At this time, I'm going to draw two pie charts sitting side by side. Here are the names, John and the others, four persons, and their TV watching times. I assume you understand this data; actually, I already explained it. I'm using the subplots function in order to create two subplots, and the canvas is created by this one. The assigned names are fig, and ax1 and ax2. 1 by 2 means that the pie charts sit horizontally, side by side. figsize is 8 by 4; that is the canvas size. Within this canvas, there will be two subplots, and within the tuple you are placing the two subplot names. For each subplot object name, you are attaching a pie chart: one pie chart goes to ax1 and the other pie chart to ax2. Both use the TV watching data, and one case uses the explode parameter.
As for what explode means, let me explain after showing the graph first, because it is easier to explain the parameter values that way. In this case, explode has 0, 0, 0, and only one entry has 0.1, the second one. The second person is Sally. That's why this parameter value applies to Sally's value. So Sally's slice is separated from the pie chart. Explode means separating that slice a little bit. labels is names here: John, Sally, and so on; the names are used for labeling. textprops, with fontsize, controls the font size of the names, the labels. And autopct: autopct determines how to present the data. This is familiar; I already explained .2f when explaining f-strings. The same logic is applied here. But at this time, we place percent signs before and after this parameter value. There is another percent sign here because we use a second percent sign in order to show the percent sign itself. Without it, the percent sign is not presented in the pie chart. So this is the syntax for showing the data. Actually, integer data is used here, but that integer data is converted into percent. If you add up all the values, it is 200, right? The aggregated TV watching time of the four persons is 200. Among the 200, 40 minutes come from Sally, which means 20 percent. So in the case of Sally, 20 percent is shown. And you can use a colors list in order to determine the color of each slice. If you omit this color information, then Matplotlib automatically assigns the colors. In the second case, the second pie chart, the color information is not used, so these are the colors automatically assigned by Matplotlib. And shadow means that there's a shadow here, making the slice look a little bit thick. And startangle sets the location of the slices, so 90 degrees here. With 180, Sally goes to the other side. And wedgeprops, in the second case, is a new parameter; with it, in this case, the chart looks like a doughnut.
So width determines the size of the ring in the pie chart. If you make this bigger, like 0.7, the doughnut becomes thicker. If you reduce it, the doughnut becomes thinner. So you probably adjust this parameter in order to make it look better. And then edgecolor: this is w, white; the edge here is white, right? And linewidth is the width of the white line. That is the parameter value, and the other parameters are all explained. Here's another command line: axis is equal. Without this one, the circle may not be a perfect circle. It could be an oval, but by adding this one, you make the pie chart a perfect circle. So those are the parameter values that are used in creating the two pie charts.

Now here are review questions for Matplotlib. Matplotlib was built on the NumPy array, so it is safe to use array data rather than the DataFrame type. That's what I already explained. So the histogram returns the frequency and bin values as the NumPy array data type, because Matplotlib is basically built on the NumPy array data structure.
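The two side-by-side pie charts walked through above might look roughly like this. It is a sketch under assumptions: only the names John and Sally, Sally's 40 minutes and the total of 200 come from the lecture; the other two names, the remaining values, the color list, the font size and the line width are hypothetical placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: only John, Sally, Sally's 40 and the total of 200 are from
# the lecture; everything else is a placeholder.
names = ["John", "Sally", "Person C", "Person D"]
tv_watching = [60, 40, 70, 30]

# One canvas, two subplots sitting side by side (1 row, 2 columns).
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# First pie: Sally's slice is pulled out, explicit colors, shadow.
wedges1, texts1, pcts1 = ax1.pie(
    tv_watching,
    explode=(0, 0.1, 0, 0),          # only the second slice (Sally) is separated
    labels=names,
    textprops={"fontsize": 12},      # font size of the labels (placeholder value)
    autopct="%.2f%%",                # two decimals; "%%" prints a literal percent sign
    colors=["gold", "tomato", "skyblue", "lightgreen"],
    shadow=True,
    startangle=90,
)

# Second pie: default colors, drawn as a doughnut through wedgeprops.
wedges2, texts2, pcts2 = ax2.pie(
    tv_watching,
    labels=names,
    autopct="%.2f%%",
    startangle=90,
    wedgeprops={"width": 0.7, "edgecolor": "w", "linewidth": 2},
)

# Keep both charts circular rather than oval.
ax1.axis("equal")
ax2.axis("equal")

print(pcts1[1].get_text())  # Sally's share of the total
```

A smaller width in wedgeprops leaves a larger hole, so the ring gets thinner.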