[MUSIC] At this stage, before presenting histograms or ogives that summarize and describe numerical data, we need to consider a frequency distribution for numerical data. Like the frequency distribution for categorical data, a frequency distribution for numerical data is a table that summarizes data. Each class is listed in the left column, and the number of observations in each class is in the right column. However, for numerical data there is no straightforward rule for identifying the classes, or intervals, of a frequency distribution. We can follow a scheme that helps a lot in this process. That scheme provides us with some general rules that we should use.

First, we need to decide how many classes should be used and the width of each class. As for rule one, which refers to the number of classes, the number of classes used in a frequency distribution is usually decided somewhat arbitrarily. However, experience and practice can be a useful guide in that choice. Experience tells us that, while larger data sets require more classes, smaller data sets require fewer classes. Also, if we select too few classes, we risk losing important characteristics that may emerge from the data. By contrast, if we select too many classes, we may find that some intervals contain no observations, or have a very small frequency.

The second rule refers to the width of each class. The class width is equal to the largest data value minus the smallest data value, divided by the number of classes. The class width must always be rounded upward; this makes sure that all of the observations are included in the frequency distribution table.

Finally, according to the third rule, classes must be inclusive and non-overlapping. Each observation must belong to one and only one class.
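The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed method: the function names `class_width` and `frequency_distribution` and the `ages` data are hypothetical, the values are assumed to be whole numbers that are not all identical, and the width is rounded up to the next whole unit.

```python
import math

def class_width(values, num_classes):
    # (largest value - smallest value) / number of classes, rounded upward.
    return math.ceil((max(values) - min(values)) / num_classes)

def frequency_distribution(values, num_classes):
    # Returns one ((lower, upper), count) row per class.
    width = class_width(values, num_classes)
    start = min(values)
    counts = [0] * num_classes
    for v in values:
        # Classes are treated as [lower, upper), so a value that sits on a
        # boundary belongs to exactly one class (the higher one).
        # The clamp keeps the largest value in the last class when the
        # range divides evenly by the number of classes.
        index = min((v - start) // width, num_classes - 1)
        counts[index] += 1
    return [((start + i * width, start + (i + 1) * width), counts[i])
            for i in range(num_classes)]

# Hypothetical ages, for illustration only.
ages = [22, 25, 30, 31, 38, 40, 41, 47, 52, 59]
table = frequency_distribution(ages, 4)
for (lower, upper), count in table:
    print(f"{lower} to under {upper}: {count}")
```

With these ten made-up ages and four classes, the raw width is 37 / 4 = 9.25, which rounds up to 10, so the classes run from 22 to under 32, 32 to under 42, and so on.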
To see why classes must not overlap, consider a frequency distribution for the ages of a group of people. If the frequency distribution contains classes like age 20 to age 30 and age 30 to age 40, it will be difficult for us to allocate people who are 30 years old. Therefore, we understand why the boundaries of each class must be clear and well defined, so as to support a clear understanding and interpretation of the data.

In the previous section we have seen a frequency distribution. We have also studied what a relative frequency distribution is. And now we look at two special frequency distributions, the cumulative frequency distribution and the relative cumulative frequency distribution. A cumulative frequency distribution contains the total number of observations whose values are less than the upper limit of each class. In order to build a cumulative frequency distribution, we add the frequencies of all the classes up to and including the present class. In a relative cumulative frequency distribution, cumulative frequencies are expressed as cumulative proportions or percents.

In order to visualize frequency distributions, we need to graph this information. In this section, we discuss two graphs: histograms and ogives. A histogram is a graph that consists of vertical bars constructed on a horizontal line that is marked off with intervals for the variable being displayed. The intervals correspond to the classes of the frequency distribution table. The height of each bar is proportional to the number of observations in that interval. The number of observations can be displayed above the bars.

At this stage, we can understand the main difference between a bar chart and a histogram. In a bar chart, each column represents a group of a categorical variable, while in a histogram each column represents a group of a quantitative variable. Because of this difference, it is always appropriate to talk about the skewness of a histogram.
Skewness refers to the tendency of the observations to fall more on the low end or the high end of the x-axis. With a bar chart, however, the x-axis does not have a low end or a high end, because on the x-axis we have categorical, non-quantitative attributes. This is why it is not appropriate to comment on the skewness of a bar chart. As we will see in a while, the histogram is instead the appropriate way to look at the shape.

An ogive, also called a cumulative line graph, is a line that connects points that are the cumulative percent of observations below the upper limit of each interval in a cumulative frequency distribution. [MUSIC]
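As a rough sketch of how the cumulative distributions and the points of an ogive can be computed, consider the short Python example below. The class limits and frequencies are hypothetical numbers chosen only for illustration, and the variable names are assumptions of this sketch.

```python
from itertools import accumulate

# Hypothetical class upper limits and class frequencies, for illustration only.
upper_limits = [32, 42, 52, 62]
frequencies = [4, 3, 1, 2]

# Cumulative frequency: running total of the class frequencies,
# up to and including the current class.
cumulative = list(accumulate(frequencies))

# Relative cumulative frequency, expressed as a percent of all observations.
total = sum(frequencies)
cumulative_percent = [100 * c / total for c in cumulative]

# Ogive points: (upper limit of the class, cumulative percent below that limit).
ogive_points = list(zip(upper_limits, cumulative_percent))

for limit, percent in ogive_points:
    print(f"below {limit}: {percent:.0f}%")
```

Connecting these points with a line, with the upper limits on the x-axis and the cumulative percent on the y-axis, gives the ogive. The last point always reaches 100 percent, because every observation lies below the upper limit of the last class.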