So in this lesson, I'm going to discuss filter methods for feature selection. Suppose, for example, that we have two classes, duck and swan. Filter methods measure some kind of uncertainty, distance or similarity, dependence, or consistency between the classes. They are independent of the data analytics task, and therefore can be applied to several different tasks. For example, in this class we're going to see several data analytics tasks to analyze the data, among which prediction and clustering are going to be the main ones, and you can perform the feature selection independently from these tasks. Filter methods are simpler than the wrapper methods, and consequently they are more efficient: they take less time and fewer resources to execute. As an example of feature selection between two classes: for each attribute, you can calculate a t-test to know whether the average values of that feature in the two classes differ significantly. When we talk about a significant difference, we refer to a certain p-value. We calculate the t-test and its associated p-value, and if the resulting p-value is less than 0.05, we say that the class means differ significantly for that feature. Of course, this requires numeric features. Another possibility with numeric features is to calculate a correlation coefficient between a feature and the class, particularly if the class is numeric. Or you can use a Chi-square test when you have a nominal feature and a nominal class. And there are many other methods like these. What is interesting is that some of these methods can actually provide a ranking of the features, where the features are ranked by how strongly they differ between the two classes. And you can also extend this beyond two classes, to more than two classes.
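As a minimal sketch of the t-test idea above, the snippet below computes a two-sample (Welch's) t statistic per feature by hand and ranks features by its absolute value; the synthetic data, the `t_statistic` helper, and the two-feature setup are all illustrative inventions. In practice you would use a library routine such as `scipy.stats.ttest_ind` to also get the p-value for the 0.05 threshold.

```python
import numpy as np

def t_statistic(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se

# toy data: rows are samples, columns are numeric features;
# feature 0 has shifted means between classes, feature 1 is pure noise
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(30, 2))
class_b = rng.normal(loc=[3.0, 0.0], scale=1.0, size=(30, 2))

# rank features by |t|: larger means a bigger mean difference between classes
t_vals = [abs(t_statistic(class_a[:, j], class_b[:, j])) for j in range(2)]
ranking = sorted(range(2), key=lambda j: -t_vals[j])
```

Here `ranking[0]` is the feature whose class means differ most strongly relative to the within-class variance, which matches the idea of ranking features by their differences between the two classes.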
A method that is often used is the Between Sum of Squares / Within Sum of Squares, or BSS/WSS, method, originally proposed by Dudoit et al. in 2001. Suppose we have two classes A and B, with na samples in A and nb samples in B; let m be the overall mean of a feature, ma its mean in A, and mb its mean in B. BSS is a measure of separation between the classes: you calculate it as na(ma − m)² + nb(mb − m)². Because m is the global overall mean, this measures the distance between the classes. WSS is a measure of cohesion within each class: I sum the squared distances from the class mean inside class A, do the same for class B, and add them together, and that gives an evaluation of the spread within each class. BSS/WSS is the ratio of the between-groups to the within-groups sum of squares. And what is interesting in this method is that it can be used to rank features by their discriminating power between the two classes. So it's going to tell you: number one, this is the most discriminating feature between these two classes; number two is the second one; number three is the third one. So this is an interesting method. And there are, again, many other filter methods.
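The BSS/WSS score described above can be sketched directly from its definition; the function name and the toy data below are my own, not from Dudoit et al., and the loop over classes means the same code also covers the more-than-two-classes extension mentioned earlier.

```python
import numpy as np

def bss_wss_scores(X, y):
    """Per-feature BSS/WSS ratio: between-class over within-class sum of squares."""
    overall_mean = X.mean(axis=0)                 # m, per feature
    bss = np.zeros(X.shape[1])
    wss = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]                            # samples in class c
        mc = Xc.mean(axis=0)                      # class mean (ma or mb)
        bss += len(Xc) * (mc - overall_mean) ** 2  # n_c (m_c - m)^2
        wss += ((Xc - mc) ** 2).sum(axis=0)        # squared distances to m_c
    return bss / wss

# toy data: feature 0 separates the two classes, feature 1 does not
X = np.array([[1.0, 5.0], [1.2, 4.0], [0.9, 6.0],
              [4.0, 5.5], [4.2, 4.5], [3.9, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

scores = bss_wss_scores(X, y)
ranking = np.argsort(scores)[::-1]  # most discriminating feature first
```

A high BSS/WSS means the class means are far apart relative to the spread inside each class, so sorting features by this score gives exactly the number-one, number-two, number-three ranking described in the lecture.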