[MUSIC] Support and confidence are two parameters that show how interesting an association rule is. People sometimes use other criteria to evaluate the quality of an association rule, but these two are the most typical ones. So, let's fix a minimal support threshold; let's call it minsupp. We'll say that a set A is frequent if its support is at least equal to minsupp. We'll also say that an association rule A implies B is frequent if its support is equal to or greater than minsupp. So, we'll fix this minsupp threshold, and we'll also fix a threshold for confidence: minimal confidence, minconf. The association rule mining problem can then be stated as follows: find all association rules with support and confidence greater than or equal to minsupp and minconf, respectively. Well, as I said, other criteria than support and confidence are sometimes used, and then the problem statement is a bit different, but this is the most typical problem statement for association rule mining.

And here is how it's usually solved. First, we find all frequent sets: all attribute sets with support at least equal to minsupp. Then, for each such set, we generate a number of association rules, but output only those that have high enough confidence. So, we'll output all association rules of the form A implies B whose confidence is at least minconf.

Well, this is the general schema, but sometimes variations are possible. First, do we really have to find all frequent sets explicitly? Well, not really, because if a set is frequent, then every subset of it is also frequent. That's the main property of frequent sets on which the algorithms are based. So, why does this hold? Well, if a set A is frequent, then it is contained in a large number of objects. But then, if you take any of its subsets, that subset is going to be contained in the same objects, and maybe in some additional objects.
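The two-phase scheme just described can be sketched in a few lines of Python. This is a minimal brute-force illustration on a made-up toy dataset, not an efficient miner; the transaction data and all names here are invented for the example.

```python
from itertools import combinations

# Toy dataset (invented for illustration): each row is the attribute set of one object.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

def support(itemset, transactions):
    """Number of objects containing every attribute of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(a, b, transactions):
    """conf(A -> B) = supp(A union B) / supp(A)."""
    return support(a | b, transactions) / support(a, transactions)

minsupp, minconf = 2, 0.6

# Phase 1: find all frequent attribute sets (brute force over all subsets).
items = sorted(set().union(*transactions))
frequent = [set(c) for k in range(1, len(items) + 1)
            for c in combinations(items, k)
            if support(set(c), transactions) >= minsupp]

# Phase 2: from each frequent set, output rules A -> B with confidence >= minconf.
rules = []
for s in frequent:
    for k in range(1, len(s)):
        for a in combinations(sorted(s), k):
            a = set(a)
            b = s - a
            if confidence(a, b, transactions) >= minconf:
                rules.append((a, b))
```

On this toy dataset every pair of attributes occurs together in exactly two transactions, so all six directed rules between pairs pass both thresholds. The brute-force subset enumeration in phase 1 is exactly what the pruning property discussed next lets real algorithms avoid.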
So, its support can only be bigger than or equal to the support of the set A. Subsets of frequent sets are frequent, and therefore it's sufficient to find only the maximal frequent sets. Once you have found the maximal frequent sets, you can compute all frequent sets from them. Well, either way, you have to compute frequent sets or maximal frequent sets, and this is probably the most computationally intensive part. So, there are lots of algorithms that find all frequent sets or all maximal frequent sets. Apriori is probably the most famous of them, but there are more efficient algorithms, such as LCM or FP-Growth. And some of the FCA algorithms can also be adapted to do this. Actually, FCA has more to say about all this.

So now, we have to find all association rules. Remember that, for implications, we computed the canonical basis: a small set of implications from which all other implications follow. Can we do something similar for association rules? Yes, something similar. First of all, let's talk a little bit about frequent sets that are closed: frequent closed sets. Let's look at a set A double prime; that's a closed set. Its support is, by definition, the size of A triple prime. But the size of A triple prime is the same as the size of A prime, because A triple prime equals A prime, and this is precisely the support of A. So, if you have access to the closure operator, the double prime, it's sufficient to know the support of every closed set in order to know the support of every set. And therefore, it's sufficient to consider only frequent closed sets, and some of the most efficient algorithms use this fact and compute only frequent closed sets.

Now, let's look at the support of an association rule A implies B. This is, by definition, the support of A union B.
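The prime operators and the claim that supp(A) = supp(A″) can be checked directly on a tiny formal context. This is a sketch on invented data: the object names g1–g3 and attributes a, b, c are made up for the example.

```python
# Toy formal context (invented): object -> its set of attributes.
context = {
    "g1": {"a", "b"},
    "g2": {"a", "b", "c"},
    "g3": {"a", "c"},
}

ALL_ATTRS = set().union(*context.values())

def prime_attrs(A):
    """A' : the set of objects whose attribute sets contain all of A."""
    return {g for g, attrs in context.items() if A <= attrs}

def prime_objs(G):
    """G' : attributes shared by all objects in G (all attributes if G is empty)."""
    return set.intersection(*(context[g] for g in G)) if G else set(ALL_ATTRS)

def closure(A):
    """A'' : the closure of attribute set A."""
    return prime_objs(prime_attrs(A))

A = {"b"}
# supp(A) = |A'|, and A''' = A', so supp(A'') = |A'''| = |A'| = supp(A).
assert prime_attrs(closure(A)) == prime_attrs(A)
assert len(prime_attrs(closure(A))) == len(prime_attrs(A))
```

Here {b}′ = {g1, g2} and {b}″ = {a, b}, so the closure {a, b} is supported by exactly the same two objects as {b}, which is the fact that lets an algorithm store supports for closed sets only.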
And now, by this property, this is the same as the support of A union B double prime, and this is precisely the support of the association rule A double prime implies A union B double prime. So, if we have access to the closure operator and we know the support of every association rule between two closed sets, between A and B such that A and B are closed, then we can compute the support of any association rule whatsoever. This means that we don't really need to store all association rules, if we have access to the closure operator and if we have enough computational power to compute closures.

What about confidence? Well, the confidence of A implies B, which is, by definition, the support of A union B divided by the support of A, is also going to be the confidence of A double prime implies A union B double prime, and it's easy to check this using the definitions. This makes it possible to define a sort of basis for association rules. This basis will include the canonical basis for implications, in other words, for association rules with confidence one. And it will also include confident rules, that is, rules with high confidence, with confidence above minconf, among neighboring frequent intents, that is, intents that are neighbors in the lattice of frequent closed sets. This is sometimes called the Luxenburger basis, although the real Luxenburger basis is a bit more sophisticated than that; it's actually smaller. Still, we'll use this term: we'll call it the Luxenburger basis. And in the next video, we'll see how to use it to derive supports and confidences of association rules from the supports and confidences of the rules in the Luxenburger basis. [MUSIC]
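The two invariance facts, supp(A implies B) = supp(A″ implies (A∪B)″) and conf(A implies B) = conf(A″ implies (A∪B)″), can be verified numerically on a small invented context (the rows and attribute names below are made up for the example; this is a sanity check, not the Luxenburger basis construction itself).

```python
# Toy context (invented): each row is one object's attribute set.
rows = [
    {"a", "b"},
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c"},
]

def supp(X):
    """Support of an attribute set: number of rows containing it."""
    return sum(1 for r in rows if X <= r)

def closure(X):
    """X'' : attributes common to all objects that contain X."""
    objs = [r for r in rows if X <= r]
    return set.intersection(*objs) if objs else set().union(*rows)

def rule_supp(A, B):
    """supp(A -> B) = supp(A union B)."""
    return supp(A | B)

def rule_conf(A, B):
    """conf(A -> B) = supp(A union B) / supp(A)."""
    return supp(A | B) / supp(A)

A, B = {"a"}, {"b"}
A2, AB2 = closure(A), closure(A | B)
# Both support and confidence are preserved when we pass to closed sets.
assert rule_supp(A, B) == rule_supp(A2, AB2)
assert rule_conf(A, B) == rule_conf(A2, AB2)
```

Since A is a subset of A union B, we always have A″ contained in (A∪B)″, so the rule between the two closures is well defined, and storing rules between closed sets only is enough to recover every rule's support and confidence.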