So the Luxenburger basis consists of the implications from the canonical basis and the confident rules between neighboring frequent intents. Why is this sufficient? Let's see how we can check whether an association rule A implies B is frequent and confident enough, that is, whether it is valid for the given minsupp and minconf. It turns out that if we have the canonical basis and the confident rules between neighboring frequent intents, that is enough to answer this question. And if A implies B turns out to be frequent, this information is also sufficient to compute its confidence. Let's see how. First of all, as a shortcut, let's denote A'' by C and (A union B)'' by D. Now, we know that the support of A implies B is the same as the support of C implies D, and the same holds for confidence: supp(A implies B) = supp(C implies D) and conf(A implies B) = conf(C implies D). So, to determine the support and confidence of A implies B, it suffices to determine the support and confidence of C implies D. First of all, let's check whether D is frequent. If it isn't frequent, then the basis contains no rule with D as its premise; if it is frequent, then there are rules corresponding to D, so we can check this using the information in the basis. Now, if D isn't frequent, then C implies D can't be frequent either, and so A implies B isn't frequent. So suppose D is frequent. Then there are two cases. The first case is that C = D. A and B may be different, but it may happen that A'' is the same as (A union B)'', which means that A implies B is a valid implication. So, if C = D, then the confidence of A implies B is 1, and since D is frequent, this association rule is also frequent, so it is valid for the given minsupp and minconf. What happens if C is not equal to D?
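The reduction from A implies B to C implies D can be sketched in code. Below is a minimal Python sketch on a small made-up context; the context and the sets A and B are illustrative assumptions, not the lecture's example:

```python
# A toy formal context: objects mapped to their attribute sets.
# (This context is a hypothetical example, not the one from the lecture.)
CONTEXT = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b"},
    "o3": {"a"},
    "o4": {"d"},
}

def extent(attrs):
    """Objects that have all the given attributes (the prime operator A')."""
    return {g for g, row in CONTEXT.items() if attrs <= row}

def closure(attrs):
    """A'': attributes common to all objects that have attrs."""
    objs = extent(attrs)
    if not objs:
        # By convention, a set with empty extent closes to all attributes.
        return set().union(*CONTEXT.values())
    return set.intersection(*(CONTEXT[g] for g in objs))

def supp(attrs):
    return len(extent(attrs))

A, B = {"b"}, {"c"}
C = closure(A)        # A''
D = closure(A | B)    # (A ∪ B)''

# Replacing A by C and A ∪ B by D changes neither support nor confidence:
assert supp(A | B) == supp(D)
assert supp(A) == supp(C)
conf_AB = supp(A | B) / supp(A)
conf_CD = supp(D) / supp(C)
assert conf_AB == conf_CD
```

The assertions at the end mirror the claim in the text: the rule A implies B and the rule C implies D between the corresponding closed sets have the same support and the same confidence.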
Well, in this case, we have D, which is a frequent intent, and C, which is a subset of D and is also an intent. So we have two frequent intents, one a subset of the other, and there must be a chain of neighboring frequent intents between them. This chain looks like this: C1 = C, each Ci is a neighbor of the next intent Ci+1, and, if there are k of them, Ck is in fact D. So we have a chain of frequent intents, each of them a neighbor of the next. For each pair in this chain we have a rule in the basis, and because of that, we can compute the confidence of A implies B simply as the product of the confidences of the rules Ci implies Ci+1, where i ranges from 1 to k-1. And we know the confidences of all these rules from this part of the Luxenburger basis. What about the support of A implies B? Well, the support of A implies B equals the support of D. Now let's look at an example. This is the concept lattice of a formal context over the attributes a, b, c, d, e, and f. We have 100 objects that have the attribute f, 50 objects that have f and e, 40 objects that have a, b, and c, and so on; that is the meaning of the labeling. Now let's say that our minimal support is 35 and our minimal confidence is 0.35. First of all, we look at the part of the lattice that satisfies this support threshold. This includes all the concepts except for the bottom one, which has no objects at all, and the one with 30 objects: 30 is less than 35, so we don't take it. What we get is the upper part of the lattice, which is, by the way, the iceberg lattice. So, in fact, the problem of computing frequent attribute sets is precisely the problem of computing an iceberg lattice. But now we will also need the edges of the diagram in this frequent part of our lattice. Right. So the Luxenburger basis consists of two parts.
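The product formula is easy to sanity-check: conf(Ci implies Ci+1) = supp(Ci+1) / supp(Ci), so the product telescopes to supp(D) / supp(C). A small Python sketch, with made-up supports along a hypothetical chain:

```python
from math import prod

# Hypothetical supports along a chain of neighboring frequent intents
# C = C1 ⊂ C2 ⊂ ... ⊂ Ck = D (the numbers are made up for illustration).
chain_supports = [300, 150, 60, 40]   # supp(C1), ..., supp(Ck)

# Confidence of each neighboring rule Ci → Ci+1:
confidences = [chain_supports[i + 1] / chain_supports[i]
               for i in range(len(chain_supports) - 1)]

# conf(A → B) = conf(C → D) is the product of these confidences,
# and the product telescopes to supp(D) / supp(C).
conf_AB = prod(confidences)
assert abs(conf_AB - chain_supports[-1] / chain_supports[0]) < 1e-12
```

This also explains why knowing only the rules between neighbors suffices: every non-neighboring pair of frequent intents is reached by multiplying along some chain.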
First, we need the canonical basis. Here, if we look at this part of the diagram, it includes the following implications: c implies ab, so if you have c, then you must have a and b, and the confidence of this rule is 1; b implies a, because a and b label the same node; and a implies b, for the same reason. And that's all. Of course, we have other implications, but they are not frequent; these are the only frequent implications from the canonical basis. Now let's look at the more interesting second part: confident rules between neighboring frequent intents. These are the rules between these two nodes, these two nodes, these two nodes, these two nodes, and this pair; we don't need this rule, because d is not frequent, and we also don't need this edge and this edge, because the bottom concept is not frequent. Right, so let's compute the confidence of each such rule. This is the rule empty set implies f. Its confidence is one third, because we have 100 objects that have f out of 300 objects in total, and one third is less than 0.35, so we don't add this rule. What about this rule? Here, 50 of the objects that have f also have the attribute e, so the confidence is one half, and one half is greater than 0.35, so we add the rule f implies e with confidence one half. What about this rule? Here we have 150 divided by 300, which gives one half, so we take this rule, empty set implies e. Here we have 50 out of 150; this is one third, and it doesn't pass our threshold. Here we have 40 divided by 100, which is 0.4, or two fifths; this is more than 0.35, so we add the rule ab implies c with confidence two fifths. So we take this one, this one, and this one, and that's it. This is the Luxenburger basis corresponding to our minsupp and minconf.
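The threshold check just performed can be replayed in a few lines of Python, using the node supports read off the lecture's lattice. The edge list below is my reading of the diagram, and each rule is written with the full intent of the larger concept as its conclusion (e.g. ab implies abc rather than ab implies c):

```python
from fractions import Fraction

MINCONF = Fraction(35, 100)   # 0.35

# Edges between neighboring frequent intents, with supports from the lattice:
# (premise, conclusion, supp(premise), supp(conclusion)).
edges = [
    ("{}", "f",   300, 100),
    ("f",  "fe",  100,  50),
    ("{}", "e",   300, 150),
    ("e",  "fe",  150,  50),
    ("{}", "ab",  300, 100),
    ("ab", "abc", 100,  40),
]

# conf(premise → conclusion) = supp(conclusion) / supp(premise);
# keep only the edges that pass the confidence threshold.
basis = [(p, c, Fraction(sc, sp)) for p, c, sp, sc in edges
         if Fraction(sc, sp) >= MINCONF]
for premise, conclusion, conf in basis:
    print(f"{premise} -> {conclusion}: {conf}")
```

With these numbers, exactly three rules survive, matching the lecture: f implies e (1/2), empty set implies e (1/2), and ab implies c (2/5).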
Now, suppose we lower our min confidence a little bit; say minconf equals 0.25, one fourth. In this case, in addition to these three association rules, we would add the rules with confidence one third: empty set implies f, with confidence one third, and e implies f, also with confidence one third, and that's it. So this is the Luxenburger basis for minsupp equal to 35 and minconf equal to one fourth: it includes all these association rules plus these two additional ones. Now, what about the association rule empty set implies c? Let's check whether it satisfies our support and confidence thresholds. We are going to use this algorithm. First, we compute the closures of the empty set and of c. The closure of the empty set is the empty set, and, since c implies ab, the closure of c is abc. So the support and confidence of empty set implies c are the same as the support and confidence of empty set implies abc. And abc is frequent; we know that it is frequent, since it has support 40, which is above 35. Good. So we are in one of these two cases. Well, C is empty and D is abc, so C is not equal to D, and we are in the second case: we have two different frequent intents, one empty and the other abc. And indeed, there is a chain that starts at the empty set and ends at abc, and it includes just one more frequent intent, ab. So we have the following chain: empty set, ab, abc. And we know the confidences of both these rules. Empty set implies ab, well, this is the rule I forgot to mention earlier. Its confidence is 100 divided by 300, so one third. This doesn't satisfy the 0.35 threshold, but it does satisfy the 0.25 threshold, so we must add the association rule empty set implies ab, with confidence one third, to the second version of our Luxenburger basis. Right. So now indeed we have this association rule.
The association rule empty set implies ab has confidence one third, and we also have the association rule ab implies abc with confidence two fifths. So we use the chain rule: we multiply one third by two fifths and get the confidence of empty set implies c. One third times two fifths is two fifteenths, and this is less than 0.25. So this association rule is not confident enough: its confidence is less than minconf, and so it is not valid for our minsupp and minconf. This is how we can compute the support and confidence of every frequent association rule in our concept lattice. And if an association rule is not frequent, or not confident enough, we can recognize that as well.
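The final arithmetic for empty set implies c can be checked directly, using exact fractions and the two confidences from the extended basis:

```python
from fractions import Fraction

MINCONF = Fraction(1, 4)

# Chain from {} to abc found in the lattice: {} ⊂ ab ⊂ abc,
# with the confidences from the extended Luxenburger basis.
conf_empty_ab = Fraction(100, 300)   # {} -> ab, i.e. 1/3
conf_ab_abc   = Fraction(40, 100)    # ab -> abc, i.e. 2/5

# Confidence of {} -> c equals the product along the chain.
conf_empty_c = conf_empty_ab * conf_ab_abc
assert conf_empty_c < MINCONF        # 2/15 < 1/4: not confident enough
```

Two fifteenths is about 0.133, well below one fourth, so the rule is rejected without ever rescanning the data.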