[MUSIC] We have defined a closure operator related to implications and shown that it can be used to check if an implication follows from an implication set. But how do we compute the closure under implications? When the base set is finite, there's an almost straightforward algorithm to do this. Essentially, it follows the constructive definition of the closure operator we saw earlier. To compute the closure of set X under the implications from L, the algorithm goes through each implication A --> B from L and, whenever A is a subset of X, it adds B to X. In this case, people sometimes say that the implication A --> B "fires". If X hasn't changed after a full round through all the implications, the algorithm stops: at this moment, X satisfies all the implications from L and thus is closed. On the other hand, if X has changed, if a new attribute has been added to it, the algorithm repeats the whole process, because now some of the previously considered implications A --> B might have become applicable in the sense that A has become a subset of this enlarged X. When A --> B fires, it can be removed from L, because B is already a subset of X and will remain such forever. So A --> B cannot change X any further. How long does this take? Well, we have two nested loops here, "repeat until" and "for all". The "for all" loop goes through all implications. [INAUDIBLE] the list of implications becomes smaller at each iteration of the repeat until loop, but still. So in the worst case the "for all" loop has to go through roughly as many implications as L contains in the beginning. And this is every time, at every iteration of the "repeat until" loop. Now how many iterations does the outer loop make? Well, at each iteration of the outer loop, one of the implications fires. If none of the implications fire, then the algorithm stops immediately, because the set X doesn't change. But if at least one of implications fires, then the "repeat until" loop repeats itself. So, in the worst case, exactly one of the implications fires at each iteration of the "repeat until" loop. And this means that the algorithm is quadratic in the number of implications in the worst case. Let's look at when this worst case can happen. Suppose that our attribute set is all numbers from 1 to n, where n is some natural number. And let's say that X equals 1. So X is a single-element set containing only one number, 1, and L consists of implications of the form "i --> i + 1". So 1 --> 2, 2 --> 3, and so on. So a concrete example. Let's say that n = 5. Then L is {4 --> 5, 3 --> 4, 2 --> 3, 1 --> 2}. It's actually important how these implications are ordered, because the worst case happens only if L is the list ordered in such a way that implications with larger numbers in premises come before those with smaller numbers, like here. So we first have "4 --> 5", and then "3 --> 4", and so on. Let's see what happens when we compute the closure of X under L. After the first iteration X = {1, 2}. Why? Because the first iteration goes through all implications of L. So we see "4 --> 5", but we can't apply this implication because {4} is not a subset of X: X is {1}. We can't apply "3 --> 4" for the same reason, and we can't apply "2 --> 3". But we can apply "1 --> 2", because {1} is a subset of {1}. So we'll apply this implication. We add 2 to X. And so X = {1, 2} after the first iteration of the outer loop. Then goes the second iteration, and {4} is still not a subset of {1, 2}. {3} is still not a subset of {1, 2}, but now {2} is a subset of {1, 2}. So we add 3 to X. And now X is {1, 2, 3}. At the next iteration, X is {1, 2, 3, 4}. And after the final iteration, X is {1, 2, 3, 4, 5}. Well, actually this is not the final iteration. At this point, the set L is empty but the set X has changed. So we'll have to repeat our loop. Well, this will almost do no work, because the set L of implications is now empty. But still we make now the iteration of the outer loop, X doesn't change, it's still {1, 2, 3, 4, 5}, and the algorithm stops. So here we have to make as many iterations of the outer loop as there are implications in our set L. And every iteration of the outer loop will go through all remaining implications of L. And so, if we count how many iterations of the "for all" loop we do during the entire work of the algorithm, it turns out that we have n(n + 1)/2 iterations of the "for all" loop. And each such iteration requires O(n) time, because we do some set- theoretical operations, we check if a set is a subset of another set, and this is not for free. This operation also costs time proportional to the size of the base set which is n. There are other more efficient algorithms to do this. One famous algorithm is called LinClosure. It's well known in the database community. It's a linear-time algorithm that solves the same problem. It computes the closure of X under the implications from L, and it works in time linear in the combined size of the implications in L. So if you count how many symbols you need to write out all implications in L, and that's the size of your input. The algorithm was proposed in 1979 and this is the reference to this algorithm. [MUSIC]