I have said that if we would like to design an approximate data structure for finding the top-k elements in a data stream, it is useful to first design a data structure for ApproxPointQuery. So now let me make this statement precise and give a formal reduction. I will show that if we have a data structure for exact PointQuery, then an exact find-top data structure follows. Similarly, if we have a data structure for ApproxPointQuery, then an approximate find-top data structure follows as well. I'll do this in a few steps, starting with the simplest possible case.

Suppose that we want to find the top one, the most frequent element in the data stream, so k is 1. Assume we have a data structure for exact PointQuery, and we want a find-top data structure for k equal to 1. We are given a stream S as a sequence of items i1, i2 through iN. We will maintain a data structure for PointQuery, and besides that, two other variables: the current maximum, that is, the most frequent element in the stream so far, and its frequency. We initialize them to null and zero. Then, for every position p between 1 and N in the stream, we look at the data item ip that arrives in position p and call the PointQuery data structure to get its count so far; call it f. Now we check. If the current maximum is the item we are looking at, we simply update its stored count and proceed. If we are not yet storing any item, which only happens at the very beginning of the stream, we store the item ip and set its frequency to 1. Otherwise, we compare the frequency of the item we are storing to the frequency f of the current item, as reported by PointQuery, and if the stored frequency is smaller, we replace the stored item and its frequency with ip and f. Why does this work? Because at each position in the stream, the current-maximum variable is either null or actually contains the most frequent element of the stream so far, together with its exact count.

Good, so this is a very simple reduction. Does it generalize from k equal to 1 to k bigger than 1? Yes, it does, and it works as follows. Again, we assume that we have an exact PointQuery data structure, which is a somewhat unrealistic assumption, but it is useful for our development. We are given a stream of data items. We maintain the PointQuery data structure, but now, instead of just two variables, we maintain a heap H of at most k elements, ordered by their count so far in the stream. For each position p between 1 and N, we look at the current data item ip and ask PointQuery for its frequency so far; call it f. If the heap already contains our data item, we simply update its stored count and continue. Otherwise, if the heap contains fewer than k elements, we just add our item to the heap with the current count. If the heap contains exactly k elements, we look at the minimum, the least frequent element currently stored in the heap. If that element is less frequent than the item ip we are holding, we evict the least frequent element from the heap and insert ip with the count reported by PointQuery.
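To make the reduction concrete, here is a minimal sketch in Python, under the assumption that the PointQuery structure exposes an update/estimate interface (the names ExactPointQuery, update, estimate, and top_k are mine, not part of the lecture). The heap is maintained with lazy deletion: stale entries are simply discarded when they surface at the top.

```python
import heapq
from collections import defaultdict


class ExactPointQuery:
    """Exact counter, for illustration only. Any object with the same
    update/estimate interface, in particular an approximate one, can be
    substituted without changing the top_k loop below."""

    def __init__(self):
        self._counts = defaultdict(int)

    def update(self, item):
        self._counts[item] += 1

    def estimate(self, item):
        return self._counts[item]


def top_k(stream, k, pq):
    """Reduction from top-k to (Approx)PointQuery.

    `tracked` holds the at most k candidate items with their latest reported
    counts; `heap` is a min-heap over (count, item) pairs that may contain
    stale entries, which are discarded lazily."""
    tracked = {}
    heap = []
    for item in stream:
        pq.update(item)
        f = pq.estimate(item)                  # count of `item` so far
        if item in tracked or len(tracked) < k:
            tracked[item] = f                  # track it, or refresh its count
            heapq.heappush(heap, (f, item))
        else:
            # Discard heap entries that no longer match the tracked counts.
            while heap[0][0] != tracked.get(heap[0][1], -1):
                heapq.heappop(heap)
            min_count, min_item = heap[0]
            if min_count < f:                  # evict the least frequent candidate
                heapq.heappop(heap)
                del tracked[min_item]
                tracked[item] = f
                heapq.heappush(heap, (f, item))
    # Report the tracked items, most frequent first.
    return sorted(tracked, key=tracked.get, reverse=True)


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "b", "d", "b", "a"]
    print(top_k(stream, 2, ExactPointQuery()))   # ['a', 'b']
```

Note that the loop never looks inside the counting structure; it only calls update and estimate, which is exactly why the same code will later accept an approximate structure.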
Okay, so why does this work? It is not hard to see that the following is true. For simplicity, let's assume that the set of the top-k most frequent items is unique; the argument without this assumption is very similar and is left as an exercise. In this case, for every item i in the top k, let pi denote the last position in the stream where i occurs. Notice that at position pi, our element i will be inserted into the heap if it wasn't there before (at that point its count equals its final frequency, which, under the uniqueness assumption, must exceed the count of the current heap minimum), and after that, it will never be evicted. So the reduction works.

Good, so this is the reduction we get: if we have a data structure for maintaining exact PointQuery, then we can find the top-k items quite easily. The question is why this is useful, because earlier we argued that exact PointQuery is unrealistic: we would pretty much have to store the entire stream. Well, it turns out that exactly the same reduction works for the approximate version of the problem. If we have a small-space data structure for ApproxPointQuery, and that is something that actually exists and that we will construct at the end of this lecture, then we can approximately find the top-k elements using essentially the same code. Whenever we look at the current data item ip arriving in the stream, we ask ApproxPointQuery for an estimate of its count, and use that estimate just as if it were exact. The proof is very similar: one can check that if we have an ApproxPointQuery data structure that, for every element i at every point in the stream, returns an estimate of i's count so far that is off by at most epsilon times fk, where fk is the final frequency of the k-th most frequent element, then we get a data structure for approximate top-k.

Now that we have this reduction, in the rest of the lecture we will concentrate on designing a small-space data structure for the ApproxPointQuery problem. That is, we want a small-space data structure that observes the data items in the stream and, at the end of the stream, can be queried for an approximate answer to the number of occurrences of any given data item. What we will have to specify are the actual parameters of the data structure: how much space do we use, what quality of approximation do we provide, and with what success probability does it work? This will be done in the rest of the lecture.
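For concreteness, here is a sketch of one well-known way to realize ApproxPointQuery in small space, the Count-Min sketch of Cormode and Muthukrishnan. This is offered only as an illustration of the space/accuracy/probability trade-off just described; the structure built at the end of the lecture may differ, and the guarantee below bounds the error by epsilon times the length of the stream rather than by epsilon times fk. The class name and the choice of a keyed hash per row are my own.

```python
import hashlib
import math


class CountMinSketch:
    """A small-space ApproxPointQuery structure (Count-Min sketch).
    With width about e/eps and depth about ln(1/delta), each estimate()
    overcounts the true frequency by at most eps * (stream length so far)
    with probability at least 1 - delta."""

    def __init__(self, eps, delta):
        self.width = math.ceil(math.e / eps)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _buckets(self, item):
        # One hash function per row, derived from hashing the row index
        # together with the item.
        for row in range(self.depth):
            digest = hashlib.blake2b(f"{row}|{item!r}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "little") % self.width

    def update(self, item):
        for row, col in self._buckets(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # Collisions only add counts, so the minimum over rows is the
        # least-contaminated (and still never underestimating) answer.
        return min(self.table[row][col] for row, col in self._buckets(item))


# The reduction is plug-and-play: the same top_k loop sketched earlier
# works with the approximate structure in place of the exact one, e.g.
# top_k(stream, k, CountMinSketch(eps=0.01, delta=0.01)).
```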