So I think that what Shawn did with showing how Burrows-Wheeler can be applied to pattern matching was really awesome. But, I would like to point out that we are kind of back at an issue that we were having when we used the suffix tree, which is that he has been able to identify whether or not each pattern occurs. In the genomes using the Burrows-Wheeler Transform. But remember that the multiple pattern matching problem says we need to also find the positions. Okay? We need to know where it is that these, that pattern matches are happening. Okay, so the question then is can we use Burrows-Wheeler Transform to reveal these positions? If we go back to our example, we know that ana occurs three times in Panama bananas but if we're using Burrows-Wheeler and we just have the first and the last column we don't know where it is that these pattern matches are occurring. So we'll use, to help us out we're going to use a data selector called a suffix array. And this is going to hold the starting position of each suffix beginning a row. What do I mean by a suffix beginning a row? Well, each row of the Burrows-Wheeler transform corresponds to a cyclic rotation of our original string. So if it's a cyclic rotation, then that means you can walk to some point and reach the $ and that walking corresponds to a suffix. So the first row is just the $ sign, so it's suffix that ends the string and that's in position 13, so we put a 13 on that row. The next rows begins with a bananas and the a bananas is in position one, two, three, four, five. Starting position five of the original string, so we put a five in the second element of the suffix array. And we go through this. So amabananas is in position three and so on. So we get one, seven, nine, and so on. So we're reconstructing an entire suffix array, this way and so now we have the suffix array. So how is that helpful to first to find the matches? Well, if we look at the three occurrences ana that we found, we just say what are the corresponding elements of the suffix array and this tells us that ana occurs at positions one, seven, and nine of panama bananas so suffix array values are one, seven, and nine and here is starting position one, seven, and nine which we saw before. The issue though is I know what Shawn will say which is how much memory is it that the suffix array took? And this is an issue. In fact, if you represent integers as using four bytes, then the memory is going to be on the order of 4 times the length of genome. So when we used the suffix tree we had about 20 times the length of genome. Now maybe we're down to about 4 times the length of genome to store the suffix array. And that's better. In fact the suffix array is hiding in the suffix tree that we were working with before. Remember we labelled the nodes of the suffix tree with the starting position of the suffix that ended at that leaf. And if we do what's called an in order traversal of the suffix tree then we can reconstruct the suffix array. By going down to five, and then we go sort of down and to the left as much as possible, in each step. So we go down and to the left as much as we can and we hit five. Then five is taken so we go the next note over and that's three. Then, we go down and we can't go to these two notes, but we go here and then we go left So that it gives us one and so on. So if we traverse the tree in this manner and go down to the leaves, then we're able to see this correspond, that the suffix array is hiding in the suffix tree. Okay so then the question would be now that we have a data structure Burrows-Wheeler which is about the length of the genome, now we have the suffix array Which is about 4 times the length of genome, maybe. And, the question would be, can we get this down to approximately the length of genome. Can we shorten it even more, right? And that's what I hope Shawn will be able to tell us about in the next section.