Okay, let's learn how to do pattern matching with Burrows-Wheeler Transform. Let me first summarize what we learned about pattern matching with the suffix tree. The runtime is equal to O(length of the text + total length of all patterns). Memory is, in the best implementation, known today is 20 * length of the text, or which is high for live strengths like human genome. So it would be in nuclear diclonk. So the question we will try to address in this lesson is, can we use Burrows-Wheeler Transform to design a more memory efficient linear-time algorithm for multiple pattern matching? So let's see how we can do this. Let's search for ana in panamabananas. Well, we'll definitely start by noticing that there are six rows that start from letter a. But when we look, and please notice also that we are currently matching the last symbol in ana as in the first one. This will be important. So there are six rows starting from a, but only three of them are ending in n. What we need, because we are looking for matching the last two symbols now of ana, which is na. So the mental attention to these three symbols, and using the first last property, we can figure out where these three n's hide in the first column of our Burrows-Wheeler matrix, here, here, and here. After they found where they appear in the first column, we know where na appears. In the string can be actually found, three matches of na. This is the last two symbol in ana. Let's now try to match the first symbol in ana, and we know where to look for this first symbol. We'll look for them correspondingly in the last columns of these three strings. And after we found a in these three rows, then using again the first last property, we find where these three occurrences of ana appears in the beginning of our cyclic rotation. As a result, we found three matches of ana. Let me specify some details of the algorithm that we just discussed. We will use two pointers, top and bottom, that specify the range of positions in the Burrow-Wheelers matrix that we are interested in. In the beginning, top will go to 0 and bottom equal to 13, to cover all positions in the text. In the next iteration, the range of position we are interested in is narrowed to all position where a appears in the first column. Then, what do you do afterwards? We are looking for the next symbol, which is n in ana, and we are looking for the first occurrence of this symbol in the last column among positions from top to bottom, among rows from top to bottom. And likewise, afterwards, we are looking for the last occurrence of the symbol. As soon as we found the first and last occurrence of this symbol in this case, and the first last property will tell us where this n and all n's in between are hiding in the first column. As a result, the pointers top and bottom equal to 1 and 6 are changing into 9 and 11, they narrow the search. And then we continue further, and that's how we find the positions of ana in the text. The algorithm that I just described translate in the following BWMatching pseudocode. And you can see lines in green describe what we have been doing with these top and bottom pointers. Note that we are using last to first array and given a symbol at position index in lastColumn, LastToFirst index defines the position of the symbol in the first column. So it's implement first to last property. It looks like now, finally, we are done. We have a very first pattern matching algorithm based on Burrows-Wheeler Transform, and it has good memory footprint. The only problem, though, is that BWMatching is very slow. It analyzes every symbol from top to bottom in the last column in each step. What should we do? The trick here is to introduce the count array. And the count array describes the number of appearances of a given symbol in the first i positions of the last quote. This slide shows the count array, and ardent is the count array. We can design a better version of BWMatching by substituting four green lines that we discussed before by two green lines that are using the count array. And as you can see, we don't need anymore to explore every symbol between top and bottom indices in the last column. If you are wondering about the details of transformation from the previous four lines into two lines using the count array, check our Coursera course or get details from our books that describes this transformation. So it looks like finally, after all these complications, we are done. But there is still one question we found quanza. Where are the matches that they found? Where do they appear in the text?