Hi. In this video, we'll finally describe, implement, and analyze Knuth-Morris-Pratt algorithm for the exact pattern matching. Remember that exact pattern matching problem is a problem where we need to search for the pattern in the text and find all the occurrences. Before applying any algorithms, we will apply a clever trick, which is actually a pretty standard trick for problems about strings. We'll create a new string, which starts with the pattern, then we insert a special character called dollar. This special character is actually any character that is guaranteed to be not present neither in the pattern nor in the text. After that, we can concatenate all that with the text. Out of pattern and text, we create one string and we insert a special dollar character between them. Now we'll work with this string and we will compute the prefix function for this whole string. They state that to search for the pattern P in text T, after we computed prefix function s, we just need to consider all the positions i to the right from the dollar. If at some such position, the value of the prefix function is equal to the length of the pattern, it means that there is an occurrence of the pattern in the text. Well, why is that? Because if the prefix function is equal to the length of the pattern, it means that there is some suffix ending in this position, which is equal to the prefix of the string S with length equal to the length of the pattern, but prefix of the string S with length equal to the length of the pattern is equal to the pattern. It means that there is some suffix that is equal to the pattern. The suffix is guaranteed to be inside the text T because it's to the right from the dollar, and so there is an occurrence of the pattern in the string T. Now we only need to compute the position where it starts. If i is some position to the right from the dollar where the value of the prefix function is equal to the length of the pattern, then it means that this is the last position of the occurrence of the pattern in the text. To compute the first position, we need to subtract length of the pattern minus 1, but that would be the position in the string S. To get the position in the string T, we also need to subtract the length of the pattern and also one for the dollar. In total, we need to first subtract length of the pattern minus 1, then one more for the dollar and then length of the pattern for the pattern in the beginning of sting S. It turns out that in total, we need to subtract two times length of the pattern from the position to get the start of the pattern in the text. To understand why these are all the occurrences of the pattern in the text, let's analyze which values can prefix function take on the string S. First, we state that for all positions, the value of the prefix function is at most length of the pattern. Why is that? Because it's for some position, many of the prefix function is bigger than the length of the pattern and then the corresponding border would include not just the pattern, but also the dollar sign after it. But the dollar sign is guaranteed to be not present in pattern and in the text. It is present in the string S only once, and so it cannot occur at the border because that way it would occur both in the prefix border and in the suffix border. It would occur at least twice. For all positions i, values of the prefix function are less than or equal to the length of the pattern. If value of the prefix function is equal to the length of the pattern, we've already explained that it means that there is an occurrence of the pattern in the text and we can compute the position in the text as position i in the string S minus 2 times length of the pattern. But if the value of the prefix function is strictly less than the width of the pattern, then it means that there is no full occurrence of the pattern in the text ending in position i, because otherwise if there was such full occurrence, the corresponding border would have length of the pattern and then the value of the prefix function would be equal to the length of the pattern. If it is less, then there is no occurrence of the pattern ending in position i. Of course, every occurrence of the pattern S ends in some position and it also ends in some position in the string S. This way, we will find all the occurrences of the pattern in the text. Now let's implement this algorithm. Find all occurrences of the pattern P in the text T. First, we create string S by concatenating the pattern, the dollar sign, and the text T, then we compute the prefix function for this string S, then we initialize our resulting list of positions with an empty list, and then we go through all the positions to the right from the dollar sign, that is from position length of the pattern plus 1 to position length of the whole string minus 1. If the corresponding value of the prefix function is equal to the length of the pattern, we append i minus 2 times length of the pattern to the result and in the end return the result, which contains all the positions in the text where the pattern occurs, where it starts in the text. The lemma states that the running time of this algorithm, which is the Knuth-Morris-Pratt algorithm for exact pattern matching, is proportional to the length of the pattern plus length of the text, which is much, much better than the running time of the brute force algorithm, which was length of the pattern times length of the text. For example, for text of length billion and pattern of length 1,000, the brute force algorithm would need a trillion comparisons and the Knuth-Morris-Pratt would just make a billion comparisons. Billion comparisons can run on modern computers in maybe a second to 10 seconds, while a trillion would take probably around an hour. Building a string S out of the pattern and the text and the dollar sign is of course proportional to the sum of their lengths. Computing the prefix function is proportional to the length of the string S, which is proportional to the sum of the length of the pattern in the text. The last for-loop also runs for big O of the length of the string S, which is proportional to sum of the length of the pattern in the text. All in all, the running time of Knuth-Morris-Pratt algorithm is linear in terms of length of the pattern and length of the text. To conclude, now we can search for pattern in text in linear time. We can compute prefix function of any string in linear time and we can also enumerate all borders of a string using the values of the prefix function.