Hello and welcome to secure JavaScript programming with [inaudible]. In this video, we will talk about regular expressions. Regular expressions are a very complex feature of JavaScript, and they can lead to security issues. Before we dig, in the next video, into how a regular expression can cause a denial of service, today we will look at what regular expressions are in JavaScript and why they matter. You might have noticed that a lot of the videos in this course are more about understanding the insides of JavaScript than about security issues directly. That's because the security issues hide in the details. A regular expression is an instance of the RegExp class in JavaScript, and you have two ways to define one. The first one is to use a literal expression. To build it, you start with a slash character, then you write your pattern. Let's say we want a regular expression that matches foo. Don't worry, we'll go into more detail about regular expression syntax later. Then you close the regular expression with another slash, and you can put some flags after it. The flags are tokens that tell the engine how to apply the regular expression. One of the most used flags is i, for case insensitive. You can also build a regular expression from a string, with the RegExp constructor. As the first argument we write the pattern, and as the second the flag or flags. If we console.log re and re2, even though they're two distinct objects, we see that they have the same string representation, and they will match the same things. We have two different instances of regular expressions that have the same content. How do you use a regular expression in JavaScript? Well, there are two ways to do so. Either you use the methods on the RegExp class, namely exec and test, or you use the methods on the String class. Let's start with the methods on the RegExp class.
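The two ways of defining a regular expression described above can be sketched as follows. The names re and re2, the pattern foo and the i flag come from the video; the test input "FOO" is only an assumption for illustration.

```javascript
// Literal syntax: the pattern sits between slashes, flags follow the closing slash.
const re = /foo/i;

// Constructor syntax: the pattern and the flags are passed as strings.
const re2 = new RegExp("foo", "i");

console.log(re.toString());  // "/foo/i"
console.log(re2.toString()); // "/foo/i"

// Same content, but two distinct objects.
console.log(re === re2); // false

// Both match the same things; the i flag makes the match case insensitive.
console.log(re.test("FOO"), re2.test("FOO")); // true true
```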
We have a first regular expression that matches ab, and that's all it does. If we want to use exec on the first string, which is abc, we do console.log(reg.exec(string)). To make the picture clearer, let's call it multiple times. Here we see that it matches ab every time. Let's put something else in the string and run it again. Here we see that it gives null, so exec either returns an array or it returns null. Now let's change the regular expression by adding the g flag, and we can see that the values have changed. Let me do that again. With the regular expression that just matches ab, and abab in the string, it tells us that it matches ab, so there is a match, and that the index is zero. But we know that we have ab twice, so we can use the g flag to tell the regular expression that we want to match ab multiple times in the same string. However, as you can see, the first call returns the same thing as before. The second call returns something else, because it tells us about the second match, then the third call returns null, and then it goes back to the first match. That's because regular expressions in JavaScript have state. If I do console.log(reg.lastIndex), we display the lastIndex property of the regular expression, and if we run that, we see that it moves forward up to four, and goes back to zero once exec returns null. A regular expression has a cursor, and if you call exec multiple times with the g flag, each call moves the cursor. That's why I usually don't recommend using the exec method at all, because it can lead to some very weird situations. For instance, if I use a second string, there will be no match on the second call, even though there is an ab in this string. Since the cursor is at two and the match is at index one, it doesn't work. Regular expressions have state. You can, of course, reset it by setting lastIndex back to zero, but that can be extremely complex to keep track of.
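Here is a minimal sketch of the cursor behavior described above. The name reg and the string abab follow the video; the names str and str2 and the second string "cab" (with ab at index one) are assumptions chosen to reproduce the situation.

```javascript
const reg = /ab/g;
const str = "abab"; // two occurrences of ab
const str2 = "cab"; // assumed second string: ab starts at index 1

// With the g flag, each exec call resumes from reg.lastIndex.
const first = reg.exec(str);       // match at index 0
const afterFirst = reg.lastIndex;  // 2
const second = reg.exec(str);      // match at index 2
const afterSecond = reg.lastIndex; // 4
const third = reg.exec(str);       // null: nothing left to match
const afterThird = reg.lastIndex;  // 0: the cursor is reset after a failure

// The cursor leaks from one string to the next.
reg.exec(str);                 // cursor is now at 2
const missed = reg.exec(str2); // null: the search starts at index 2 of "cab"

// Resetting the cursor by hand makes the match visible again.
reg.exec(str);                 // cursor is at 2 again
reg.lastIndex = 0;
const found = reg.exec(str2);  // match at index 1

console.log(first.index, second.index, third, missed, found.index);
```

The manual reset works, but it has to be done at every place where the same RegExp object is reused, which is why shared global regular expressions are easy to get wrong.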
Also, you've got another method that is named test. Test just returns a Boolean. Let's do something else. As you can see, I'm very imaginative with my names and my values: test, string, and now string 2. If we run this, we see that the first one returns true because this string matches the RegExp. The second one returns false because the string does not match the RegExp. That's already a lot, and now we need to move on to the methods on the String class that work with a RegExp. Just give me a second. You've got multiple methods on the String class, namely match, matchAll, search, replace, replaceAll, and split. Let's take a RegExp, and here, let's put something else. We can try both strings. If we use the first string, we do str.match(reg), and I forgot the console.log. Let's do the same with string 2. Twice. Here we've got all the abs, so the state of the RegExp doesn't change, which is logical because we are not calling a method on the RegExp class but on the String class, and here it returns null. We get an array with the list of matches. Now let's try search and see what it gives us. Search gives us the first index of the pattern. If I do this, we get three. It gives us the index of the first substring that matches the RegExp. Now, let's try replace. Replace takes a second argument, the replacement, and replaces the matched value with it. You can notice that when we use match or replace, they give a result that is similar to what we would expect from matchAll or replaceAll. That's because we used the g flag on the RegExp. If I remove the g flag, only the first occurrence is replaced. If I do match on the first string now that we've removed the g, only one match is found. So far so good. The last one is actually one of my favorite methods: split. Split returns an array by splitting the string around the substrings that match the RegExp.
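The test, match, search and replace calls described above can be sketched like this. The strings, the replacement "X" and the variable names are assumptions; the first string is chosen so that search returns three, as in the video.

```javascript
const str = "123ab45ab"; // assumed first string: ab at index 3 and 7
const str2 = "12345";    // assumed second string: no ab at all
const reg = /ab/;        // without the g flag
const regGlobal = /ab/g; // with the g flag

// test only answers yes or no.
const hasMatch = reg.test(str); // true
const noMatch = reg.test(str2); // false

// match with the g flag returns every match, and repeating the call
// gives the same result.
const all = str.match(regGlobal);   // ["ab", "ab"]
const again = str.match(regGlobal); // ["ab", "ab"]
const none = str2.match(regGlobal); // null

// Without the g flag, match only reports the first occurrence.
const firstOnly = str.match(reg); // ["ab"], with index 3

// search returns the index of the first match.
const position = str.search(reg); // 3

// replace follows the g flag: all occurrences with it, the first one without.
const replacedAll = str.replace(regGlobal, "X"); // "123X45X"
const replacedFirst = str.replace(reg, "X");     // "123X45ab"

console.log(hasMatch, noMatch, all, none, position, replacedAll, replacedFirst);
```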
Of course, split removes the pattern, so I need to put something in between. In the first case it returns only Fs, because it splits the string on the abs, and that's pretty cool. On the second one, it returns an array with a single element, because the pattern is not in the string. If we add a g, it will not change anything. This method is not impacted by the g flag, unlike the others. That's how you use a regular expression in JavaScript. But how do you build a regular expression? I will tell you a secret, but don't tell the rest of the world. I am actually a very lazy developer when it comes to regular expressions. I always go to this website, regex101.com, and I select JavaScript. That's important because different technologies have different regex coverage. Then you can build a regular expression. Before building an expression, I usually recommend writing a test string first. Now we want to build a regular expression, for instance one that will capture the word hello. First of all, let's build a regular expression that matches the word hello. This one is easy, we just need to write hello. Now, we already have an explanation available here that tells us about it, and we can see that hello is matched. When you write a regular expression, you might want to use a capture group, which gives you the matched substring as part of the result. It gives you an easier way to get the thing you matched directly. Here, with parentheses, we are able to capture the string hello. Let's remove the parentheses. Now let's say we just want to match any word. A word is something that is not whitespace. In that case, you can use a shorthand, \w, that matches every word character. Then you can use the plus quantifier. Let's go to the explanation of our regex. This \w matches any word character, anything in a-z, A-Z, or 0-9, plus the underscore. The quantifier tells us that we want to match that between one and unlimited times.
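A small sketch of split, capture groups and \w+ as described above. The input strings and variable names are assumptions; only the patterns ab, (hello) and \w+ come from the video.

```javascript
// split removes the matched pattern and returns what is around it.
const parts = "fabfabf".split(/ab/);        // ["f", "f", "f"]
// The g flag makes no difference for split.
const partsGlobal = "fabfabf".split(/ab/g); // ["f", "f", "f"]
// If the pattern is absent, the result is a single-element array.
const whole = "foo".split(/ab/);            // ["foo"]

// Parentheses create a capture group; its content is at index 1 of the result.
const captured = "hello world".match(/(hello)/);
console.log(captured[1]); // "hello"

// \w matches a word character (a-z, A-Z, 0-9, underscore),
// and + repeats it one or more times.
const words = "hello world".match(/\w+/g); // ["hello", "world"]

console.log(parts, partsGlobal, whole, words);
```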
We could rewrite this regex with a character class, [a-z]. In that case, it will match any lowercase word. Then we also want uppercase, so [a-zA-Z]. If I write hEllo world, it still works. But here it matches the characters one by one. We want to tell it to match them as groups. We can use the plus to say at least one of them, and it builds groups. But plus is actually equivalent to this, {1,}. This is a way to write how many characters you want to grab. Let's say I just want to grab at most three characters. I will do it this way, {1,3}, meaning take between one and three characters. The first match is hel. The second match is lo. Then we have a whitespace that doesn't qualify. Then we've got wor and ld. But let's say now that I want to match only the first three characters of the first word. I will use {1,3}, and then an anchor, the caret, that says only at the start of the line. It asserts the position at the start of the line. We get what we want. Now let's say I just want the last three characters of the whole string. I would use a dollar sign at the end. It asserts the position at the end of the line. We get a match on rld. That's already a lot, but there are a few extra things to know. You can also use the dot, which matches pretty much everything, whitespace included. It matches any character except line terminators. The star is the equivalent of zero to infinity, meaning it matches any number of characters between zero and infinity. Let me just check my notes quickly. That's it for the quick crash course about regular expressions. As an exercise, let's try to match all the words that start with w. We would write w. Then we want the rest of the word, [a-z]+. Here we match all the words starting with w, so abc, [inaudible]. Here we need at least one character after w. If I want any number of characters, I would use the star instead. Let's add an extra word. Yeah is not matched, but what is.
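The quantifiers and anchors discussed above can be checked with a short sketch. The test string "hello world" and the variable names are assumptions; the patterns follow the video.

```javascript
const text = "hello world";

// A character class alone matches one character at a time.
const oneByOne = text.match(/[a-zA-Z]/g).length; // 10 single-letter matches

// + groups consecutive characters, and {1,} is an equivalent way to write it.
const grouped = text.match(/[a-zA-Z]+/g);       // ["hello", "world"]
const sameAsPlus = text.match(/[a-zA-Z]{1,}/g); // ["hello", "world"]

// {1,3} takes between one and three characters per match.
const chunks = text.match(/[a-zA-Z]{1,3}/g); // ["hel", "lo", "wor", "ld"]

// ^ asserts the start of the line, $ asserts the end of the line.
const start = text.match(/^[a-zA-Z]{1,3}/)[0]; // "hel"
const end = text.match(/[a-zA-Z]{1,3}$/)[0];   // "rld"

// The dot matches any character except line terminators, spaces included,
// and * repeats it zero or more times.
const dotCount = "a b".match(/./g).length; // 3
const anything = text.match(/.*/)[0];      // "hello world"

console.log(oneByOne, grouped, chunks, start, end, dotCount, anything);
```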
We're matching all the lowercase words starting with w. Now, let's try to put an uppercase W. It doesn't match. We have two solutions. Either we write an or, which we can do with a bar, or we add the case-insensitive flag to tell it not to care about the case. We now even know how to write alternatives in our regexes. That's it for this crash course on how to use regex in JavaScript. There are a lot of other things, but these are the basics that you need to understand the security implications in the next video. Thanks so much for watching this somewhat too long video. See you soon, hopefully, in the next one. Have a great day.
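As a recap of the exercise, here is a sketch of the plus, star, alternative and case-insensitive variants. The sample text and the variable names are assumptions, picked so that no word contains a w in the middle; the patterns follow the video.

```javascript
const text = "what yeah World w"; // assumed sample text

// + requires at least one letter after the w, so the lone "w" is skipped.
const plus = text.match(/w[a-z]+/g); // ["what"]

// * accepts zero or more letters after the w.
const star = text.match(/w[a-z]*/g); // ["what", "w"]

// Alternative with a bar: either a lowercase or an uppercase w.
const either = text.match(/(w|W)[a-z]*/g); // ["what", "World", "w"]

// The i flag ignores the case for the whole pattern.
const insensitive = text.match(/w[a-z]*/gi); // ["what", "World", "w"]

console.log(plus, star, either, insensitive);
```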