June 19, 2016

RegEx : The Right Way | * and + operators

02:52 Posted by DurgaSwaroop
So far we have covered the dot operator and the ? operator for optional characters. To see all the articles in this Regular Expressions series, click here.
In this lesson we are going to see how to match a character that is repeated any number of times. The * and + operators are going to help us with that.

Let's say we want to match ab, aab, aaab and so on. From observation we can see that we have multiple a's followed by a 'b', and that there is at least one 'a' in the pattern. For this sort of situation we will use the + operator.
The + operator denotes that the preceding character needs to be present at least once. So, you can match aaaaa and aaaaaaaaaaa with /a+/.
Now, for the example at hand, we will use /a+b/, and it works as expected, as shown in the adjacent image.
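If you would rather check this outside of Regexr, here is a minimal sketch in Python. Python's re module and the sample strings are assumptions made for illustration; they are not part of the original walkthrough. fullmatch is used so that the whole string has to fit the pattern.

```python
import re

# /a+b/ : one or more 'a' characters followed by a single 'b'
plus_pattern = re.compile(r"a+b")

# Sample strings chosen for illustration only
samples = ["ab", "aab", "aaab", "b", "a"]

# fullmatch succeeds only if the entire string fits the pattern
plus_results = {s: bool(plus_pattern.fullmatch(s)) for s in samples}

for text, matched in plus_results.items():
    print(f"{text!r}: {matched}")
```

The strings ab, aab and aaab pass, while b (no 'a' at all) and a (no trailing 'b') do not.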
Now, what if I also want to match just b, without any a's in front of it? We will use the * operator for that.
As you can see from the second image here, /a*b/ matches b as well as ab, aab, aaab and so on. The * operator matches any number of occurrences of the preceding character, including zero.
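The difference between the two operators is easiest to see side by side. The following is a small sketch in Python, again assuming the re module and a hand-picked list of strings purely for illustration.

```python
import re

star_pattern = re.compile(r"a*b")  # zero or more 'a', then 'b'
plus_pattern = re.compile(r"a+b")  # one or more 'a', then 'b'

# Sample strings chosen for illustration only
samples = ["b", "ab", "aaab"]

# Each entry maps a string to (matched by a*b, matched by a+b)
comparison = {
    s: (bool(star_pattern.fullmatch(s)), bool(plus_pattern.fullmatch(s)))
    for s in samples
}

for text, (star_ok, plus_ok) in comparison.items():
    print(f"{text!r}: a*b -> {star_ok}, a+b -> {plus_ok}")
```

Only the bare b separates the two: /a*b/ accepts it because zero a's are allowed, while /a+b/ rejects it.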
Now, let's look at an extension of this case, one that is useful almost everywhere.
Let's say you are parsing a webpage and you want to identify all the mp4 videos on that site. Essentially you are looking for a pattern like some_random_video_name.mp4. For that we can do something like /.+mp4/. That tells the RegEx engine that we don't care what the name is, as long as it is at least one character long and is followed by mp4. There is just one problem with this RegEx: it will match even "hellomp4", which is just a word that happens to contain the letters mp4. So, to make sure the dot (.) is included, we have to use /.+\.mp4/ and everything will be fine. The reason we need the extra '\.' is that a plain dot would match any possible character, since it is a predefined operator, as discussed in the earlier lesson on the dot operator. Hence we have to escape it by placing a '\' before it, to make sure it is interpreted as a literal dot and not as the dot operator.
And, as expected, it matches only the mp4 videos, as can be seen in the adjacent image.
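Here is a short Python sketch of the same comparison between the unescaped and the escaped pattern. The re module, the variable names and the list of file names are assumptions for illustration; only some_random_video_name.mp4 and hellomp4 come from the text above.

```python
import re

loose_pattern = re.compile(r".+mp4")     # any characters, then the letters mp4
strict_pattern = re.compile(r".+\.mp4")  # any characters, a literal dot, then mp4

# ".mp4" is an extra, hypothetical case: an extension with no name before it
names = ["some_random_video_name.mp4", "hellomp4", ".mp4"]

# Each entry maps a name to (matched by loose pattern, matched by strict pattern)
mp4_results = {
    n: (bool(loose_pattern.fullmatch(n)), bool(strict_pattern.fullmatch(n)))
    for n in names
}

for name, (loose_ok, strict_ok) in mp4_results.items():
    print(f"{name!r}: loose -> {loose_ok}, strict -> {strict_ok}")
```

The loose pattern accepts all three strings, including hellomp4. The strict pattern accepts only the real file name: hellomp4 has no literal dot, and .mp4 leaves nothing for .+ to match before the dot.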
And finally, a small exercise for you to try yourself. Write a RegEx that matches happy, cappy, chapp, appy, lappy and so on, i.e., any word that contains app. Try it out and share your answers in the comments section.
Well, that is everything for this tutorial. Stay tuned for more.
Happy RegExing!
PS: To try your regular expressions, you can use Regexr.com, which is what I will be using throughout this series.

If you liked this article and would like to see more, subscribe to our Facebook and G+ pages.
Facebook page @ Facebook.com/freblogg
Google Plus Page @ Google.com/freblogg

