Regular Expressions: Icebreakers

A few days back, there was a knowledge session on regular expressions within my team. After discussing the usual topics like greedy and lazy quantifiers, backreferences, etc., we started analyzing match results for a few expressions. I am familiar with regex and have used it in the majority of my throw-away scripts, and I thought I knew it well until the team's simple questions baffled me. Below I list a few of those simplest-of-the-simple patterns, what they match, and why (working through them is what actually led me to learn the rules of the game).

Before even starting to look at them, did I mention that the regex engine starts its search just before the first character of the string? If not, let me tell you now: it needs to start before the first character because, if the pattern contains anchors or boundaries (^, \b, etc.), it has to check them too. The search also goes past the last character in the string, and now you know why (to match $, \b, etc.).
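To see those two extra positions, here is a small sketch using Python's `re` module (my choice for illustration; the team session was not tied to any one engine). It collects the spans of zero-width matches, which are simply the positions where the engine found `^`, `$` or `\b` satisfied.

```python
import re

text = "foxxx"

# Zero-width matches have equal start and end, so their spans
# show the positions at which the anchor or boundary held.
anchor_spans = [m.span() for m in re.finditer(r"^|$", text)]
boundary_spans = [m.span() for m in re.finditer(r"\b", text)]

print(anchor_spans)    # [(0, 0), (5, 5)]
print(boundary_spans)  # [(0, 0), (5, 5)]
```

Position 0 is just before the 'f' and position 5 is just after the last 'x', one step past the final character.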

(i) x*

pattern : x*

string : foxxx

Matched: <<<  >>> foxxx

Explanation: As mentioned in rule 2 here, a greedy quantifier always tries to match as much as it can, so read the pattern ‘x*’ as ‘match as many occurrences of x as possible, or nothing’. The engine works through the string position by position. Since it cannot find any ‘x’ at the starting position, it falls back to its other choice, ‘match nothing’, and that succeeds. The result is an empty match just before the ‘f’; because the engine reports the first position at which the pattern succeeds, it never gets as far as the ‘xxx’ at the end.
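The same result can be checked with a quick sketch in Python's `re` module (an assumption on my part; other engines with leftmost-first matching should behave alike for this pattern).

```python
import re

# re.search returns the first position at which the pattern succeeds.
empty_match = re.search(r"x*", "foxxx")

print(repr(empty_match.group()))  # ''
print(empty_match.span())         # (0, 0)
```

The match object is not None, so the search did succeed; it just matched zero characters at position 0.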

(ii) .*

pattern : .*

string : foxxx

Matched: <<< foxxx >>>

Explanation: ‘.’ matches any character other than ‘\n’. Though the pattern ‘.*’ can be read as ‘match as many characters other than ‘\n’ as possible, or nothing’, the rule of greediness gives preference to matching more characters, so the whole string is matched.
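A short Python sketch of both halves of that explanation: the greedy ‘.*’ takes the whole string, and it stops at a newline. The second input string is a made-up example of mine, not one from the session, and the behavior shown is Python's default (without the `re.DOTALL` flag).

```python
import re

single_line = re.search(r".*", "foxxx")
# Hypothetical input with a newline in the middle.
multi_line = re.search(r".*", "fo\nxxx")

print(single_line.group())       # foxxx
print(repr(multi_line.group()))  # 'fo'
```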

(iii) x*

pattern : x*

string : xxxfoxxx

Matched: <<< xxx >>> foxxx

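Explanation: The same greediness favors matching more. This time there are ‘x’ characters right at the starting position, so the engine takes all three and stops at the ‘f’.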
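For contrast, here is a sketch in Python's `re` module that runs the greedy ‘x*’ next to its lazy form ‘x*?’ on the same string. The lazy pattern is my addition for comparison; it was not one of the team's questions.

```python
import re

text = "xxxfoxxx"

greedy = re.search(r"x*", text)   # prefers to match more
lazy = re.search(r"x*?", text)    # prefers to match less

print(greedy.group(), greedy.span())    # xxx (0, 3)
print(repr(lazy.group()), lazy.span())  # '' (0, 0)
```

Both searches succeed at position 0; the only difference is which of the two choices, ‘match more’ or ‘match nothing’, each quantifier tries first.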

