Regular Expressions Part 6

(Pssst! All the code referenced in this post can be found in https://github.com/NerdcoreSteve/regular-expressions. Pass it on!)

This is part 6, the final part of a series on regular expressions. Have a look at part 1, part 2, part 3, part 4, and part 5 if you haven't already.

The Dreaded new Keyword Again

You can also create a regex object by using new and the RegExp constructor. I've done it occasionally (and I feel properly ashamed :P ):

console.log('Banana'.match(new RegExp('an?', 'gi')))  
[ 'an', 'an', 'a' ]

Note that the flags that would normally go after the final / (like i or g) are passed as the second argument to RegExp.
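
One thing to watch for with the constructor form: the pattern is an ordinary string, so every backslash has to be doubled. Here's a small sketch (the sample string is made up for illustration):

```javascript
// The same regex written two ways: as a literal and via the constructor.
const literal = /\d+/g
const constructed = new RegExp('\\d+', 'g') // note the doubled backslash

console.log(literal.source === constructed.source) // true
console.log(literal.flags === constructed.flags)   // true
console.log('a1b22c333'.match(constructed))        // [ '1', '22', '333' ]

// Forget to double the backslash and '\d' collapses to a plain 'd'.
const oops = new RegExp('\d+', 'g')
console.log('a1b22c333'.match(oops))               // null
```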

Building From Variables

I try to avoid using new but there is one case I can think of where you'd really want to, and that's when you need to build a regex from variables:

const
    word_occurrence = (word, text) =>
        // match returns null when nothing matches, so fall back to []
        (text.match(new RegExp(word, 'ig')) || [])
            .length

console.log(  
    word_occurrence(
        'the',
        'The mango is the greatest fruit of all. The banana can\'t even compare.'))
3  

If we assume that word is the string 'the', then the expression new RegExp(word, 'ig') turns into /the/ig, a regex that globally matches the characters 'the' (ignoring case). Be aware that this also counts a 'the' sitting inside a longer word such as 'other'.

Funnily enough, I don't use this technique all that much, but it's dead useful when I do need it.
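
One caveat when building a regex from a variable: any regex metacharacters in the string (like . or ?) are treated as pattern syntax, not literal text. Below is a minimal sketch of a more defensive version. The names escapeRegExp and count_occurrences are my own for this example, not something from the series repo.

```javascript
// Hypothetical helper: backslash-escape every character that has a
// special meaning in a regex, so the input is matched literally.
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

// Hypothetical counter: escapes the search term first, and falls back
// to an empty array because match returns null when nothing matches.
const count_occurrences = (term, text) =>
    (text.match(new RegExp(escapeRegExp(term), 'ig')) || []).length

console.log(count_occurrences('the', 'The mango is the greatest.'))  // 2
console.log(count_occurrences('kiwi', 'The mango is the greatest.')) // 0
console.log(count_occurrences('a.b', 'a.b axb A.B'))                 // 2
```

Without the escaping step, the . in 'a.b' would match any character and 'axb' would be counted too.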

Don't Be So Greedy

There's a problem with .* you might run into. As we've discussed before, it matches everything, but what if you wanted it to match everything until it reached a certain point?

Check-check check it oooooout:

console.log(  
    regexGlobalGroupCapture(
        /this(.*?)\./ig,
        'I will only match the bit after this: you are a monkey.'
            + ' This is another sentence.'
            + ' This potato has gone bad.'
            + ' The potato is trying to eat this cat.'))
[ ': you are a monkey',
  ' is another sentence',
  ' potato has gone bad',
  ' cat' ]

The function regexGlobalGroupCapture allows us to do global group capture; I talk about it here.
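
If you don't have regexGlobalGroupCapture handy, here's a minimal stand-in, assuming all it needs to do is collect the first capture group of every match. The name firstGroups is mine, and the original function may well be implemented differently. It leans on String.prototype.matchAll, which needs a reasonably modern runtime and a regex with the g flag.

```javascript
// Hypothetical stand-in for regexGlobalGroupCapture: returns the first
// capture group from every match. matchAll requires the g flag.
const firstGroups = (regex, text) =>
    Array.from(text.matchAll(regex), match => match[1])

console.log(firstGroups(/this(.*?)\./ig, 'Eat this cat. Pet this dog.'))
// [ ' cat', ' dog' ]
```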

The .*? construction allows us to match everything... until we get to the bit of the string that matches the bit of the regex after the .*?. Without the ?, .* is greedy and would run all the way to the last period in the string.

In our case that's just the period \. (remember, we need to escape the period with a \ to match it).
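
To see the difference side by side, here's a quick sketch that runs a greedy and a lazy version of the same pattern against one string (the sample text is made up for illustration):

```javascript
const text = 'Eat this cat. Pet this dog.'

// Greedy: .* grabs as much as it can, so it runs to the LAST period.
console.log(text.match(/this(.*)\./i)[1])  // ' cat. Pet this dog'

// Lazy: .*? grabs as little as it can, so it stops at the FIRST period.
console.log(text.match(/this(.*?)\./i)[1]) // ' cat'
```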

Matching a Specific Number of Repetitions

console.log(  
    [
        'do do do do wah!',
        'do go out and clean up the do do',
        'do dee do dee do do do wah!',
    ]
        .map(s => s.match(/(?:do ){3}/g))
        .filter(x => x)
        .length)
2  

The {3} means "three repetitions of the group before it", so the above bit of code tells us the array only contains two strings with 'do ' (trailing space included) repeated 3 times in a row.
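
One subtlety: a run of four contains a run of three, so the first string still matches. Here's a small sketch of that, plus one way to insist on exactly three. The anchored pattern is my own illustration and assumes the whole string has a known shape.

```javascript
// {3} asks for three repetitions in a row, but a longer run still
// contains three, so four "do"s match as well.
console.log('do do do do wah!'.match(/(?:do ){3}/g)) // [ 'do do do ' ]

// Assumption: to demand exactly three and nothing else, anchor the
// whole string with ^ and $.
const exactlyThree = /^(?:do ){3}wah!$/
console.log(exactlyThree.test('do do do wah!'))    // true
console.log(exactlyThree.test('do do do do wah!')) // false
```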

What if we didn't care about a match unless it went above some number of repetitions? What if we didn't want to match 'argh!', but we did want to match 'aaaaaargh!'?

console.log(  
    [
        'argh!',
        'aaargh!',
        'aaaaaaaargh!',
        'aaaaaaaaaaaargh!',
    ]
        .map(s => s.match(/a{2,}/))
        .filter(x => x)
        .length)
3  

Because of the comma after the 2, /a{2,}/ matches two or more a's in a row, so we only match if you're really angry. :P

What if we wanted to match repetitions within a range?

console.log(  
    [
        'banana',
        'banananana',
        'banananananananana',
    ]
        .map(s => s.match(/(?:an?){2,3}/))
        .filter(x => x)
        .map(x => x[0]))
[ 'anana', 'ananan', 'ananan' ]

At long last, we're showing some restraint.
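
To recap the curly-brace forms in one place, here's a short sketch that runs each of them against the same string. The last line adds a ? to make the range lazy, just like .*? earlier. The sample string is made up for illustration.

```javascript
const shout = 'a'.repeat(5) + 'rgh!' // 'aaaaargh!' with five a's

console.log(shout.match(/a{3}/)[0])    // 'aaa'   exactly three
console.log(shout.match(/a{2,}/)[0])   // 'aaaaa' two or more, as many as it can
console.log(shout.match(/a{2,3}/)[0])  // 'aaa'   two to three, prefers three
console.log(shout.match(/a{2,3}?/)[0]) // 'aa'    two to three, prefers two
```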

Where To Go Next

Well, that's it for this series! I hope you learned a lot and are ready to start using regex in your projects. It's a really great tool to have in your toolbox.

If you want to learn more, I'd recommend going to regular-expressions.info; they have a page specific to JavaScript. Also, if you want to do a particular thing in regex and don't know how, Google probably does. :)

And you can always hit me up on Twitter at @ProSteveSmith. I'd be happy to answer any questions you have. :)
