Regular Expressions Part 1

(Pssst! All the code referenced in this post can be found in https://github.com/NerdcoreSteve/regular-expressions. Pass it on!)

I've not written about regular expressions yet? Really? Ok, brace yourselves, this is gonna be a long series. Let's get started with part 1, then you can read part 2 and part 3.

What is Even a Regex?

console.log('a'.match(/a/))  
[ 'a', index: 0, input: 'a' ]

What's goin' on here?

A regular expression (or regex for short) is a pattern of characters. That's all it is. We tell JavaScript that we want to create a regex using the / characters around the pattern. For example: /a/, or /boogers/.

console.log('don\'t eat boogers'.match(/boogers/))  
[ 'boogers', index: 10, input: 'don\'t eat boogers' ]

We can use the JavaScript string's match method to see if the string matches the pattern. The match method takes a regex and returns, well, something that looks like a bit of both an array and an object. It's actually an array with a couple of extra properties (index and input) tacked on. If nothing matches, we get null instead.

[ 'boogers', index: 10, input: 'don\'t eat boogers' ]

The first bit, 'boogers', is the part of the string that matches the pattern. The second bit, index: 10, is the index of the string at which the match was found. input: 'don\'t eat boogers' shows us the original string.
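If you'd like to poke at that return value yourself, here's a small sketch. The variable names (result, noMatch) and the broccoli pattern are just made up for illustration; the only thing being exercised is the string's match method.

```javascript
// Grab the return value so we can inspect it piece by piece
const result = 'don\'t eat boogers'.match(/boogers/)

console.log(Array.isArray(result)) // true: it really is an array
console.log(result[0])             // boogers
console.log(result.index)          // 10
console.log(result.input)          // don't eat boogers

// When the pattern isn't found, match returns null rather than an empty array
const noMatch = 'don\'t eat boogers'.match(/broccoli/)
console.log(noMatch)               // null
```

Since a failed match gives us null, it's worth checking the result before reaching for result[0] or result.index.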

A regex pattern can be as long a string of characters as we like:

console.log(  
    'Hello my honey, hello my baby, hello my ragtime gal!'
        .match(/Hello my honey, hello my baby, hello my ragtime gal!/))
[ 'Hello my honey, hello my baby, hello my ragtime gal!',
  index: 0,
  input: 'Hello my honey, hello my baby, hello my ragtime gal!' ]

A regex pattern can be used to match all of a string (like the example above) or just part of a string (like the example below).

console.log(  
    'Hello my honey, hello my baby, hello my ragtime gal!'
        .match(/my/))
[ 'my',
  index: 6,
  input: 'Hello my honey, hello my baby, hello my ragtime gal!' ]

Global Matching

Did you notice that .match only grabs the first instance of a match? We can fix that:

console.log('doo doo doo doowah!'.match(/doo/g))  
[ 'doo', 'doo', 'doo', 'doo' ]

See the g after the /? That tells JavaScript we're looking for a global match. If we do a global match, we only get an array of substrings, one substring for each match. Four doos in this case.
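Here's a quick sketch of what "only an array of substrings" means in practice. The variable name matches and the dah pattern are made up for this example.

```javascript
// With the g flag we get a plain array of matched substrings
const matches = 'doo doo doo doowah!'.match(/doo/g)

console.log(matches.length) // 4
console.log(matches.index)  // undefined: no index property this time
console.log(matches.input)  // undefined: no input property either

// No matches at all still gives us null, not an empty array
const nothing = 'doo doo doo doowah!'.match(/dah/g)
console.log(nothing)        // null
```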

If the above was all regular expressions could do for us, I'm not sure why we'd bother, but they can do so much more.

Character Classes

console.log('I\'m Batman!'.match(/[aeiou]/g))  
[ 'a', 'a' ]

[aeiou] is what's called a character class in regular expression land. By using the character class [aeiou] we're saying in this regex "match any character that is a, e, i, o, or u".

We can build any character class we want by putting the characters we want inside brackets. [abc] will match the characters a, b, and c, [2468] will match the digits 2, 4, 6, and 8, etc.
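To see those two classes in action, here's a short sketch. The sample strings and variable names are invented for illustration.

```javascript
// [abc] matches any single a, b, or c
const letters = 'a cab is back'.match(/[abc]/g)
console.log(letters) // [ 'a', 'c', 'a', 'b', 'b', 'a', 'c' ]

// [2468] matches any single 2, 4, 6, or 8
const evens = 'Call 867-5309 now'.match(/[2468]/g)
console.log(evens)   // [ '8', '6' ]
```

Notice that each match is a single character. A character class stands in for exactly one character of the string, however many characters we list inside the brackets.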

Shorthand patterns like this are a big part of why regexes can be so powerful.

Ignore Case

Did you notice that we didn't grab the capital I in the last example? Here's how to fix that:

console.log('I\'m Batman!'.match(/[aeiou]/gi))  
[ 'I', 'a', 'a' ]

Adding the i after the / tells JavaScript that our regex ignores case. It'll match upper or lower case.

By the way, it doesn't matter if we do gi or ig. We'd get the same result.
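If you don't want to take my word for it, here's a little sketch comparing the two orderings. The variable names are made up, and the last line leans on the regex's flags property, which reports the flags in a fixed order no matter how we typed them.

```javascript
// Same pattern, flags written in both orders
const withGi = 'I\'m Batman!'.match(/[aeiou]/gi)
const withIg = 'I\'m Batman!'.match(/[aeiou]/ig)

console.log(withGi) // [ 'I', 'a', 'a' ]
console.log(withIg) // [ 'I', 'a', 'a' ]

// JavaScript normalizes the flag order internally
const flags = /[aeiou]/ig.flags
console.log(flags)  // gi
```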

We can also just use i on its lonesome:

console.log('I\'m Batman!'.match(/[aeiou]/i))  
[ 'I', index: 0, input: 'I\'m Batman!' ]

And since we're not using g, we're back to that weird object/array thingy that only gets the first match.

Ok, that's it for part 1. On to part 2!
