How do I split a string by whitespace, ignoring leading and trailing whitespace, into an array of words using a regular expression?

natlee75 · Feb 16, 2013 · Viewed 63.1k times

I typically use the following code in JavaScript to split a string by whitespace.

"The quick brown fox jumps over the lazy dog.".split(/\s+/);
// ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog."]

This of course works even when there are multiple whitespace characters between words.

"The  quick brown fox     jumps over the lazy   dog.".split(/\s+/);
// ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog."]

The problem arises when the string has leading or trailing whitespace: the resulting array then includes an empty string at the beginning and/or end.

"  The quick brown fox jumps over the lazy dog. ".split(/\s+/);
// ["", "The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog.", ""]

It's a trivial task to eliminate such empty strings, but I'd rather take care of this within the regular expression if that's at all possible. Does anybody know what regular expression I could use to accomplish this goal?

Answer

kennebec · Feb 16, 2013

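If you are more interested in the bits that are not whitespace, you can match the runs of non-whitespace instead of splitting on whitespace. Leading and trailing whitespace is then simply never matched.

"  The quick brown fox jumps over the lazy dog. ".match(/\S+/g);
// ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog."]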

Note that the following returns null:

"   ".match(/\S+/g)

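To see why the null result matters, here is a minimal sketch. With the g flag, match returns null rather than an empty array when nothing matches, so reading a property such as length on the result throws. The variable names are only illustrative.

```javascript
// With the g flag, match() returns null (not []) when there are no matches.
const blank = "   ";
const result = blank.match(/\S+/g);

let threw = false;
try {
  // Reading a property of null throws a TypeError.
  result.length;
} catch (err) {
  threw = err instanceof TypeError;
}
```
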
So a safer pattern is to fall back to an empty array when nothing matches:

str.match(/\S+/g) || []
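
Wrapped up as a small helper, this might look like the sketch below. The function name splitWords is hypothetical and not part of the answer above; it only packages the same expression.

```javascript
// Hypothetical helper: returns the whitespace-separated words of a string.
function splitWords(str) {
  // match() yields null when the string has no non-whitespace run,
  // so fall back to an empty array.
  return str.match(/\S+/g) || [];
}

const words = splitWords("  The quick brown fox jumps over the lazy dog. ");
// ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog."]

const none = splitWords("   ");
// []
```

Callers can then loop over the result or read its length without first checking for null.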