R regex gsub separate letters and numbers

screechOwl · Jul 23, 2012

I have a string that's mixed letters and numbers:

"The sample is 22mg"

I'd like to split the string wherever a number is immediately followed by a letter, like this:

"The sample is 22 mg"

I've tried this:

gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg')

but am not getting the desired results.

Any suggestions?

Answer

Nicholas Riley · Jul 23, 2012

You need to use capturing parentheses in the regular expression and group references in the replacement. For example:

gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg')
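To illustrate what the capture groups and backreferences are doing, here is a small sketch. The input vector `x` and the variable names `fixed` and `literal` are made up for the illustration and are not part of the original question. The second call shows why putting a pattern in the replacement argument does not help: the replacement is inserted as literal text, and only `\\1`, `\\2`, and so on are treated specially.

```r
# Hypothetical sample inputs; gsub() is vectorised over its third argument.
x <- c("The sample is 22mg",
       "This is a test 22mg",
       "dose: 5ml then 10ml",
       "no digits here")

# Group 1 captures a digit, group 2 the letter that immediately follows it.
# "\\1 \\2" writes both back with a single space between them.
fixed <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)
print(fixed)

# A character class in the replacement is NOT interpreted as a pattern;
# it is copied into the result verbatim.
literal <- gsub("([0-9])([[:alpha:]])", "[0-9] [a-z]", "22mg")
print(literal)
```

Only the digit directly before the letter needs to be captured, since the earlier digits of the number are left untouched by the match.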

There's nothing R-specific about the technique; the R help pages for regex and gsub (?regex and ?gsub) should be of some use.