Capitalizing letters. R equivalent of Excel "PROPER" function

Konrad · Jul 25, 2014 · Viewed 21.2k times

Colleagues,

I'm looking at a data frame resembling the extract below:

Month   Provider Items
January CofCom   25
july    CofCom   331
march   vobix    12
May     vobix    0

I would like to capitalise the first letter of each word and lower-case the remaining letters. The result would be a data frame resembling the one below:

Month   Provider Items
January Cofcom   25
July    Cofcom   331
March   Vobix    12
May     Vobix    0

In a word, I'm looking for R's equivalent of the PROPER function available in MS Excel.

Answer

Matthew Plourde · Jul 25, 2014

With regular expressions:

x <- c('woRd Word', 'Word', 'word words')
gsub("(?<=\\b)([a-z])", "\\U\\1", tolower(x), perl=TRUE)
# [1] "Word Word"  "Word"       "Word Words"

(?<=\\b)([a-z]) says look for a lowercase letter preceded by a word boundary (e.g., a space or the beginning of a line). (?<=...) is called a "look-behind" assertion. \\U\\1 says replace that character with its uppercase version. \\1 is a back reference to the first group surrounded by () in the pattern. Both the look-behind and \\U are Perl-style features, which is why perl=TRUE is required. See ?regex for more details.
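To connect this back to the data frame in the question, here is a minimal sketch that applies the same pattern to the two text columns. The helper name proper and the rebuilt df are assumptions made for illustration; they are not part of the original answer.

```r
# Rebuild the extract from the question; stringsAsFactors = FALSE keeps
# the text columns as character vectors rather than factors
df <- data.frame(
  Month    = c("January", "july", "march", "May"),
  Provider = c("CofCom", "CofCom", "vobix", "vobix"),
  Items    = c(25, 331, 12, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper (the name is ours): lower-case the whole string,
# then upper-case every a-z letter that directly follows a word boundary
proper <- function(x) {
  gsub("(?<=\\b)([a-z])", "\\U\\1", tolower(x), perl = TRUE)
}

# Apply the helper to the character columns only; Items is left untouched
text_cols <- c("Month", "Provider")
df[text_cols] <- lapply(df[text_cols], proper)

df
```

Note that the pattern only matches the ASCII letters a-z, and that an apostrophe also counts as a word boundary, so a value such as "don't" would come out as "Don'T". If that matters for your data, the pattern would need adjusting.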

If you only want to capitalize the first letter of the first word, use the pattern "^([a-z])" instead.
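As a quick sketch of that variant, using the same example vector as above (the variable name first_only is ours):

```r
x <- c('woRd Word', 'Word', 'word words')

# ^ anchors the match at the start of each string, so only the first
# letter is upper-cased; everything else stays lower-case from tolower()
first_only <- gsub("^([a-z])", "\\U\\1", tolower(x), perl = TRUE)
first_only
# [1] "Word word"  "Word"       "Word words"
```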