Replace characters in column names with gsub

yake84 · Sep 24, 2016 · Viewed 34.5k times

I am reading in a bunch of CSVs whose column headers contain text like "sales - thousands", which comes into R as "sales...thousands". I'd like to use a regular expression (or another simple method) to clean these up.

I can't figure out why this doesn't work:

#mock data
  a <- data.frame(this.is.fine = letters[1:5],
                  this...one...isnt = LETTERS[1:5])

#column names
  colnames(a)
  # [1] "this.is.fine"  "this...one...isnt"

#function to collapse repeated periods
  colClean <- function(x){
    colnames(x) <- gsub("\\.\\.+", ".", colnames(x))
  }

#run function
  colClean(a)

#names go unaffected
  colnames(a)
  # [1] "this.is.fine"  "this...one...isnt"

But this code does:

#direct change to names
  colnames(a) <- gsub("\\.\\.+", ".", colnames(a))

#new names
  colnames(a)
  # [1] "this.is.fine"  "this.one.isnt"

Note that I'm fine leaving one period between words when that occurs.

Thank you.

Answer

Rajnish Kumar · Mar 22, 2017
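names(a) <- gsub(x = names(a), pattern = "\\.", replacement = "#")

You can use the gsub function to replace every period with another character, such as #. Escaping the period as "\\." makes the pattern match a literal period rather than any character.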
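
As for why the colClean function in the question has no effect: R passes the data frame by value, so the assignment to colnames(x) changes only a local copy, and the function returns the new names rather than the data frame. A minimal sketch of one way to fix it, assuming the goal is to collapse runs of periods into a single period, is to return x and assign the result back. The pattern "\\.{2,}" is just an equivalent spelling of the question's "\\.\\.+".

```r
# Mock data from the question
a <- data.frame(this.is.fine = letters[1:5],
                this...one...isnt = LETTERS[1:5])

# Collapse runs of two or more periods into one and return the data frame
colClean <- function(x) {
  colnames(x) <- gsub("\\.{2,}", ".", colnames(x))
  x  # return the modified copy; without this the caller gets only the names
}

# Assign the result back, since the function works on a copy
a <- colClean(a)
colnames(a)
# [1] "this.is.fine"  "this.one.isnt"
```

Calling colClean(a) without the assignment would still leave a unchanged, so the a <- part matters as much as the return value.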