Replacing commas and dots in R

Nils Olve · Jan 9, 2014

I have a whole column of numbers that use a dot as the thousands separator and a comma instead of a dot as the decimal separator. When I try to create a numeric column out of them, I lose all the data.

var1 <- c("50,0", "72,0", "960,0", "1.920,0", "50,0", "50,0", "960,0")
df <- cbind(var1, var2 = as.numeric(gsub(".", "", as.character(var1))))

I wound up with:

     var1      var2
[1,] "50,0"    NA  
[2,] "72,0"    NA  
[3,] "960,0"   NA  
[4,] "1.920,0" NA  
[5,] "50,0"    NA  
[6,] "50,0"    NA  
[7,] "960,0"   NA 

What am I doing wrong?

Answer

Joshua Ulrich · Jan 9, 2014

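An unescaped "." in a regular expression matches any character, so your gsub() call strips every string down to "", and as.numeric() turns those into NA. You need to escape the "." in your regular expression, and you need to replace the commas with a "." before you can convert to numeric.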

> as.numeric(gsub(",", ".", gsub("\\.", "", var1)))
[1]   50   72  960 1920   50   50  960
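
As a possible alternative to escaping, base R's gsub() and sub() accept fixed = TRUE, which treats the pattern as a literal string rather than a regular expression. The sketch below wraps that in a helper; the name parse_european_number is hypothetical and not part of the original answer, and it assumes every value uses "." only for thousands and at most one "," for decimals.

```r
# Hypothetical helper (not from the original answer): converts strings that use
# "." as the thousands separator and "," as the decimal separator to numeric.
parse_european_number <- function(x) {
  x <- as.character(x)
  x <- gsub(".", "", x, fixed = TRUE)  # remove literal dots (thousands separators)
  x <- sub(",", ".", x, fixed = TRUE)  # turn the decimal comma into a dot
  as.numeric(x)
}

var1 <- c("50,0", "72,0", "960,0", "1.920,0", "50,0", "50,0", "960,0")

# Why the original attempt failed: as a regex, "." matches every character,
# so each element is reduced to the empty string.
stripped <- gsub(".", "", var1)

result <- parse_european_number(var1)
print(result)
```

With fixed = TRUE there is no escaping to get wrong, which can be easier to read when the pattern is a single punctuation character.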