How to remove + (plus sign) from string in R?

Jonathan · Mar 5, 2016 · Viewed 14.4k times

Say I use gsub and want to replace each of the signs =, + and - in a string with an underscore.

Can someone describe what is going on when I try to use gsub with a plus sign (+)?

test<- "sandwich=bread-mustard+ketchup"
# [1] "sandwich=bread-mustard+ketchup"

test<-gsub("-","_",test)
# [1] "sandwich=bread_mustard+ketchup"

test<-gsub("=","_",test)
# [1] "sandwich_bread_mustard+ketchup"

test<-gsub("+","_",test)
# [1] "_s_a_n_d_w_i_c_h___b_r_e_a_d___m_u_s_t_a_r_d_+_k_e_t_c_h_u_p_"

Answer

coffeinjunky · Mar 5, 2016

Try

test<- "sandwich=bread-mustard+ketchup"
test<-gsub("\\+","_",test)
test
[1] "sandwich=bread-mustard_ketchup"

+ is a special character in regular expressions, so you need to escape it, just as you would, for instance, a period (.). As a quantifier, + means one or more of the preceding expression. Because the backslash is itself an escape character in R strings, it has to be doubled, which is why the pattern is written "\\+". If you search for regex or regular expressions, you will find lists of the special characters; R's own ?regex help page also covers them.
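As a rough sketch of the options for matching a literal plus sign, the three calls below should all give the same result on this input. The variable names escaped, in_class and literal are only illustrative; fixed = TRUE is the gsub argument that turns off regular expression matching altogether.

```r
test <- "sandwich=bread-mustard+ketchup"

# 1. Escape the quantifier (backslash doubled inside an R string)
escaped <- gsub("\\+", "_", test)

# 2. Put it in a character class, where + is an ordinary character
in_class <- gsub("[+]", "_", test)

# 3. Treat the pattern as a plain string rather than a regular expression
literal <- gsub("+", "_", test, fixed = TRUE)

escaped
# [1] "sandwich=bread-mustard_ketchup"
```

When the pattern contains no regular expression syntax at all, fixed = TRUE is arguably the least error-prone of the three, since nothing needs escaping.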

On a more general note, your above code could be written more efficiently by using:

 test<- "sandwich=bread-mustard+ketchup"
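 test<-gsub("[-=+]","_",test)
 test
 [1] "sandwich_bread_mustard_ketchup"

Here I have used a character class: [...] matches any single one of the characters listed between the brackets, so the pattern can be read as [either - or = or +]. Inside the brackets + is an ordinary character and needs no escape, and - is literal because it comes first. Note that | means "or" only outside brackets; inside a character class it would match a literal | character.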
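To make the distinction between a bracketed character class and alternation with | concrete, here is a small sketch. The string pipe_test is a made-up input chosen only to show how a | inside brackets behaves; the other variable names are likewise illustrative.

```r
test <- "sandwich=bread-mustard+ketchup"

# Character class: any one of -, = or +
by_class <- gsub("[-=+]", "_", test)

# Alternation: | means "or" outside brackets, and + must be escaped here
by_alternation <- gsub("-|=|\\+", "_", test)

by_class
# [1] "sandwich_bread_mustard_ketchup"
by_alternation
# [1] "sandwich_bread_mustard_ketchup"

# Hypothetical input containing a pipe character
pipe_test <- "a|b+c"

# With | inside the brackets, the pipe is matched as a literal character
class_with_pipe <- gsub("[-|=|\\+]", "_", pipe_test)
class_with_pipe
# [1] "a_b_c"

# Without it, the pipe is left alone
class_without_pipe <- gsub("[-=+]", "_", pipe_test)
class_without_pipe
# [1] "a|b_c"
```

For single characters the character class is usually the shorter form; alternation becomes useful when the alternatives are longer than one character.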