Replacing specific values in a data frame column using gsub in R

Carol · Aug 17, 2015 · Viewed 27.1k times

I have a data.frame as follows:

> df
ID      Value
A_001   DEL-1:7:35-8_1 
A_002   INS-4l:5_74:d
B_023   0 
C_891   2
D_787   8
E_865   DEL-3:65:1s:b

I would like to replace every value in the column Value that starts with DEL or INS with nothing (an empty string). That is, I would like to get the following output:

> df
ID      Value
A_001   
A_002   
B_023   0 
C_891   2
D_787   8
E_865   

I tried to achieve this with gsub in R using the following code, but it didn't work:

gsub(pattern="(^([DEL|INS]*)",replacement="",df)

Could anyone guide me on how to achieve the desired output?

Thanks in advance.

Answer

Avinash Raj · Aug 17, 2015

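Just remove the character class, add .* after the group, and apply the pattern to the column rather than to the whole data frame. sub alone would do this job, since a pattern anchored with ^ can match at most once per string.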

df$Value <- sub("^(DEL|INS).*", "", df$Value)
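As a quick check, here is a minimal reproducible sketch of that one-liner. It assumes the Value column holds character data; the data frame is simply retyped from the question, and stringsAsFactors = FALSE is set only so the column stays character on older R versions.

```r
# Rebuild the example data frame from the question (values typed as character).
df <- data.frame(
  ID = c("A_001", "A_002", "B_023", "C_891", "D_787", "E_865"),
  Value = c("DEL-1:7:35-8_1", "INS-4l:5_74:d", "0", "2", "8", "DEL-3:65:1s:b"),
  stringsAsFactors = FALSE
)

# Blank out every Value that starts with DEL or INS; other values are untouched.
df$Value <- sub("^(DEL|INS).*", "", df$Value)

print(df)
```

Note that R is case-sensitive, so the column has to be referenced as df$Value, exactly as it is named in the data frame.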

Inside a character class, each character is treated separately, not as part of a whole string. So [DEL] matches a single character from the given list: D, E or L. Likewise, the | inside [DEL|INS] is just a literal character, not an alternation.
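The difference can be seen in a small sketch. The test vector x below is made up for illustration and is not from the question.

```r
# Hypothetical test strings, chosen to separate the two patterns.
x <- c("DEL-1:7", "INS-4l", "LED", "8", "DINS")

# Character class: TRUE when the first character is any one of D, E, L, |, I, N, S.
class_hits <- grepl("^[DEL|INS]", x)

# Alternation group: TRUE only when the string starts with the text DEL or INS.
group_hits <- grepl("^(DEL|INS)", x)

# With the character class, only the leading run of those characters is removed,
# so the rest of the value is left behind.
class_stripped <- sub("^[DEL|INS]*", "", x)

# With the group followed by .*, the whole matching value is removed.
group_stripped <- sub("^(DEL|INS).*", "", x)

print(data.frame(x, class_hits, group_hits, class_stripped, group_stripped))
```

Strings such as "LED" and "DINS" match the character class but not the group, which is why the class version would also alter values that merely begin with one of those letters.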