Extract string before "|"

Shounak Chakraborty · Jul 10, 2016 · Viewed 47.5k times

I have a data set wherein a column looks like this:

ABC|DEF|GHI,  
ABCD|EFG|HIJK,  
ABCDE|FGHI|JKL,  
DEF|GHIJ|KLM,  
GHI|JKLM|NO|PQRS,  
BCDE|FGHI|JKL  

.... and so on

I need to extract the characters that appear before the first | symbol.

In Excel, we would combine MID with SEARCH, or LEFT with SEARCH. R has substr().

The syntax is substr(x, start, stop).

In my case, start will always be 1. For stop, we need to search by |. How can we achieve this? Are there alternate ways to do this?

Answer

akrun · Jul 10, 2016

We can use sub to remove the first | and everything after it:

sub("\\|.*", "", str1)
#[1] "ABC"

Or with strsplit, splitting on | and keeping the first piece:

strsplit(str1, "[|]")[[1]][1]
#[1] "ABC"
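
The strsplit call above indexes only the first element of its result, so it handles a single string. A minimal sketch of applying the same idea to several strings at once, assuming the illustrative vector x below (not part of the original data) and using vapply to pull the first piece from each split:

```r
# Illustrative input; two of the values from the question.
x <- c("ABC|DEF|GHI", "GHI|JKLM|NO|PQRS")

# fixed = TRUE treats "|" as a literal character, so no escaping is needed.
parts <- strsplit(x, "|", fixed = TRUE)

# Take element 1 of each split result; character(1) declares the expected type.
first <- vapply(parts, `[`, character(1), 1)
first
```

vapply is used here instead of sapply so that a result of an unexpected type or length raises an error rather than passing silently.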

Update

If we use the data from @hrbrmstr

sub("\\|.*", "", df$V1)
#[1] "ABC"   "ABCD"  "ABCDE" "DEF"   "GHI"   "BCDE" 
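
The df from @hrbrmstr is not reproduced in this answer. As a sketch only, assuming it is a data frame with a single character column V1 that holds the six values from the question, it could be rebuilt like this:

```r
# Assumed reconstruction: one character column V1 with the question's values
# (the trailing commas from the listing are dropped).
df <- data.frame(
  V1 = c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "ABCDE|FGHI|JKL",
         "DEF|GHIJ|KLM", "GHI|JKLM|NO|PQRS", "BCDE|FGHI|JKL"),
  stringsAsFactors = FALSE
)

# sub() is vectorized, so every element of the column is processed in one call.
first_part <- sub("\\|.*", "", df$V1)
first_part
```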

These are all base R methods. No external packages used.
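
The question asks about substr() with a searched stop position. That also works in base R. A minimal sketch, assuming regexpr() is used to find the first |; the vector x and the names pos, stop_at, and before_pipe are illustrative:

```r
# Illustrative input; the last value has no "|" at all.
x <- c("ABC|DEF|GHI", "ABCD|EFG|HIJK", "NOPIPE")

# regexpr() gives the position of the first match, or -1 when there is none.
# fixed = TRUE matches "|" literally instead of as regex alternation.
pos <- as.integer(regexpr("|", x, fixed = TRUE))

# Stop one character before the "|"; keep the whole string if no "|" exists.
stop_at <- ifelse(pos == -1L, nchar(x), pos - 1L)

before_pipe <- substr(x, 1, stop_at)
before_pipe
```

This is closer to the Excel LEFT and SEARCH pattern, at the cost of an explicit check for strings without a |, which sub handles implicitly by leaving them unchanged.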

data

str1 <- "ABC|DEF|GHI ABCD|EFG|HIJK ABCDE|FGHI|JKL DEF|GHIJ|KLM GHI|JKLM|NO|PQRS BCDE|FGHI|JKL"