Remove all text before colon

r unix replace sed awk

Elb · Sep 6, 2012 · Viewed 77.1k times

I have a file containing a certain number of lines. Each line looks like this:

TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1

I would like to remove everything before the ":" character so that only PKMYT1, which is a gene name, remains. Since I'm not an expert in regular expressions, can anyone help me do this using Unix tools (sed or awk) or in R?

Answer

Here are two ways of doing it in R:

foo <- "TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1"

# Remove everything up to and including the last ":" (the match is greedy):
gsub(".*:","",foo)

# Extract everything after the first ":" (returns a list):
regmatches(foo,gregexpr("(?<=:).*",foo,perl=TRUE))
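Since the question also asks about sed and awk, here is a minimal shell sketch of the same idea. The function names strip_sed and strip_awk are hypothetical helpers introduced only for illustration, and the sample line is the one from the question. Both variants keep whatever follows the last ":" on each line, which matches the gsub() call above when a line has a single colon.

```shell
#!/bin/sh
# Sample line copied from the question.
line='TF_list_to_test10004/Nus_k0.345_t0.1_e0.1.adj:PKMYT1'

# Hypothetical helper: sed deletes everything up to and including
# the last colon on each line (".*" is greedy).
strip_sed() { sed 's/.*://'; }

# Hypothetical helper: awk splits each line on ":" and prints the last field.
strip_awk() { awk -F: '{ print $NF }'; }

printf '%s\n' "$line" | strip_sed   # prints PKMYT1
printf '%s\n' "$line" | strip_awk   # prints PKMYT1
```

To process a whole file, the same sed or awk command can be given the file name as an argument instead of reading from a pipe. Lines without any colon pass through unchanged in both variants.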
