Filter rows which contain a certain string

Gianluca picture Gianluca · Apr 4, 2014 · Viewed 373.9k times · Source

I have to filter a data frame using as criterion those row in which is contained the string RTB.

I'm using dplyr.

d.del <- df %>%
  group_by(TrackingPixel) %>%
  summarise(MonthDelivery = as.integer(sum(Revenue))) %>%
  arrange(desc(MonthDelivery))

I know I can use the function filter in dplyr but I don't exactly how to tell it to check for the content of a string.

In particular I want to check the content in the column TrackingPixel. If the string contains the label RTB I want to remove the row from the result.

Answer

alex23lemm picture alex23lemm · Jul 18, 2014

The answer to the question was already posted by the @latemail in the comments above. You can use regular expressions for the second and subsequent arguments of filter like this:

dplyr::filter(df, !grepl("RTB",TrackingPixel))

Since you have not provided the original data, I will add a toy example using the mtcars data set. Imagine you are only interested in cars produced by Mazda or Toyota.

mtcars$type <- rownames(mtcars)
dplyr::filter(mtcars, grepl('Toyota|Mazda', type))

   mpg cyl  disp  hp drat    wt  qsec vs am gear carb           type
1 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4      Mazda RX4
2 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  Mazda RX4 Wag
3 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1 Toyota Corolla
4 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  Toyota Corona

If you would like to do it the other way round, namely excluding Toyota and Mazda cars, the filter command looks like this:

dplyr::filter(mtcars, !grepl('Toyota|Mazda', type))