I have a data frame ("data") with lots and lots of columns. Some of the column names contain a certain string ("search_string").
How can I use dplyr::select() to give me a subset including only the columns that contain the string?
I tried:
# columns as boolean vector
select(data, grepl("search_string",colnames(data)))
# columns as vector of column names
select(data, colnames(data)[grepl("search_string",colnames(data))])
Neither of them works.
I know that select() accepts numeric vectors as a substitute for columns, e.g.:
select(data,5,7,9:20)
But I don't know how to get a numeric vector of column IDs from my grepl() expression.
Within the dplyr world, try:
select(iris,contains("Sepal"))
See the Selection section in ?select for numerous other helpers like starts_with, ends_with, etc.
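If the search string is really a regular expression (as with grepl()), or you do want the numeric column positions the question asks about, something along these lines should work. This is a minimal sketch on the built-in iris data, assuming a reasonably recent dplyr where all_of() is available; the variable names are only for illustration.

```r
library(dplyr)

# contains() matches a literal substring of the column name
sepal_literal <- select(iris, contains("Sepal"))

# matches() takes a regular expression, much like grepl()
sepal_regex <- select(iris, matches("^Sepal"))

# grep() returns the positions of the matching names;
# which(grepl(...)) gives the same integer vector
sepal_idx <- grep("Sepal", colnames(iris))

# all_of() passes an external vector of positions (or names) to select()
sepal_by_position <- select(iris, all_of(sepal_idx))
```

All three calls should select the same two Sepal columns here. Note that contains() and matches() ignore case by default, unlike grepl().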