Load a dataset into R with data() using a variable instead of the dataset name

pazof · Nov 11, 2013

I am trying to load a dataset into R using the data() function. It works fine when I use the dataset name (e.g. data(Titanic) or data("Titanic")). What doesn't work for me is loading a dataset using a variable instead of its name. For example:

# This works fine:
> data(Titanic)

# This works fine as well:
> data("Titanic")

# This doesn't work:
> myvar <- Titanic
> data(myvar)
Warning message:
In data(myvar) : data set ‘myvar’ not found

Why is R looking for a dataset named "myvar" since it is not quoted? And since this is the default behavior, isn't there a way to load a dataset stored in a variable?

For the record, what I am trying to do is to create a function that uses the "arules" package and mines association rules using Apriori. Thus, I need to pass the dataset as a parameter to that function.

myfun <- function(mydataset) {
    data(mydataset)    # doesn't work (data set 'mydataset' not found)
    rules <- apriori(mydataset)
}

edit - output of sessionInfo():

> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arules_1.0-14   Matrix_1.0-12   lattice_0.20-15 RPostgreSQL_0.4 DBI_0.2-7      

loaded via a namespace (and not attached):
[1] grid_3.0.0  tools_3.0.0

And the actual errors I am getting (using, for example, a sample dataset "xyz"):

xyz <- data.frame(c(1,2,3))
data(list=xyz)
Warning messages:
1: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used
2: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used
3: In if (name %in% names(rds)) { :
  the condition has length > 1 and only the first element will be used
4: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used
5: In if (name %in% names(rds)) { :
  the condition has length > 1 and only the first element will be used
6: In grep(name, files, fixed = TRUE) :
  argument 'pattern' has length > 1 and only the first element will be used

...

...

32: In data(list = xyz) :
  c("data set ‘1’ not found", "data set ‘2’ not found", "data set ‘3’ not found")

Answer

Aaron left Stack Overflow · Nov 11, 2013

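Use the list argument, which takes a character vector of dataset names. Names passed without list= are taken literally rather than evaluated, which is why data(myvar) looks for a dataset called "myvar". See ?data.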

data(list=myvar)

You'll also need myvar to be a character string holding the dataset's name, not the dataset object itself.

myvar <- "Titanic"

Note that myvar <- Titanic only worked (I think) because of the lazy loading of the Titanic data set. Most datasets in packages are loaded this way, but for other kinds of data sets, you'd still need the data command.
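
For the function in the question, one possible approach is to pass the dataset name as a string and have a small helper return the loaded object. The following is only a sketch: load_dataset is a hypothetical helper name, not part of base R or arules, and it assumes the object that data() creates has the same name as the dataset (true for Titanic, but not for every dataset file).

```r
# Hypothetical helper (not part of any package): load a dataset by name
# and return the object, without touching the global environment.
load_dataset <- function(name, package = NULL) {
  stopifnot(is.character(name), length(name) == 1L)

  # Load into a private environment so nothing leaks into .GlobalEnv.
  env <- new.env()
  data(list = name, package = package, envir = env)

  # Assumption: the loaded object is named after the dataset.
  if (!exists(name, envir = env, inherits = FALSE)) {
    stop("data set '", name, "' not found")
  }
  get(name, envir = env, inherits = FALSE)
}

myvar <- "Titanic"              # a character string, not the object
titanic <- load_dataset(myvar)
dim(titanic)
```

Inside a function like myfun, the returned object (rather than the name) would then be what gets passed on to apriori(). Passing a data frame such as xyz to list= fails because data() treats each element as a dataset name to look up.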