Why is using assign bad?

asb picture asb · Jul 10, 2013 · Viewed 9.2k times · Source

This post (Lazy evaluation in R – is assign affected?) covers some common ground but I am not sure it answers my question.

I stopped using assign when I discovered the apply family quite a while back, albeit, purely for reasons of elegance in situations such as this:

names.foo <- letters
values.foo <- LETTERS
for (i in 1:length(names.foo))
  assign(names.foo[i], paste("This is: ", values.foo[i]))

which can be replaced by:

foo <- lapply(X=values.foo, FUN=function (k) paste("This is :", k))
names(foo) <- names.foo

This is also the reason this (http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f) R-faq says this should be avoided.

Now, I know that assign is generally frowned upon. But are there other reasons I don't know? I suspect it may mess with the scoping or lazy evaluation but I am not sure? Example code that demonstrates such problems will be great.

Answer

IRTFM picture IRTFM · Jul 10, 2013

Actually those two operations are quite different. The first gives you 26 different objects while the second gives you only one. The second object will be a lot easier to use in analyses. So I guess I would say you have already demonstrated the major downside of assign, namely the necessity of then needing always to use get for corralling or gathering up all the similarly named individual objects that are now "loose" in the global environment. Try imagining how you would serially do anything with those 26 separate objects. A simple lapply(foo, func) will suffice for the second strategy.

That FAQ citation really only says that using assignment and then assigning names is easier, but did not imply it was "bad". I happen to read it as "less functional" since you are not actually returning a value that gets assigned. The effect looks to be a side-effect (and in this case the assign strategy results in 26 separate side-effects). The use of assign seems to be adopted by people that are coming from languages that have global variables as a way of avoiding picking up the "True R Way", i.e. functional programming with data-objects. They really should be learning to use lists rather than littering their workspace with individually-named items.

There is another assignment paradigm that can be used:

 foo <- setNames(  paste0(letters,1:26),  LETTERS)

That creates a named atomic vector rather than a named list, but the access to values in the vector is still done with names given to [.