Let's assume that we have a data frame x which contains the columns job and income. Referring to the data in the frame normally requires the commands x$job for the data in the job column and x$income for the data in the income column.
However, the command attach(x) makes it possible to do away with the name of the data frame and the $ symbol when referring to the same data. Consequently, x$job becomes job and x$income becomes income in the R code.
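To make the difference concrete, here is a minimal sketch. The values in x are made up for illustration; only the column names job and income come from the question.

```r
# Hypothetical example data; the values are invented for illustration.
x <- data.frame(
  job    = c("clerk", "engineer", "teacher"),
  income = c(30000, 55000, 40000)
)

# Without attach(): columns are reached through the data frame.
mean_explicit <- mean(x$income)

# With attach(): the columns of x become visible on the search path.
attach(x)
mean_attached <- mean(income)
detach(x)  # take x off the search path again

# Both routes read the same data.
identical(mean_explicit, mean_attached)
```

Note that the attached columns stay visible until detach(x) is called, which is easy to forget in a longer script.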
The problem is that many experts in R advise NOT to use the attach() command when coding in R.
What is the main reason for that? What should be used instead?
When to use it:
I use attach() when I want the environment you get in most stats packages (e.g. Stata, SPSS) of working with one rectangular dataset at a time.
When not to use it:
However, it gets very messy, and code quickly becomes unreadable, when you have several different datasets. This is particularly true if you are in effect using R as a crude relational database, where different rectangles of data, all relevant to the problem at hand and perhaps being matched against each other in various ways, have variables with the same name.
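A small sketch of the name clash described above, assuming two hypothetical data frames (staff and clients, both invented here) that happen to share a column name:

```r
# Two hypothetical data frames that share the column names "id" and "income".
staff   <- data.frame(id = 1:3, income = c(30000, 55000, 40000))
clients <- data.frame(id = 1:3, income = c(900, 1200, 700))

attach(staff)
attach(clients)  # R prints a message that objects from staff are now masked

# A bare `income` resolves to the data frame that was attached last.
income_seen <- income

detach(clients)
detach(staff)

# TRUE: the staff incomes were silently hidden behind the clients' column.
identical(income_seen, clients$income)
```

Nothing in the line that uses income says which dataset it refers to; the answer depends on the order of earlier attach() calls.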
The with() function, or the data= argument that many functions accept, is an excellent alternative in many cases where attach() is tempting.
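As a rough illustration of both alternatives, again with made-up values for x (aggregate() is used here only as one example of a base R function that takes data=):

```r
# Hypothetical example data; the values are invented for illustration.
x <- data.frame(
  job    = c("clerk", "engineer", "teacher", "clerk"),
  income = c(30000, 55000, 40000, 32000)
)

# with() evaluates the expression using the columns of x,
# without adding anything to the search path.
mean_income <- with(x, mean(income))

# Many functions take a data= argument, so the formula can use
# bare column names while the source data frame stays explicit.
by_job <- aggregate(income ~ job, data = x, FUN = mean)
```

In both calls the data frame is named right where its columns are used, so the code stays unambiguous even when several datasets share variable names.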