Risks of using setwd() in a script?

Ricardo Saporta · Dec 7, 2012

I've heard it said that it is bad practice to use setwd() in a script.

  • What are the risks/dangers associated with it?
  • What are better alternatives?

Answer

Ben Bolker · Dec 7, 2012

It's an issue of reproducible code. If you specify a directory that doesn't exist on someone else's computer, then they can't use your code. This is particularly bad with absolute file paths, and particularly bad with Windows file paths (which are absolutely impossible to replicate on a Unix system).

My preferred solution is to specify that the user should be in the relevant directory on their own system before starting to run the code. If, for your own convenience, you want to put a setwd(...) right at the top of your code, where other people can notice it and comment it out as appropriate, that's OK with me, as long as the rest of your code assumes only relative paths from that starting directory.
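As a rough sketch of that layout (the directory names, file names, and the commented-out path are all made-up placeholders, not anything from a real project), a script might look something like this:

```r
# Optional convenience line; the path is a hypothetical placeholder.
# Other users can leave it commented out and start R in the project directory.
# setwd("~/projects/my-analysis")

# Everything below uses paths relative to the starting directory.
data_dir   <- "data"     # assumed subdirectory holding input files
output_dir <- "output"   # assumed subdirectory for results

# Create the output directory if it is missing (silent if it already exists).
dir.create(output_dir, showWarnings = FALSE)

# file.path() joins components with "/", which R accepts on every platform.
input_file  <- file.path(data_dir, "measurements.csv")
output_file <- file.path(output_dir, "measurements_clean.csv")

if (file.exists(input_file)) {
  measurements <- read.csv(input_file)
  # Keep only rows without missing values and write them out.
  cleaned <- measurements[complete.cases(measurements), , drop = FALSE]
  write.csv(cleaned, output_file, row.names = FALSE)
} else {
  message("Expected to find ", input_file, " relative to ", getwd())
}
```

Because no path in the body of the script is absolute, anyone who starts R in the project directory should be able to run it unchanged.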

Yihui Xie (author of knitr) feels particularly strongly about this:

https://groups.google.com/forum/?fromgroups=#!topic/knitr/knM0VWoexT0

Whenever you want to manipulate files, they are assumed to be under the same directory of your source (e.g. Rnw documents). Then you can always use relative paths and you will never need to setwd(). Using setwd() contradicts with the principle of reproducibility, e.g. you use setwd('foo/bar/') and the directory may not exist in other people's computers. See FAQ 7: https://github.com/yihui/knitr/blob/master/FAQ.md

And from the aforementioned FAQ 7:

You'd better not do this [change working directory inside knitr code chunks]. Your working directory is always getwd() (all output files will be written here), but the code chunks are evaluated under the directory where your input document comes from. Changing working directories while running R code is a bad practice in general. See #38 for a discussion. You should also try to avoid absolute directories whenever possible (use relative directories instead), because it makes things less reproducible.

See also: https://github.com/yihui/knitr/issues/38
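If a directory change really cannot be avoided, one common way to limit the damage is to confine it to a function and restore the previous directory on exit. The following is only a sketch under that assumption; run_in_dir is a hypothetical helper name, not something from knitr or from the posts quoted above (the withr package offers with_dir() for the same purpose).

```r
# Hypothetical helper: evaluate `expr` with `dir` as the working directory,
# then restore the previous working directory, even if `expr` fails.
run_in_dir <- function(dir, expr) {
  old_wd <- setwd(dir)                  # setwd() invisibly returns the previous directory
  on.exit(setwd(old_wd), add = TRUE)    # registered before `expr` is evaluated
  force(expr)                           # `expr` is lazy, so it runs after the switch
}

before <- getwd()

# List the files in the session's temporary directory without
# leaving the caller's working directory changed afterwards.
files <- run_in_dir(tempdir(), list.files())

# The caller's working directory is the same as before the call.
unchanged <- identical(getwd(), before)
```

The change of directory is then visible only while the expression runs, so the rest of the script can keep assuming relative paths from its starting directory.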