Is it bad practice to access S4 object slots directly using @?

r s4
Pierre · Mar 28, 2012

This one is almost a philosophical question: is it bad to access and/or set slots of S4 objects directly using @?

I have always been told that it is bad practice, that users should use "accessor" S4 methods, and that developers should provide these to their users. But does anybody know the real reasoning behind this?

Here's an example using the sp package (but it could be generalised to any S4 class):

> library(sp)
> foo <- data.frame(x = runif(5), y = runif(5), bar = runif(5))
> coordinates(foo) <- ~x+y
> class(foo)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"

> str(foo)
Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 5 obs. of  1 variable:
  .. ..$ bar: num [1:5] 0.621 0.273 0.446 0.174 0.278
  ..@ coords.nrs : int [1:2] 1 2
  ..@ coords     : num [1:5, 1:2] 0.885 0.763 0.591 0.709 0.925 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "x" "y"
  ..@ bbox       : num [1:2, 1:2] 0.591 0.155 0.925 0.803
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr [1:2] "min" "max"
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slots
  .. .. ..@ projargs: chr NA

> foo@data
        bar
1 0.6213783
2 0.2725903
3 0.4458229
4 0.1743419
5 0.2779656
> foo@data <- data.frame(bar = letters[1:5], baz = runif(5))
> foo@data
  bar        baz
1   a 0.22877446
2   b 0.93206667
3   c 0.28169866
4   d 0.08616213
5   e 0.36713750

Answer

Martin Morgan · Mar 28, 2012

In another question, a Stack Overflow user asks why they can't find the end slot in a Bioconductor IRanges object; after all, there are start(), width(), and end() accessors, and start and width slots. The answer is that the way users interface with the class differs from how it is implemented. In this case, the implementation is driven by the simple observation that it is not space-efficient to store three values (start, end, width) when two (which two? up to the developer!) are sufficient.

Similar but deeper examples of divergence between interface and implementation are present in other S4 objects and in common S3 instances like the one returned by lm, where the data stored in the class is appropriate for subsequent calculation rather than tailored to represent the quantities that a particular user might be most interested in. Nothing good will come of reaching into that lm instance and changing a value, e.g., the coefficients element.

This separation of interface from implementation gives the developer a lot of freedom to provide a reasonable and constant user experience, perhaps shared with other similar classes, while implementing (and changing the implementation of) classes in ways that make programming sense.
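To make the interface/implementation split concrete, here is a minimal sketch of a ranges-like class that stores only two of the three quantities. The class name MiniRanges and the accessors rStart(), rWidth(), and rEnd() are hypothetical names invented for this illustration; this is not how IRanges itself is written.

```r
library(methods)

# Hypothetical class: only start and width are stored; end is never stored.
setClass("MiniRanges", slots = c(start = "integer", width = "integer"))

# The public interface: three accessors, even though there are two slots.
setGeneric("rStart", function(x) standardGeneric("rStart"))
setGeneric("rWidth", function(x) standardGeneric("rWidth"))
setGeneric("rEnd",   function(x) standardGeneric("rEnd"))

setMethod("rStart", "MiniRanges", function(x) x@start)
setMethod("rWidth", "MiniRanges", function(x) x@width)
# end is derived on demand from the two stored values
setMethod("rEnd", "MiniRanges", function(x) x@start + x@width - 1L)

r <- new("MiniRanges", start = c(1L, 10L), width = c(5L, 3L))

ends <- rEnd(r)                                     # works through the interface
has_end_slot <- "end" %in% slotNames("MiniRanges")  # FALSE: there is no such slot
```

Code written against rEnd() keeps working if the developer later decides to store start and end instead; code written against r@width would break.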

This doesn't really answer your question, I guess, but the developer is not expecting the user to directly access slots and the user should not expect direct slot access to be an appropriate way to interact with the class.
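The lm remark above can be checked with a few lines. This is only a sketch using the built-in cars data set (chosen here for illustration); it shows that editing the coefficients element by hand leaves the other stored pieces of the fit untouched, so the object no longer agrees with itself.

```r
fit <- lm(dist ~ speed, data = cars)

hacked <- fit
hacked$coefficients["speed"] <- 0   # reach in and change a stored value

# coef() now reports the edited slope...
slope_after <- coef(hacked)[["speed"]]

# ...but the stored fitted values were computed from the original
# coefficients and are not updated by the assignment.
fitted_unchanged <- identical(fitted(hacked), fitted(fit))
```

Nothing in the object recomputes fitted values or residuals after the edit, which is exactly the kind of silent inconsistency that a developer-supplied interface is meant to prevent.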