reshape vs. reshape2 in R

Alex · Sep 11, 2012

I am trying to understand why development shifted from the reshape package to reshape2. The two seem functionally the same, but I cannot upgrade to reshape2 at the moment because the server runs an older version of R. I am concerned that a major bug may have moved development to a whole new package rather than simply continuing with reshape. Does anyone know of a major flaw in the reshape package?

Answer

Matt Parker · Sep 12, 2012

reshape2 let Hadley make a rebooted reshape that was way, way faster, while avoiding busting up people's dependencies and habits.

From Hadley's announcement of the package: https://stat.ethz.ch/pipermail/r-packages/2010/001169.html

Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focussed and much much faster.

This version improves speed at the cost of functionality, so I have renamed it to reshape2 to avoid causing problems for existing users. Based on user feedback I may reintroduce some of these features.

What's new in reshape2:

  • considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data.

  • cast is replaced by two functions depending on the output type: dcast produces data frames, and acast produces matrices/arrays.

  • multidimensional margins are now possible: grand_row and grand_col have been dropped: now the name of the margin refers to the variable that has its value set to (all).

  • some features have been removed such as the | cast operator, and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr.

  • a new cast syntax which allows you to reshape based on functions
    of variables (based on the same underlying syntax as plyr):

  • better development practices like namespaces and tests.
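
To make the dcast/acast split and the new margin naming concrete, here is a minimal sketch. It assumes reshape2 is installed, and the `scores` data frame and its column names are made up for illustration; they are not from the announcement.

```r
# Assumes reshape2 is installed: install.packages("reshape2")
library(reshape2)

# Hypothetical toy data, invented for this example
scores <- data.frame(
  student = c("a", "a", "b", "b"),
  term    = c("t1", "t2", "t1", "t2"),
  score   = c(1, 2, 3, 4)
)

# melt() to long form: the id columns plus "variable" and "value"
long <- melt(scores, id.vars = c("student", "term"))

# dcast() returns a data frame with one column per term
wide_df <- dcast(long, student ~ term, value.var = "value")

# acast() returns a matrix, with students as rownames and terms as colnames
wide_mat <- acast(long, student ~ term, value.var = "value")

# margins = TRUE adds totals; the margin row and column are labelled "(all)"
with_margins <- dcast(long, student ~ term, fun.aggregate = sum,
                      value.var = "value", margins = TRUE)
```

In the older reshape package all three results would come from `cast()`, so code that has to stay on reshape needs `cast()` in place of `dcast()`/`acast()`.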