I need to rbind two large data frames. Right now I use
df <- rbind(df, df.extension)
but I (almost) instantly run out of memory. I guess it's because df is held in memory twice. I might see even bigger data frames in the future, so I need some kind of in-place rbind.
So my question is: Is there a way to avoid data duplication in memory when using rbind?
I found this question, which uses SQLite, but I really want to avoid using the hard drive as a cache.
data.table is your friend!
Cf. http://www.mail-archive.com/[email protected]/msg175877.html
Following up on nikola's comment, here is ?rbindlist's description (new in v1.8.2):
"Same as do.call("rbind",l), but much faster."
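A minimal sketch of how that might look for the case in the question, assuming data.table v1.8.2 or later is installed. The two small data frames are made-up stand-ins for df and df.extension. Note that rbindlist still builds a new combined table rather than appending in place, so the inputs are removed afterwards to let R reclaim their memory.

```r
library(data.table)

# Hypothetical stand-ins for the question's df and df.extension
df <- data.frame(id = 1:3, value = c(0.1, 0.2, 0.3))
df.extension <- data.frame(id = 4:5, value = c(0.4, 0.5))

# rbindlist takes a list of data.frames / data.tables and
# returns a single data.table with the rows stacked in order
combined <- rbindlist(list(df, df.extension))

# Drop the inputs so the garbage collector can free them
rm(df, df.extension)
invisible(gc())

print(dim(combined))
```

Whether this is enough depends on how close to the memory limit the data already is: peak usage still covers both inputs plus the result for a moment, but it avoids the extra intermediate copies that rbind on data frames tends to make.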