Memory efficient alternative to rbind - in-place rbind?

Sebastian · Aug 17, 2011 · Viewed 13k times

I need to rbind two large data frames. Right now I use

df <- rbind(df, df.extension)

but I (almost) instantly run out of memory. I guess it's because df is held in memory twice. I might see even bigger data frames in the future, so I need some kind of in-place rbind.

So my question is: Is there a way to avoid data duplication in memory when using rbind?

I found this question, which uses SQLite, but I really want to avoid using the hard drive as a cache.

Answer

Ari B. Friedman · Aug 18, 2012

data.table is your friend!

C.f. http://www.mail-archive.com/[email protected]/msg175877.html


Following up on nikola's comment, here is ?rbindlist's description (new in v1.8.2):

Same as do.call("rbind",l), but much faster.
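For what it's worth, here is a minimal sketch of how that might look for the two data frames in the question. The column names and values are made up for illustration; only rbindlist itself comes from data.table. Note that, as far as I know, rbindlist still allocates one new result, so it is not strictly in-place, but it skips the intermediate copies that rbind on data frames tends to make.

```r
library(data.table)

# Hypothetical stand-ins for df and df.extension from the question
df <- data.frame(id = 1:3, value = c(0.1, 0.2, 0.3))
df.extension <- data.frame(id = 4:5, value = c(0.4, 0.5))

# rbindlist takes a list of data.frames / data.tables
# and returns a single data.table with the rows stacked
combined <- rbindlist(list(df, df.extension))

# Drop the originals so their memory can be reclaimed by the garbage collector
rm(df, df.extension)
```

If you have many pieces, collecting them in a list and calling rbindlist once at the end is probably kinder to memory than growing df step by step.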