Add Columns to an empty data frame in R

Michal · Oct 31, 2014 · Viewed 25k times

I have searched extensively but not found an answer to this question on Stack Overflow.

Let's say I have a data frame a.

I define:

a <- NULL
a <- as.data.frame(a)

If I then try to add a column to this data frame like so:

a$col1 <- c(1,2,3)

I get the following error:

Error in `$<-.data.frame`(`*tmp*`, "col1", value = c(1, 2, 3)) : 
    replacement has 3 rows, data has 0

Why is the row dimension fixed when the column dimension is not?

How do I change the number of rows in a data frame?

If I do this (inputting the data into a list first and then converting to a df), it works fine:

a <- NULL
a$col1 <- c(1,2,3)
a <- as.data.frame(a)

Answer

ctbrown · Nov 1, 2014

The row dimension is not fixed. A data.frame is stored as a list of vectors that are constrained to have the same length. You cannot add col1 to a because col1 has three values (rows) and a has zero rows, which would break that constraint. R does not auto-vivify values by default when you try to extend a data.frame by adding a column that is longer than the data.frame itself. Your second example works because a is still a plain list when col1 is assigned, so col1 is the only vector present when as.data.frame() is called, and the data.frame is initialized with three rows.
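As a minimal sketch of the constraint described above, the following reproduces the failing assignment and shows two ways to end up with three rows. The names b and d and the extra column col2 are illustrative and not from the question. The data.frame(row.names = 1:3) idiom is just one way to create a zero-column data.frame that already has rows, not the only one.

```r
# An empty data frame has zero columns and zero rows.
a <- data.frame()
dim(a)

# Assigning a length-3 column fails because 3 != 0 rows.
res <- tryCatch({
  a$col1 <- c(1, 2, 3)
  "ok"
}, error = function(e) "error")

# Option 1: build the data frame from the column directly.
b <- data.frame(col1 = c(1, 2, 3))

# Option 2 (illustrative): start with zero columns but three rows,
# then add columns of length 3 one at a time.
d <- data.frame(row.names = 1:3)
d$col1 <- c(1, 2, 3)
d$col2 <- c("x", "y", "z")
dim(d)
```

In both options the number of rows is settled before or at the moment the first column arrives, so the equal-length constraint is never violated.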

If you want to automatically have the data.frame expand, you can use the following function:

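cbind.all <- function(...) {
    # Coerce every argument to a matrix so nrow() and ncol() are defined.
    nm <- lapply(list(...), as.matrix)
    # Row count of the longest input.
    n <- max(sapply(nm, nrow))
    # Pad each shorter input with rows of NA, then bind column-wise.
    do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}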

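This pads the shorter inputs with NA. Note that the result is a matrix rather than a data.frame, so wrap it in as.data.frame() if you need a data.frame back. You would use it like: cbind.all(df, a)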
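A short usage sketch, assuming two small data frames named short and long (hypothetical names chosen for illustration). The function is repeated here so the snippet runs on its own.

```r
# Same function as above, repeated so this snippet is self-contained.
cbind.all <- function(...) {
  nm <- lapply(list(...), as.matrix)
  n <- max(sapply(nm, nrow))
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}

# Illustrative inputs with different numbers of rows.
short <- data.frame(x = c(1, 2))
long  <- data.frame(y = c(10, 20, 30))

# The shorter input is padded with NA in its third row.
m <- cbind.all(short, long)

# Convert the resulting matrix back to a data frame if needed.
out <- as.data.frame(m)
```

Because every input goes through as.matrix(), columns of mixed types are coerced to a common type, so this approach is best suited to inputs that are all numeric or all character.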