How to use column defaults with dbWriteTable(..., append=TRUE) in RPostgreSQL

Zach picture Zach · Mar 3, 2014 · Viewed 12.6k times · Source

The dbWriteTable function in RPostgreSQL seems to ignore column names and tries to push data from R to PostgreSQL as-is. This is problematic when appending to existing tables, particularly if there are columns un-specified in the R object that should be given default values.

RMySQL handles this case very gracefully by adding the column names to LOAD DATA LOCAL INFILE. How do I force RPostgreSQL to assign default values to un-specified columns in dbWriteTable when append=TRUE?

Here is an example:

CREATE TABLE test (
    column_a varchar(255) not null default 'hello',
    column_b integer not null 
    );
insert into test values (DEFAULT, 1);

Which yields the following table:

select * from test;
 column_a | column_b 
----------+----------
 hello    |        1
(1 row)

I want to insert some new data to this table from R:

require('RPostgreSQL')
driver <- PostgreSQL()
con <- dbConnect(driver, host='localhost', dbname='development')
set.seed(42)
x <- data.frame(column_b=sample(1:100, 10))
dbWriteTable(con, name='test', value=x, append=TRUE, row.names=FALSE)
dbDisconnect(con)

But I get the following error:

Error in postgresqlgetResult(new.con) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  missing data for 
column "column_b"
CONTEXT:  COPY test, line 1: "92"
)

This is because I have not specified the column_a field, so dbWriteTable is trying to write the data for column_b into column_a. I would like to force dbWriteTable to use the defaults for column_a, and properly write column_b to column_b.

I should only get a failure if:

  1. I fail to specify a column with no default value
  2. I try to insert a column that doesn't exist in the table
  3. I insert the wrong datatype into an existing column

Answer

Alex picture Alex · May 20, 2014

I had exactly the same problem, this fixed it.

Check out the dbWriteTable2 function from package caroline.

The code then allows you to write a data frame without an id column into the database using add_id = TRUE, e.g.

dbWriteTable2(con_psql,"domains",data_domains,append=TRUE,overwrite=FALSE,row.names=FALSE,add.id=TRUE)