Copy in Postgres from a tab delimited file to table

parchambeau picture parchambeau · Jan 28, 2013 · Viewed 10.5k times · Source

I am working on a copy command to copy the contents of a tab delimited file with a header to a table in postgres. I have used copy before so I know how it works but I have a question regarding how I can get around some data being missing in the file.

I have premade the table to have the same column names as the values in the header.

Some of the "columns" from the file that I received are just blank. I have put fake data in just to get the command working but this is something that is going to be automated weekly. I was wondering if I am just going to have to tell the people to make sure their data is correct or if there is a way to tell the copy command to input a NULL in the place of blank data.

I thought I might have to place in blank tabs and that would act as a null but I wasn't sure if that was the best course of action.

Answer

vyegorov picture vyegorov · Jan 28, 2013

Documentation for COPY specifies, that default value for NULLs is \N. So changing it to be empty as @a_horse_with_no_name suggest is the way to go.

Another way is to preprocess the file using any other tools. I quite often do the following in my scripts:

psql dbname <<EOSQL
COPY tab FROM stdin WITH (FORMAT 'csv', DELIMITER E'\t', NULL 'NULL');
$(printf "1\t2\t3\n4\t\t6\n7\t8\t\n\t10\t11\n"| \
  sed -e 's/\(\t\|^\)\(\t\|$\)/\1NULL\2/g')
\.
EOSQL

Instead of printf you can cat your file. Use whatever manipulations required. Note: I'm on OSX, so sed syntax may vary for you.

Also, in the current PostgreSQL CommitFest there's a patch to add generic Pre- and Post- processors to the COPY command. Here you can find the whole discussion.