Bulk insert, update if on conflict (bulk upsert) on Postgres

MK Yung · Dec 29, 2015 · Viewed 27k times

I am writing a data-mining program that bulk-inserts user data.

The current SQL is just a plain bulk insert:

insert into USERS(
    id, username, profile_picture)
select unnest(array['12345']),
    unnest(array['Peter']),
    unnest(array['someURL'])
on conflict (id) do nothing;

How do I update the existing row on conflict instead? I tried:

...
    unnest(array['Peter']) as a,
    unnest(array['someURL']) as b
on conflict (id) do 
update set
    username = a,
    profile_picture = b;

But it fails with the error: There is a column named "a" in table "*SELECT*", but it cannot be referenced from this part of the query.

EDIT:

The USERS table is very simple:

create table USERS (
    id      text not null primary key,
    username    text,
    profile_picture text
);

Answer

MK Yung · Dec 30, 2015

It turns out that a special table named excluded holds the row proposed for insertion (a strange name, though):

insert into USERS(
    id, username, profile_picture)
select unnest(array['12345']),
    unnest(array['Peter']),
    unnest(array['someURL'])
on conflict (id) do 
update set
    username = excluded.username,
    profile_picture = excluded.profile_picture;

http://www.postgresql.org/docs/9.5/static/sql-insert.html#SQL-ON-CONFLICT

The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the existing row using the table's name (or an alias), and to rows proposed for insertion using the special excluded table...
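
As a rough sketch of how this extends to more than one row, and of the difference between the table's name and excluded described in the quote: the statement below assumes the USERS table from the question, and the sample values are made up. It uses the multi-argument form of unnest in the FROM clause (available since PostgreSQL 9.4) so that the arrays are expanded in parallel. The WHERE clause is optional and added here only for illustration; it skips the update when the existing row already has the proposed values.

```sql
-- Sketch only: the sample values below are invented for illustration.
insert into USERS (id, username, profile_picture)
select t.id, t.username, t.profile_picture
from unnest(
    array['12345', '67890'],        -- ids
    array['Peter', 'Mary'],         -- usernames
    array['someURL', 'otherURL']    -- profile pictures
) as t(id, username, profile_picture)
on conflict (id) do update set
    username        = excluded.username,          -- value proposed for insertion
    profile_picture = excluded.profile_picture
-- USERS refers to the existing row, excluded to the row proposed for insertion.
where (USERS.username, USERS.profile_picture)
      is distinct from (excluded.username, excluded.profile_picture);
```

Each id should appear only once per statement: PostgreSQL raises an error if a single ON CONFLICT DO UPDATE command would affect the same row a second time, so duplicate ids need to be removed from the input arrays first.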