Find duplicate rows with PostgreSQL

schlubbi picture schlubbi · Jan 23, 2013 · Viewed 60.7k times · Source

We have a table of photos with the following columns:

id, merchant_id, url 

this table contains duplicate values for the combination merchant_id, url. so it's possible that one row appears more several times.

234 some_merchant  http://www.some-image-url.com/abscde1213
235 some_merchant  http://www.some-image-url.com/abscde1213
236 some_merchant  http://www.some-image-url.com/abscde1213

What is the best way to delete those duplications? (I use PostgreSQL 9.2 and Rails 3.)

Answer

MatthewJ picture MatthewJ · Jan 23, 2013

Here is my take on it.

select * from (
  SELECT id,
  ROW_NUMBER() OVER(PARTITION BY merchant_Id, url ORDER BY id asc) AS Row
  FROM Photos
) dups
where 
dups.Row > 1

Feel free to play with the order by to tailor the records you want to delete to your specification.

SQL Fiddle => http://sqlfiddle.com/#!15/d6941/1/0


SQL Fiddle for Postgres 9.2 is no longer supported; updating SQL Fiddle to postgres 9.3