Delete Duplicate rows in Vertica database

Amit Verma picture Amit Verma · Jun 19, 2013 · Viewed 9.7k times · Source

Vertica allows duplicates to be inserted into the tables. I can view those using the 'analyze_constraints' function. How to delete duplicate rows from Vertica tables?

Answer

mauro picture mauro · Dec 28, 2015

You should try to avoid/limit using DELETE with a large number of records. The following approach should be more effective:

Step 1 Create a new table with the same structure / projections as the one containing duplicates:

create table mytable_new like mytable including projections ;

Step 2 Insert into this new table de-duplicated rows:

insert /* +direct */ into mytable_new select <column list> from (
    select * , row_number() over ( partition by <pk column list> ) as rownum from <table-name>
) a where a.rownum = 1 ;

Step 3 rename the original table (the one containing dups):

alter table mytable rename to mytable_orig ;

Step 4 rename the new table:

alter table mytable_new rename to mytable ;

That's all.