I have a table with the following fields:
id (Unique)
url (Unique)
title
company
site_id
Now, I need to remove rows having same title, company and site_id
. One way to do it will be using the following SQL along with a script (PHP
):
SELECT title, site_id, location, id, count( * )
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1
After running this query, I can remove duplicates using a server side script.
But, I want to know if this can be done only using SQL query.
A really easy way to do this is to add a UNIQUE
index on the 3 columns. When you write the ALTER
statement, include the IGNORE
keyword. Like so:
ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);
This will drop all the duplicate rows. As an added benefit, future INSERTs
that are duplicates will error out. As always, you may want to take a backup before running something like this...