Merge two tables to one and remove duplicates

angs picture angs · Jun 17, 2015 · Viewed 11.9k times · Source

I have 2 tables in the same database.

I want to merge them based on the common id column. Because the tables are too huge I am not sure if there are duplicates.

How is it possible to merge these two tables into one based on the id and be sure that there are no duplicates?

SELECT *
FROM table1,table2
JOIN
GROUP BY id

Answer

Proxima picture Proxima · Jun 17, 2015

What do you mean by merging two tables? Do you want records and columns from both the tables or columns from one and records from both? Either way you will need to change the join clause only.

You could do a join on the columns you wish to

SELECT DISTINCT * 
  FROM table1 tb1
  JOIN table2 tb2 
    ON table1.id = table2.id 

Now if you want columns from only table1 do a LEFT JOIN If you want columns from only table2 then a RIGHT JOIN If you want columns from both the tables, use the query as is. DISTINCT ensures that you get only a single row if there are multiple rows with the same data (but this distinct will check values for all columns in a row whether they are different or the same) Union won't help if both tables have different number of columns. If you don't know about joins then use a Cartesian product

select distinct * 
  from table1 tb1, table2 tb2 
 where tb1.id = tb2.id

Where id is the column that is common between the tables. Here if you want columns from only table1 do

select distinct tb1.* 

Similarly replace tb1 by tb2 in the above statement if you just want table2 columns.

select distinct tb2.*

If you want cols from both just write '*'

In either cases I.e. joins and products said above if you need selective columns just write a table alias. E.g. Consider : table1 has id, foo, bar as columns table2 has id, name,roll no, age

you want only id, foo, name from both the tables in the select query result do this:

select distinct tb1.id, tb1.foo, tb2.name 
  from table1 tb1 
  join table2 tb2 
    on tb1.id=tb2.id

Same goes for the Cartesian product query. tb1, tb2 are BTW called as a table aliases.

If you want data from both the tables even if they have nothing in common just do

select distinct * 
  from table1 , table2

Note that this cannot be achieved using a join as join requires a common column to join 'on'