MySQL: How to use a self join

hank99 · May 22, 2013 · Viewed 30k times

I need to use a self join on this table.

+------------+------+--------+
| Country    | Rank |  Year  |
+------------+------+--------+
|France      |  55  |  2000  |
+------------+------+--------+
|Canada      |  30  |  2000  |
+------------+------+--------+ 
|Liberia     |  59  |  2001  |
+------------+------+--------+ 
|Turkey      |  78  |  2000  |
+------------+------+--------+ 
|Japan       |  65  |  2003  |
+------------+------+--------+
|Romania     |  107 |  2001  |
+------------+------+--------+

I need to use a self join to find which countries have the same year as Turkey, displaying only the Country and Year.

This is what I am trying to do.

SELECT DISTINCT a.Country, a.Year 
FROM table1 AS a, table1 AS b 
WHERE a.Year=b.Year and a.Country='Turkey';

(I googled "self join" and came up with the query above.)

I am getting only Turkey. What am I doing wrong?

Answer

xQbert · May 22, 2013

You're so close!

Since you're displaying the country and year from A and also limiting A.Country to Turkey, Turkey is all you're going to see. You need to either change the select list to B.Country and B.Year, or change the WHERE clause to filter on B.Country.

This version uses the comma syntax, which is an implicit cross join filtered by the WHERE clause. A cross join gets slower the more records there are in the table.

SELECT DISTINCT b.Country, b.Year 
FROM table1 AS a, 
     table1 AS b 
WHERE a.Year=b.Year 
  and a.Country='Turkey';

This could also be written with an explicit CROSS JOIN, and would likely have the same execution plan:

SELECT DISTINCT b.Country, b.Year 
FROM table1 AS a 
CROSS JOIN table1 AS b 
WHERE a.Year=b.Year 
  and a.Country='Turkey';
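
To get a feel for how much a cross join pairs up, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for MySQL (the SQL itself is the same), reuses the table name table1 from the question, and quotes "Rank" and "Year" only to be safe with keywords. It counts the unfiltered pairs, counts the pairs with matching years, and checks that the comma form and the explicit CROSS JOIN form return the same rows.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (Country TEXT, "Rank" INTEGER, "Year" INTEGER)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("France", 55, 2000),
        ("Canada", 30, 2000),
        ("Liberia", 59, 2001),
        ("Turkey", 78, 2000),
        ("Japan", 65, 2003),
        ("Romania", 107, 2001),
    ],
)

# Unfiltered cross join: every row is paired with every row (n * n pairs).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM table1 AS a CROSS JOIN table1 AS b"
).fetchone()[0]

# The same pairing, keeping only pairs whose years match.
matched_count = conn.execute(
    "SELECT COUNT(*) FROM table1 AS a CROSS JOIN table1 AS b "
    "WHERE a.Year = b.Year"
).fetchone()[0]

# Comma (implicit) form of the corrected query.
comma = conn.execute(
    """
    SELECT DISTINCT b.Country, b.Year
    FROM table1 AS a, table1 AS b
    WHERE a.Year = b.Year AND a.Country = 'Turkey'
    """
).fetchall()

# Explicit CROSS JOIN form of the same query.
explicit = conn.execute(
    """
    SELECT DISTINCT b.Country, b.Year
    FROM table1 AS a
    CROSS JOIN table1 AS b
    WHERE a.Year = b.Year AND a.Country = 'Turkey'
    """
).fetchall()

print(cross_count, matched_count)
print(sorted(comma) == sorted(explicit))
```

With the six rows from the question this prints 36 14 and then True: 36 possible pairs, of which only 14 share a year. The gap widens quickly as the table grows.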

Alternatively, use an INNER JOIN. It states the join condition explicitly in the ON clause, which limits the work the engine must do and avoids the performance degradation an unfiltered cross join would suffer.

SELECT DISTINCT a.Country, a.Year 
FROM table1 AS a 
INNER JOIN table1 AS b 
   on a.Year=b.Year 
  and b.Country='Turkey';
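
If you want to compare the original query with the INNER JOIN version side by side, something like the following sketch should do. It assumes Python's built-in sqlite3 as a stand-in for MySQL and the table name table1 from the question; the variable names are just for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (Country TEXT, "Rank" INTEGER, "Year" INTEGER)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("France", 55, 2000),
        ("Canada", 30, 2000),
        ("Liberia", 59, 2001),
        ("Turkey", 78, 2000),
        ("Japan", 65, 2003),
        ("Romania", 107, 2001),
    ],
)

# Query from the question: selects from a and also filters a.
original = conn.execute(
    """
    SELECT DISTINCT a.Country, a.Year
    FROM table1 AS a, table1 AS b
    WHERE a.Year = b.Year AND a.Country = 'Turkey'
    """
).fetchall()

# INNER JOIN version: selects from a but filters b.
fixed = conn.execute(
    """
    SELECT DISTINCT a.Country, a.Year
    FROM table1 AS a
    INNER JOIN table1 AS b
       ON a.Year = b.Year
      AND b.Country = 'Turkey'
    """
).fetchall()

print(original)
print(sorted(fixed))
```

The first line printed is [('Turkey', 2000)]; the second is [('Canada', 2000), ('France', 2000), ('Turkey', 2000)].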

WHY:

Consider what the SQL engine produces when A is joined to B on Year (only the rows for year 2000 are shown):

+------------+------+--------+------------+------+--------+
| A.Country  |A.Rank| A.Year | B.Country  |B.Rank| B.Year |
+------------+------+--------+------------+------+--------+
|France      |  55  |  2000  |France      |  55  |  2000  |
+------------+------+--------+------------+------+--------+
|Canada      |  30  |  2000  |France      |  55  |  2000  |
+------------+------+--------+------------+------+--------+ 
|Turkey      |  78  |  2000  |France      |  55  |  2000  |
+------------+------+--------+------------+------+--------+ 
|France      |  55  |  2000  |Canada      |  30  |  2000  |
+------------+------+--------+------------+------+--------+
|Canada      |  30  |  2000  |Canada      |  30  |  2000  |
+------------+------+--------+------------+------+--------+ 
|Turkey      |  78  |  2000  |Canada      |  30  |  2000  |
+------------+------+--------+------------+------+--------+ 
|France      |  55  |  2000  |Turkey      |  78  |  2000  |
+------------+------+--------+------------+------+--------+
|Canada      |  30  |  2000  |Turkey      |  78  |  2000  |
+------------+------+--------+------------+------+--------+ 
|Turkey      |  78  |  2000  |Turkey      |  78  |  2000  |
+------------+------+--------+------------+------+--------+ 

So when you display A.Country and A.Year where A.Country is Turkey, you can see that all it can return is Turkey (and, because of DISTINCT, only one record).

But if you filter on B.Country = 'Turkey' and display A.Country, you'll get France, Canada, and Turkey!
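
The same reasoning can be replayed on the joined pairs directly. This is a rough sketch, again assuming Python's built-in sqlite3 in place of MySQL and the table name table1 from the question: it fetches every (A.Country, B.Country) pair that shares a year, then applies the Turkey filter on each side in turn.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (Country TEXT, "Rank" INTEGER, "Year" INTEGER)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("France", 55, 2000),
        ("Canada", 30, 2000),
        ("Liberia", 59, 2001),
        ("Turkey", 78, 2000),
        ("Japan", 65, 2003),
        ("Romania", 107, 2001),
    ],
)

# Every (A.Country, B.Country) pair whose rows share the same year.
pairs = conn.execute(
    """
    SELECT a.Country, b.Country
    FROM table1 AS a
    INNER JOIN table1 AS b ON a.Year = b.Year
    """
).fetchall()

# Filter on the A side, display A: only Turkey can survive.
a_side = sorted({a for a, b in pairs if a == "Turkey"})

# Filter on the B side, display A: every country sharing Turkey's year.
b_side = sorted({a for a, b in pairs if b == "Turkey"})

print(a_side)
print(b_side)
```

This prints ['Turkey'] and then ['Canada', 'France', 'Turkey'], matching the table above.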