SQL left outer join on multiple columns

François M. picture François M. · Oct 13, 2016 · Viewed 48.5k times · Source

According to this SQL join cheat-sheet, a left outer join on one column is the following :

SELECT *
  FROM a
  LEFT JOIN b 
    ON a.foo = b.foo
  WHERE b.foo IS NULL 

I'm wondering what it would look like with a join on multiple columns, should it be an OR or an AND in the WHERE clause ?

SELECT *
  FROM a
  LEFT JOIN b 
    ON  a.foo = b.foo
    AND a.bar = b.bar
    AND a.ter = b.ter
WHERE b.foo IS NULL 
  OR  b.bar IS NULL 
  OR  b.ter IS NULL

or

SELECT *
  FROM a
  LEFT JOIN b 
    ON  a.foo = b.foo
    AND a.bar = b.bar
    AND a.ter = b.ter
WHERE b.foo IS NULL 
  AND b.bar IS NULL 
  AND b.ter IS NULL

?

(I don't think it does, but in case it matters, the db engine is Vertica's)

(I'm betting on the OR one)

Answer

Cristian Lupascu picture Cristian Lupascu · Oct 13, 2016

That depends on whether the columns are nullable, but assuming they are not, checking any of them will do:

SELECT *
  FROM a
  LEFT JOIN b 
    ON  a.foo = b.foo
    AND a.bar = b.bar
    AND a.ter = b.ter
WHERE b.foo IS NULL -- this could also be bar or ter

This is because after a successful join, all three columns will have a non-null value.

If some of these columns were nullable and you'd like to check if any one of them had a value after the join, then your first (OR) approach would be OK.