NOT EXISTS clause in Postgresql

cheng · Jun 28, 2012 · Viewed 41.6k times

Does anyone know how to perform this kind of query in PostgreSQL?

SELECT * 
FROM tabA 
WHERE NOT EXISTS (
    SELECT * 
    FROM tabB 
    WHERE tabB.id = tabA.id
)

When I execute this query, PostgreSQL complains: "ERROR: Greenplum Database does not yet support that query."

EDIT: And how about this one:

SELECT * 
FROM tabA 
WHERE NOT EXISTS (
    SELECT * 
    FROM tabB WHERE tabB.id = tabA.id AND tabB.id2 = tabA.id2
)

EDIT:
I tested the 4 approaches provided by @ypercube in PostgreSQL 8.2.15. My conclusions:

1) The first does not work in this version of PostgreSQL, as I said above in the question. The error message is quoted there too.

2) For the other three, ranked from fastest to slowest: (3) LEFT JOIN > (4) EXCEPT >> (2) NOT IN.
Specifically, for equivalent queries, (3) LEFT JOIN takes about 5580 ms, (4) EXCEPT takes about 13502 ms, and (2) NOT IN takes more than 100000 ms (in fact, I did not wait until it finished).
Is there any particular reason for the NOT IN clause to be so slow?
Cheng

Answer

ypercubeᵀᴹ · Jun 28, 2012

There are 3 (main) ways to do this kind of query:

  1. NOT EXISTS correlated subquery

  2. NOT IN subquery

  3. LEFT JOIN with IS NULL check

You found that the first way does not work in Greenplum. @Marco and @juergen provided the 2nd way. Here's the 3rd one, which may bypass Greenplum's limitations:

SELECT tabA.* 
FROM 
    tabA 
  LEFT JOIN 
    tabB 
      ON  tabB.id = tabA.id 
      AND tabB.id2 = tabA.id2
WHERE tabB.id IS NULL ;
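
To see which rows the LEFT JOIN / IS NULL form keeps, here is a minimal sketch that runs the query above against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a convenient stand-in here, not Greenplum or Postgres, and the table contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not Greenplum/Postgres).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tabA (id INTEGER, id2 INTEGER);
    CREATE TABLE tabB (id INTEGER, id2 INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO tabA VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tabB VALUES (1, 10), (2, 99);
""")

# Anti-join: keep the tabA rows for which the LEFT JOIN found no partner in tabB.
anti_join = """
    SELECT tabA.*
    FROM tabA
    LEFT JOIN tabB
      ON  tabB.id  = tabA.id
      AND tabB.id2 = tabA.id2
    WHERE tabB.id IS NULL
    ORDER BY tabA.id
"""
rows = conn.execute(anti_join).fetchall()
print(rows)  # [(2, 20), (3, 30)]
```

Row (2, 20) is kept even though tabB contains id = 2, because the join requires both id and id2 to match.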

This (4th way) also works in Postgres (which supports the EXCEPT operator):

SELECT a.*
FROM a
WHERE id IN
      ( SELECT id
        FROM a
      EXCEPT
        SELECT id
        FROM b
      ) ; 

Tested in SQL-Fiddle: all 4 work in Postgres.
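
As a rough way to reproduce that kind of check locally, the sketch below runs all four forms against an in-memory SQLite database (a stand-in, not Postgres or Greenplum) and compares the results. The tables a and b and their rows are hypothetical. The id columns are declared NOT NULL on purpose: NOT IN gives a different answer from the other forms when the subquery can return NULL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not Postgres/Greenplum).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER NOT NULL);
    CREATE TABLE b (id INTEGER NOT NULL);
    -- Hypothetical sample rows.
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES (2), (4), (5);
""")

# The four ways of asking for "rows of a with no matching row in b".
queries = {
    "NOT EXISTS": "SELECT a.id FROM a WHERE NOT EXISTS "
                  "(SELECT 1 FROM b WHERE b.id = a.id)",
    "NOT IN":     "SELECT a.id FROM a WHERE a.id NOT IN (SELECT id FROM b)",
    "LEFT JOIN":  "SELECT a.id FROM a LEFT JOIN b ON b.id = a.id "
                  "WHERE b.id IS NULL",
    "EXCEPT":     "SELECT a.id FROM a WHERE id IN "
                  "(SELECT id FROM a EXCEPT SELECT id FROM b)",
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

On this sample data every form returns the ids 1 and 3. This says nothing about relative speed, which depends on the database engine and its planner.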