Anyone knows how to perform such query in Postgresql?
SELECT *
FROM tabA
WHERE NOT EXISTS (
SELECT *
FROM tabB
WHERE tabB.id = tabA.id
)
When I execute such query, postgresql complains "ERROR: Greenplum Database does not yet support that query
."
EDIT: And how about this one:
SELECT *
FROM tabA
WHERE NOT EXISTS (
SELECT *
FROM tabB WHERE tabB.id = tabA.id AND tabB.id2 = tabA.id2
)
EDIT:
I tested in postgresql 8.2.15 for the 4 answers provided by @ypercube. Conclusions are:
1) The first does not work in this version of postgresql, as I said above in the question. The error message can be found there too.
2) For the other three answers, the execution speed is: (3)LEFT JOIN > (4)EXCEPT >> (2)NOT IN.
Specifically, for queries that have the same syntax, (3)LEFT JOIN takes about 5580ms, (4)EXCEPT takes about 13502ms, and (2)NOT IN takes more than 100000 (In fact I did not wait util it finished).
Is there any particular reasons for NOT IN clause to be so slow?
Cheng
There are 3 (main) ways to do this kind of query:
NOT EXISTS
correlated subquery
NOT IN
subquery
LEFT JOIN
with IS NULL
check:
You found that the first way does work in Greenplum. @Marco and @juergen provided the 2nd way. Here's the 3rd one, it may bypass Greenplum's limitations:
SELECT tabA.*
FROM
tabA
LEFT JOIN
tabB
ON tabB.id = tabA.id
AND tabB.id2 = tabA.id2
WHERE tabB.id IS NULL ;
This (4th way) also works in Postgres (which supports EXCEPT
operator):
SELECT a.*
FROM a
WHERE id IN
( SELECT id
FROM a
EXCEPT
SELECT id
FROM b
) ;
Tested in SQL-Fiddle (that all 4 work in Postgres).