How do I avoid repeating this subquery for the IN clause?

Tomas Aschan picture Tomas Aschan · Mar 29, 2011 · Viewed 7.4k times · Source

I have an SQL script (currently running against SQLite, but it should probably work against any DB engine) that uses the same subquery twice, and since it might be fetching a lot of records (the table has a couple of million rows) I'd like to only call it once.

A shortened pseudo-version of the query looks like this:

SELECT * FROM
    ([the subquery, returns a column of ids]) AS sq
[a couple of joins, that fetches things from other tables based on the ids]
WHERE thisorthat NOT IN ([the subquery again])

I tried just using the name (sq) in various ways (with/without parenthesis, with/without naming the column of sq etc) but to no avail.

Do I really have to repeat this subquery?

Clarification: I am doing this in python and sqlite as a small demo of what can be done, but I would like my solution to scale as well as possible with as little modification as possible. In the real situation, the database will have a couple of million rows, but in my example there is just 10 rows with dummy data. Thus, code that would be well optimized on for example MySQL is absolutely good enough - it doesn't have to be optimized specifically for SQLite. But as I said, the less modification needed, the better.

Answer

Janick Bernet picture Janick Bernet · Mar 29, 2011

There is a WITH clause in standard SQL, however, I don't know if it is supported by SQLlite - though of course worth a try:

WITH mySubQuery AS
(
  [the subquery code]
)

SELECT * FROM
    mySubQuery AS sq
    [a couple of joins, that fetches things from other tables based on the ids]
WHERE thisorthat NOT IN (mySubQuery)

That said, what you do here will likely be horribly slow for any data set that is more than a few thousand rows, so I'd try to remodel it if possible - NOT IN should be avoided in general, especially if you also have a couple of joins.