Guidance on using the WITH clause in SQL

cc young picture cc young · Jan 4, 2012 · Viewed 25k times · Source

I understand how to use the WITH clause for recursive queries (!!), but I'm having problems understanding its general use / power.

For example the following query updates one record whose id is determined by using a subquery returning the id of the first record by timestamp:

update global.prospect psp
set    status=status||'*'
where  psp.psp_id=(
           select  p2.psp_id
           from    global.prospect p2
           where   p2.status='new' or p2.status='reset'
           order   by p2.request_ts
           limit   1 )
returning psp.*;

Would this be a good candidate for using a WITH wrapper instead of the relatively ugly sub-query? If so, why?

Answer

Erwin Brandstetter picture Erwin Brandstetter · Jan 4, 2012

If there can be concurrent write access to involved tables, there are race conditions in the above following queries. Consider:


Your example can use a CTE (common table expression), but it will give you nothing a subquery couldn't do:

WITH x AS (
   SELECT  psp_id
   FROM    global.prospect
   WHERE   status IN ('new', 'reset')
   ORDER   BY request_ts
   LIMIT   1
   )
UPDATE global.prospect psp
SET    status = status || '*'
FROM   x
WHERE  psp.psp_id = x.psp_id
RETURNING psp.*;

BTW, the returned row will be the updated version.


If you wanted to insert the returned row into another table, that's where a WITH clause becomes essential:

WITH x AS (
   SELECT  psp_id
   FROM    global.prospect
   WHERE   status IN ('new', 'reset')
   ORDER   BY request_ts
   LIMIT   1
   ), y AS (
   UPDATE global.prospect psp
   SET    status = status || '*'
   FROM   x
   WHERE  psp.psp_id = x.psp_id
   RETURNING psp.*
   )
INSERT INTO z
SELECT *
FROM   y

Data modifying queries using CTE are possible with PostgreSQL 9.1 or later.
Read more in the excellent manual.