How to compare the current row with next and previous row in PostgreSQL?

Renato Dinhani picture Renato Dinhani · Nov 2, 2011 · Viewed 84.9k times · Source

I want to know how to retrieve results in a SQL query doing some logic comparison with the next or previous rows. I'm using PostgreSQL.

Example
Supposing I have a table in my database with two attributes (ordered position and random numbers), I want to retrieve the odd numbers that are between even numbers. How can I do this?

The real usage
I want to find words that are between two another words which have the category NAME (and the word is not a name). The ordering is provided by sentence and position.

Edit I want to know if the Window function of PostgreSQL are best solution for this kind of problem than doing queries. I heard about them, but never used.

Answer

Renato Dinhani picture Renato Dinhani · Nov 2, 2011

This is my solution using WINDOW functions. I used the lag and lead functions. Both returns a value from a column from a row in offset from the current row. lag goes back and lead goes next in the offset.

SELECT tokcat.text
FROM (
    SELECT text, category, chartype, lag(category,1) OVER w as previousCategory, lead(category,1) OVER w as nextCategory
    FROM token t, textBlockHasToken tb
    WHERE tb.tokenId = t.id
    WINDOW w AS (
        PARTITION BY textBlockId, sentence
        ORDER BY textBlockId, sentence, position
    )
) tokcat
WHERE 'NAME' = ANY(previousCategory)
AND 'NAME' = ANY(nextCategory)
AND 'NAME' <> ANY(category)

Simplified version:

SELECT text
FROM (
    SELECT text
          ,category 
          ,lag(category) OVER w as previous_cat
          ,lead(category) OVER w as next_cat
    FROM   token t
    JOIN   textblockhastoken tb ON tb.tokenid = t.id
    WINDOW w AS (PARTITION BY textblockid, sentence ORDER BY position)
    ) tokcat
WHERE  category <> 'NAME'
AND    previous_cat = 'NAME'
AND    next_cat = 'NAME';

Major points

  • = ANY() is not needed, the window function returns a single value
  • some redundant fields in the subquery
  • no need to order by columns, that you PARTITION BY - the ORDER BY applies within partitions
  • Don't use mixed case identifiers without quoting, it only leads to confusion. (Better yet: don't use mixed case identifiers in PostgreSQL ever)