Postgres generate_series

yokoloko · Aug 21, 2011

I want to compute statistics on a table, and for this I'm using generate_series().

Here is what I'm doing:

SELECT x.month, amount
FROM (SELECT generate_series(
                 min(date_trunc('month', date)),
                 max(date_trunc('month', date)),
                 '1 month'
      ) AS month
      FROM table
      WHERE user_id = 55 AND ...
) x
LEFT JOIN (
      SELECT SUM(amount) AS amount, date_trunc('month', date) AS month
      FROM table
      WHERE user_id = 55 AND ...
      GROUP BY month
) q ON q.month = x.month
ORDER BY month

This works well, but when I want to apply filters, such as getting the amount for specific users, I have to apply them twice. Is there a way to avoid filtering twice, or to rewrite this more efficiently? I'm not sure this is the right way to do it.

Answer

Grzegorz Szpetkowski · Aug 21, 2011

You could write a WITH query (a common table expression) for this, so that the filter is written only once:

WITH month_amount AS
(
    SELECT
        sum(amount) AS amount,
        date_trunc('month', date) AS month
    FROM Amount
    WHERE user_id = 55 -- AND ...
    GROUP BY month
)
SELECT month, amount
FROM
    (SELECT generate_series(min(month), max(month), '1 month') AS month
    FROM month_amount) x
LEFT JOIN month_amount
USING (month)
ORDER BY month;

Example result:

SELECT * FROM amount WHERE user_id = 55;
 amount_id | user_id | amount |    date    
-----------+---------+--------+------------
         3 |      55 |      7 | 2011-03-16
         4 |      55 |      5 | 2011-03-22
         5 |      55 |      2 | 2011-05-07
         6 |      55 |     18 | 2011-05-27
         7 |      55 |      4 | 2011-06-14
(5 rows)

WITH month_amount ..
         month          | amount 
------------------------+--------
 2011-03-01 00:00:00+01 |     12
 2011-04-01 00:00:00+02 |       
 2011-05-01 00:00:00+02 |     20
 2011-06-01 00:00:00+02 |      4
(4 rows)
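
In the result above, April 2011 has an empty amount because the LEFT JOIN finds no matching row and returns NULL. If you would rather report such months as 0, one possible variation is to wrap the column in coalesce(). This is only a sketch that assumes the same amount table and columns as the example data; the rest of the query is unchanged.

```sql
-- Sketch: same WITH query, but months without rows are reported as 0.
-- Assumes the example table "amount" with columns user_id, amount, date.
WITH month_amount AS
(
    SELECT
        sum(amount) AS amount,
        date_trunc('month', date) AS month
    FROM amount
    WHERE user_id = 55 -- AND ...
    GROUP BY month
)
SELECT
    month,
    coalesce(amount, 0) AS amount  -- NULL from the LEFT JOIN becomes 0
FROM
    (SELECT generate_series(min(month), max(month), '1 month') AS month
    FROM month_amount) x
LEFT JOIN month_amount
USING (month)
ORDER BY month;
```

With the sample rows shown earlier, the April 2011 row would then show 0 instead of an empty value; the other months are unaffected.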