Cumulative sum of values by month, filling in for missing months

Goku picture Goku · Sep 10, 2015 · Viewed 11.8k times · Source

I have this data table and I'm wondering if is possible create a query that get a cumulative sum by month considering all months until the current month.

date_added                    | qty
------------------------------------
2015-08-04 22:28:24.633784-03 | 1
2015-05-20 20:22:29.458541-03 | 1
2015-04-08 14:16:09.844229-03 | 1
2015-04-07 23:10:42.325081-03 | 1
2015-07-06 18:50:30.164932-03 | 1
2015-08-22 15:01:54.03697-03  | 1
2015-08-06 18:25:07.57763-03  | 1
2015-04-07 23:12:20.850783-03 | 1
2015-07-23 17:45:29.456034-03 | 1
2015-04-28 20:12:48.110922-03 | 1
2015-04-28 13:26:04.770365-03 | 1
2015-05-19 13:30:08.186289-03 | 1
2015-08-06 18:26:46.448608-03 | 1
2015-08-27 16:43:06.561005-03 | 1
2015-08-07 12:15:29.242067-03 | 1

I need a result like that:

Jan|0
Feb|0
Mar|0
Apr|5
May|7
Jun|7
Jul|9
Aug|15

Answer

Erwin Brandstetter picture Erwin Brandstetter · Sep 10, 2015

This is very similar to other questions, but the best query is still tricky.

Basic query to get the running sum quickly:

SELECT to_char(date_trunc('month', date_added), 'Mon YYYY') AS mon_text
     , sum(sum(qty)) OVER (ORDER BY date_trunc('month', date_added)) AS running_sum
FROM   tbl
GROUP  BY date_trunc('month', date_added)
ORDER  BY date_trunc('month', date_added);

The tricky part is to fill in for missing months:

WITH cte AS (
   SELECT date_trunc('month', date_added) AS mon, sum(qty) AS mon_sum
   FROM   tbl
   GROUP  BY 1
   )
SELECT to_char(mon, 'Mon YYYY') AS mon_text
     , sum(c.mon_sum) OVER (ORDER BY mon) AS running_sum
FROM  (SELECT min(mon) AS min_mon FROM cte) init
     , generate_series(init.min_mon, now(), interval '1 month') mon
LEFT   JOIN cte c USING (mon)
ORDER  BY mon;

The implicit CROSS JOIN LATERAL requires Postgres 9.3+. This starts with the first month in the table.
To start with a given month:

WITH cte AS (
   SELECT date_trunc('month', date_added) AS mon, sum(qty) AS mon_sum
   FROM   tbl
   GROUP  BY 1
   )
SELECT to_char(mon, 'Mon YYYY') AS mon_text
     , COALESCE(sum(c.mon_sum) OVER (ORDER BY mon), 0) AS running_sum
FROM   generate_series('2015-01-01'::date, now(), interval '1 month') mon
LEFT   JOIN cte c USING (mon)
ORDER  BY mon;

SQL Fiddle.

Keeping months from different years apart. You did not ask for that, but you'll most likely want it.

Note that the "month" to some degree depends on the time zone setting of the current session! Details:

Related: