Generate_series in Postgres from start and end date in a table

SiriusBits · Mar 13, 2015

I have been trying to generate a series of dates (YYYY-MM-DD HH) from the first until the last date in a timestamp field. I have the generate_series() call I need, but I am running into an issue when trying to grab the start and end dates from a table. The following gives a rough idea:

with date1 as
(
SELECT start_timestamp as first_date
FROM header_table
ORDER BY start_timestamp DESC
LIMIT 1
),
date2 as
(
SELECT start_timestamp as first_date
FROM header_table
ORDER BY start_timestamp ASC    
LIMIT 1
)
    select generate_series(date1.first_date, date2.first_date
                         , '1 hour'::interval)::timestamp as date_hour

from
(   select * from date1
    union
    select * from date2) as foo

Postgres 9.3

Answer

Erwin Brandstetter · Mar 13, 2015

You don't need a CTE for this; that would be more expensive than necessary.
And you don't need to cast to timestamp: the result already is of data type timestamp when you feed timestamp types to generate_series().
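One way to check the return type yourself is pg_typeof(). This is a minimal sketch; the literal bounds are made up for illustration and are not from the question:

```sql
-- Assumed sample bounds, only to demonstrate the result type.
SELECT pg_typeof(ts) AS result_type
FROM   generate_series(timestamp '2015-03-13 00:00'
                     , timestamp '2015-03-13 02:00'
                     , interval '1 hour') g(ts)
LIMIT  1;
```

With timestamp inputs this reports timestamp without time zone, so an extra ::timestamp cast on the result changes nothing.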

In Postgres 9.3 or later you can use a LATERAL join:

SELECT to_char(ts, 'YYYY-MM-DD HH24') AS formatted_ts
FROM  (
   SELECT min(start_timestamp) as first_date
        , max(start_timestamp) as last_date
   FROM   header_table
   ) h
  , generate_series(h.first_date, h.last_date, interval '1 hour') g(ts);

The to_char() call is optional; it returns the result as text in the format you mentioned.
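The comma before generate_series() in the query above is an implicit lateral cross join; for a function in the FROM list the LATERAL keyword is optional. If you prefer to spell it out, a sketch of the same query (without to_char(), same table and column names as in the question) could look like this:

```sql
-- Same logic as above, with the lateral join written explicitly.
SELECT g.ts
FROM  (
   SELECT min(start_timestamp) AS first_date
        , max(start_timestamp) AS last_date
   FROM   header_table
   ) h
CROSS  JOIN LATERAL generate_series(h.first_date, h.last_date, interval '1 hour') AS g(ts);
```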

This works in any Postgres version:

SELECT generate_series(min(start_timestamp)
                     , max(start_timestamp)
                     , interval '1 hour') AS ts
FROM   header_table;

This is typically a bit faster.
However, calling set-returning functions in the SELECT list is a non-standard SQL feature and frowned upon by some. Also, there were behavioral oddities (though not for this simple case) that were eventually fixed in Postgres 10.
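One well-known example of such an oddity, shown here only as an illustration with made-up integer bounds, is combining two set-returning functions with different row counts in the same SELECT list:

```sql
-- Two set-returning functions in one SELECT list, returning 2 and 3 rows.
SELECT generate_series(1, 2) AS a
     , generate_series(1, 3) AS b;
```

Before Postgres 10 this returns 6 rows (the least common multiple of the two row counts). Since Postgres 10 the functions run in lockstep and the query returns 3 rows, with a padded by NULL in the last one. A single set-returning function, as in the query above, is not affected.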

Note a subtle difference in NULL handling:

The equivalent of

max(start_timestamp)

is obtained with

ORDER BY start_timestamp DESC NULLS LAST
LIMIT 1

Without NULLS LAST, NULL values come first in descending order (if there can be NULL values in start_timestamp). You would get NULL for last_date, and your query would return no rows.
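A small self-contained sketch of the difference. The CTE here only supplies assumed sample rows (including one NULL) under the name header_table so the statement runs without the real table:

```sql
-- Assumed sample data; the CTE stands in for the real header_table.
WITH header_table(start_timestamp) AS (
   VALUES (timestamp '2015-03-13 00:00')
        , (timestamp '2015-03-13 05:00')
        , (NULL)
   )
SELECT (SELECT max(start_timestamp)
        FROM   header_table)                        AS via_max
     , (SELECT start_timestamp
        FROM   header_table
        ORDER  BY start_timestamp DESC
        LIMIT  1)                                   AS via_desc
     , (SELECT start_timestamp
        FROM   header_table
        ORDER  BY start_timestamp DESC NULLS LAST
        LIMIT  1)                                   AS via_desc_nulls_last;
```

Here via_max and via_desc_nulls_last both yield 2015-03-13 05:00:00, while via_desc yields NULL.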
