SQL Query to show gaps between multiple date ranges

Purplegoldfish · Mar 7, 2012 · Viewed 25.8k times

I'm working on an SSRS / SQL project and trying to write a query to get the gaps between dates, and I am completely lost as to how to write it. Basically, we have a number of devices which can be scheduled for use, and I need a report to show when they are not in use.

I have a table with Device ID, EventStart and EventEnd times. I need to run a query to get the times between these events for each device, but I am not really sure how to do this.

For example:

Device 1 Event A runs from `01/01/2012 08:00 - 01/01/2012 10:00`
Device 1 Event B runs from `01/01/2012 18:00 - 01/01/2012 20:00`    
Device 1 Event C runs from `02/01/2012 18:00 - 02/01/2012 20:00`    
Device 2 Event A runs from `01/01/2012 08:00 - 01/01/2012 10:00`
Device 2 Event B runs from `01/01/2012 18:00 - 01/01/2012 20:00`

My query should have as its result

`Device 1 01/01/2012 10:00 - 01/01/2012 18:00`
`Device 1 01/01/2012 20:00 - 02/01/2012 18:00`
`Device 2 01/01/2012 10:00 - 01/01/2012 18:00`

There will be around 4 - 5 devices on average in this table, and maybe 200 - 300 + events.

Updates:

OK, I'll update this to try to give a bit more info, since I don't seem to have explained this too well (sorry!).

What I am dealing with is a table which has details for events. Each event is a booking of a flight simulator. We have a number of flight sims (referred to as devices in the table), and we are trying to generate an SSRS report which we can give to a customer to show the days / times each sim is available.

So I am going to pass in a start / end date parameter and select all availabilities between those dates. The results should then display as something like:

Device   Available_From      Available_To
1        01/01/2012 10:00    01/01/2012 18:00
1        01/01/2012 20:00    02/01/2012 18:00
2        01/01/2012 10:00    01/01/2012 18:00

Also, events can sometimes overlap, though this is very rare and due to bad data. It doesn't matter if an event on one device overlaps an event on a different device, as I need to know the availability of each device separately.

Answer

Branko Dimitrijevic · Mar 7, 2012

The Query:

Assuming the fields containing the interval are named Start and Finish, and the table is named YOUR_TABLE, the query...

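SELECT Finish, Start
FROM
    (
        -- Interval starts that do not fall inside any other interval.
        -- DENSE_RANK (not ROW_NUMBER) gives equal starts the same RN,
        -- so DISTINCT can actually collapse the duplicates.
        SELECT DISTINCT Start, DENSE_RANK() OVER (ORDER BY Start) RN
        FROM YOUR_TABLE T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM YOUR_TABLE T2
                WHERE T1.Start > T2.Start AND T1.Start < T2.Finish
            )
    ) T1
    JOIN (
        -- Interval ends that do not fall inside any other interval.
        SELECT DISTINCT Finish, DENSE_RANK() OVER (ORDER BY Finish) RN
        FROM YOUR_TABLE T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM YOUR_TABLE T2
                WHERE T1.Finish > T2.Start AND T1.Finish < T2.Finish
            )
    ) T2
    -- Pair each start with the end that precedes it.
    ON T1.RN - 1 = T2.RN
WHERE
    -- Drop empty gaps (duration 0).
    Finish < Start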

...gives the following result on your test data:

Finish                      Start
2012-01-01 10:00:00.000     2012-01-01 18:00:00.000

The important property of this query is that it would work on overlapping intervals as well.
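
The query treats every row as part of one timeline, while the question asks for availability per device. Below is a minimal sketch of one way to adapt the same idea, not a tested SQL Server solution: it runs through Python's built-in sqlite3 module (assuming an SQLite build with window-function support, 3.25 or later) so that it is self-contained. The table name events and the column names DeviceID, EventStart and EventEnd are assumptions based on the question's description, timestamps are stored as ISO text, and 02/01/2012 is read as 2 January 2012.

```python
import sqlite3

# Hypothetical schema: events(DeviceID, EventStart, EventEnd) with ISO text timestamps.
PER_DEVICE_GAPS = """
WITH starts AS (
    -- Starts that are not strictly inside another event on the same device.
    SELECT DISTINCT DeviceID, EventStart,
           DENSE_RANK() OVER (PARTITION BY DeviceID ORDER BY EventStart) AS rn
    FROM events e1
    WHERE NOT EXISTS (
        SELECT 1 FROM events e2
        WHERE e2.DeviceID = e1.DeviceID
          AND e1.EventStart > e2.EventStart AND e1.EventStart < e2.EventEnd)
),
ends AS (
    -- Ends that are not strictly inside another event on the same device.
    SELECT DISTINCT DeviceID, EventEnd,
           DENSE_RANK() OVER (PARTITION BY DeviceID ORDER BY EventEnd) AS rn
    FROM events e1
    WHERE NOT EXISTS (
        SELECT 1 FROM events e2
        WHERE e2.DeviceID = e1.DeviceID
          AND e1.EventEnd > e2.EventStart AND e1.EventEnd < e2.EventEnd)
)
SELECT ends.DeviceID, ends.EventEnd AS Available_From, starts.EventStart AS Available_To
FROM starts
JOIN ends ON starts.DeviceID = ends.DeviceID AND starts.rn - 1 = ends.rn
WHERE ends.EventEnd < starts.EventStart
ORDER BY ends.DeviceID, ends.EventEnd
"""


def device_gaps(rows):
    """Load (DeviceID, EventStart, EventEnd) rows and return each device's free slots."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (DeviceID INTEGER, EventStart TEXT, EventEnd TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return conn.execute(PER_DEVICE_GAPS).fetchall()
    finally:
        conn.close()


# The five events from the question, written as ISO timestamps.
SAMPLE = [
    (1, "2012-01-01 08:00", "2012-01-01 10:00"),
    (1, "2012-01-01 18:00", "2012-01-01 20:00"),
    (1, "2012-01-02 18:00", "2012-01-02 20:00"),
    (2, "2012-01-01 08:00", "2012-01-01 10:00"),
    (2, "2012-01-01 18:00", "2012-01-01 20:00"),
]

for device, available_from, available_to in device_gaps(SAMPLE):
    print(device, available_from, available_to)
# 1 2012-01-01 10:00 2012-01-01 18:00
# 1 2012-01-01 20:00 2012-01-02 18:00
# 2 2012-01-01 10:00 2012-01-01 18:00
```

The only changes from the single-timeline shape are the PARTITION BY in the ranking and the extra DeviceID conditions in the NOT EXISTS subqueries and the join, so events on one device never hide or create gaps on another.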


The Algorithm:

1. Merge Overlapping Intervals

The subquery T1 accepts only those interval starts that are outside other intervals. The subquery T2 does the same for interval ends. This is what removes overlaps.

The DISTINCT is important in case there are two identical interval starts (or ends) that are both outside other intervals. The WHERE Finish < Start simply eliminates any empty intervals (i.e. duration 0).

We also attach a sequence number (RN) that follows temporal order, which will be needed in the next step.

The T1 yields:

Start                       RN
2012-01-01 08:00:00.000     1
2012-01-01 18:00:00.000     2

The T2 yields:

Finish                      RN
2012-01-01 10:00:00.000     1
2012-01-01 20:00:00.000     2
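
To make step 1 concrete outside SQL, here is a small Python sketch of the same filter. The function name outer_boundaries is made up for illustration, and intervals are assumed to be (start, finish) pairs of ISO timestamp strings, which compare correctly as plain text.

```python
def outer_boundaries(intervals):
    """Return (starts, ends) that do not fall strictly inside any interval."""

    def inside(point):
        # Mirrors the NOT EXISTS test: point > other.Start AND point < other.Finish.
        return any(start < point < finish for start, finish in intervals)

    # Sets play the role of DISTINCT; sorting plays the role of the RN ordering.
    starts = sorted({start for start, _ in intervals if not inside(start)})
    ends = sorted({finish for _, finish in intervals if not inside(finish)})
    return starts, ends


# Three overlapping bookings that should collapse into one busy block.
overlapping = [
    ("2012-01-02 11:00", "2012-01-02 15:00"),
    ("2012-01-02 10:00", "2012-01-02 12:00"),
    ("2012-01-02 10:00", "2012-01-02 15:00"),
]
print(outer_boundaries(overlapping))
# (['2012-01-02 10:00'], ['2012-01-02 15:00'])
```

The 11:00 start and the 12:00 end are dropped because each lies strictly inside another booking, which is how the overlaps disappear.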

2. Reconstruct the Result

We can now reconstruct either the "active" or the "inactive" intervals.

The inactive intervals are reconstructed by pairing the end of the previous interval with the beginning of the next one, hence the - 1 in the ON clause. Effectively, we put...

Finish                      RN
2012-01-01 10:00:00.000     1

...and...

Start                       RN
2012-01-01 18:00:00.000     2

...together, resulting in:

Finish                      Start
2012-01-01 10:00:00.000     2012-01-01 18:00:00.000

(The active intervals could be reconstructed by putting rows from T1 alongside rows from T2, using JOIN ... ON T1.RN = T2.RN and reversing the WHERE condition.)
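
The two pairings in step 2 can also be sketched in a few lines of Python. This is only an illustration under the assumption that starts and ends are the sorted, de-duplicated lists produced by step 1; the function names are hypothetical.

```python
def inactive_intervals(starts, ends):
    """Pair each end with the next start (the RN - 1 join); keep non-empty gaps."""
    return [(finish, start) for finish, start in zip(ends, starts[1:]) if finish < start]


def active_intervals(starts, ends):
    """Pair the i-th start with the i-th end (the RN = RN join); keep non-empty spans."""
    return [(start, finish) for start, finish in zip(starts, ends) if start < finish]


# The T1 and T2 rows listed in step 1.
starts = ["2012-01-01 08:00", "2012-01-01 18:00"]
ends = ["2012-01-01 10:00", "2012-01-01 20:00"]

print(inactive_intervals(starts, ends))
# [('2012-01-01 10:00', '2012-01-01 18:00')]
print(active_intervals(starts, ends))
# [('2012-01-01 08:00', '2012-01-01 10:00'), ('2012-01-01 18:00', '2012-01-01 20:00')]
```

Shifting one list by a single position is all that separates the "inactive" result from the "active" one.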


The Example:

Here is a slightly more realistic example. The following test data:

Device      Event      Start                      Finish
Device 1    Event A    2012-01-01 08:00:00.000    2012-01-01 10:00:00.000
Device 2    Event B    2012-01-01 18:00:00.000    2012-01-01 20:00:00.000
Device 3    Event C    2012-01-02 11:00:00.000    2012-01-02 15:00:00.000
Device 4    Event D    2012-01-02 10:00:00.000    2012-01-02 12:00:00.000
Device 5    Event E    2012-01-02 10:00:00.000    2012-01-02 15:00:00.000
Device 6    Event F    2012-01-03 09:00:00.000    2012-01-03 10:00:00.000

Gives the following result:

Finish                      Start
2012-01-01 10:00:00.000     2012-01-01 18:00:00.000
2012-01-01 20:00:00.000     2012-01-02 10:00:00.000
2012-01-02 15:00:00.000     2012-01-03 09:00:00.000
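
For anyone who wants to check this example without a SQL Server instance, the sketch below loads the six rows into an in-memory SQLite database through Python's sqlite3 module and runs a query of the same shape. It is an approximation rather than the exact T-SQL: it assumes an SQLite build with window functions (3.25 or later), stores timestamps as ISO text, and numbers the rows with DENSE_RANK so that the duplicate 10:00 starts and 15:00 ends share one number. The function name find_gaps is made up for illustration.

```python
import sqlite3

GAPS_QUERY = """
SELECT Finish, Start
FROM
    (
        SELECT DISTINCT Start, DENSE_RANK() OVER (ORDER BY Start) RN
        FROM YOUR_TABLE T1
        WHERE NOT EXISTS (
            SELECT * FROM YOUR_TABLE T2
            WHERE T1.Start > T2.Start AND T1.Start < T2.Finish
        )
    ) T1
    JOIN (
        SELECT DISTINCT Finish, DENSE_RANK() OVER (ORDER BY Finish) RN
        FROM YOUR_TABLE T1
        WHERE NOT EXISTS (
            SELECT * FROM YOUR_TABLE T2
            WHERE T1.Finish > T2.Start AND T1.Finish < T2.Finish
        )
    ) T2
    ON T1.RN - 1 = T2.RN
WHERE Finish < Start
ORDER BY Finish
"""

TEST_DATA = [
    ("Device 1", "Event A", "2012-01-01 08:00:00", "2012-01-01 10:00:00"),
    ("Device 2", "Event B", "2012-01-01 18:00:00", "2012-01-01 20:00:00"),
    ("Device 3", "Event C", "2012-01-02 11:00:00", "2012-01-02 15:00:00"),
    ("Device 4", "Event D", "2012-01-02 10:00:00", "2012-01-02 12:00:00"),
    ("Device 5", "Event E", "2012-01-02 10:00:00", "2012-01-02 15:00:00"),
    ("Device 6", "Event F", "2012-01-03 09:00:00", "2012-01-03 10:00:00"),
]


def find_gaps(rows):
    """Return (Finish, Start) pairs for every gap across the whole table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE YOUR_TABLE (Device TEXT, Event TEXT, Start TEXT, Finish TEXT)")
        conn.executemany("INSERT INTO YOUR_TABLE VALUES (?, ?, ?, ?)", rows)
        return conn.execute(GAPS_QUERY).fetchall()
    finally:
        conn.close()


for finish, start in find_gaps(TEST_DATA):
    print(finish, start)
# 2012-01-01 10:00:00 2012-01-01 18:00:00
# 2012-01-01 20:00:00 2012-01-02 10:00:00
# 2012-01-02 15:00:00 2012-01-03 09:00:00
```

Events C, D and E overlap on 2 January, yet they come out as a single busy block from 10:00 to 15:00, matching the three gaps listed above.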