Is there a standard for inclusive/exclusive ends of time intervals?

Gus · Mar 20, 2012 · Viewed 11.4k times

I'm wondering if there is a standard or "normal" means of interpreting time interval data end points with respect to inclusiveness/exclusiveness of the value defining the end point. Note however that I am asking what the standard (or most common) convention is (if there is one), not for a dissertation on your personal preference. If you really want to provide a dissertation, please attach it to a reference to someone's published standard or a standard text on the matter. Open standards (that I don't have to pay to read) are greatly preferred unless they are fundamentally flawed :).

Of course there are 4 possibilities for a time interval from A to B:

  1. (A, B) - Both ends are exclusive.
  2. [A, B] - Both ends are inclusive.
  3. [A, B) - Start is inclusive and end is exclusive
  4. (A, B] - Start is exclusive and end is inclusive

Each of these has different characteristics (as I see it, feel free to point out more)

The [A, B] convention has the seemingly inconvenient property that B is contained within both the interval [A, B] and the interval [B, C]. This is particularly inconvenient if B represents the midnight boundary and you are trying to determine which day it falls on, for example. It also makes the duration of the interval slightly irritating to calculate: [A, B] where A = B should have a length of 1, so the duration of [A, B] is (B - A) + 1.

Similarly, the (A, B) convention has the difficulty that B falls within neither (A, B) nor (B, C). Continuing the analogy with day boundaries, midnight would be part of neither day. This is also logically inconvenient because (A, B) where A = B is a nonsense interval with duration less than zero, yet reversing A and B does not make it a valid interval.
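To make the midnight problem concrete, here is a minimal sketch that counts how many of two adjacent days contain their shared midnight under each of the four conventions. The helper names `contains` and `days_containing_midnight` and the dates are made up for illustration only.

```python
from datetime import datetime

def contains(start, end, t, include_start, include_end):
    """Return True if t lies in the interval under the given endpoint rules."""
    after_start = t >= start if include_start else t > start
    before_end = t <= end if include_end else t < end
    return after_start and before_end

# Two consecutive days that share the midnight boundary b (illustrative dates).
a = datetime(2012, 3, 20)
b = datetime(2012, 3, 21)  # the shared midnight
c = datetime(2012, 3, 22)

def days_containing_midnight(include_start, include_end):
    """Count how many of the two days a..b and b..c contain b."""
    return sum(
        contains(lo, hi, b, include_start, include_end)
        for lo, hi in ((a, b), (b, c))
    )

counts = {
    "(A, B)": days_containing_midnight(False, False),
    "[A, B]": days_containing_midnight(True, True),
    "[A, B)": days_containing_midnight(True, False),
    "(A, B]": days_containing_midnight(False, True),
}
print(counts)
```

Only the two mixed conventions assign midnight to exactly one day; the closed form claims it twice and the open form not at all.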

So I think I want either [A, B), or (A, B] and I can't figure out how to decide between them.

So if someone has a link to a standards document, a reference to a standard text, or similar that clarifies the convention, that would be great. Alternatively, if you can link a variety of standards documents and/or references that more or less completely fail to agree, then I can just pick one that seems to have sufficient authority to CMA and be done with it :).

Finally, I will be working in Java, so I am particularly susceptible to answers that work well in Java.

Answer

Voo · Mar 21, 2012

In the general case, [A, B) (inclusive start, exclusive end) has a lot going for it and I don't see any reason why the same wouldn't be true for time intervals.

Dijkstra wrote a nice article about it, "Why numbering should start at zero", which, despite the name, deals mostly with exactly this.

Short summary of the advantages:

  • end - start equals the number of items in the list
  • upper bound of preceding interval is the lower bound of the next
  • allows indexing an interval starting from 0 with unsigned numbers [1]
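The first two points can be sketched with time values as follows. This is only an illustration under my own assumptions: `split_half_open` is a hypothetical helper, not something from a library, and the dates are arbitrary.

```python
from datetime import datetime, timedelta

def split_half_open(start, end, step):
    """Split [start, end) into consecutive half-open pieces of at most `step`."""
    pieces = []
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        pieces.append((lo, hi))
        lo = hi  # upper bound of one piece is the lower bound of the next
    return pieces

start = datetime(2012, 3, 20)
end = datetime(2012, 3, 23)
days = split_half_open(start, end, timedelta(days=1))

# Each duration is a plain subtraction, and the pieces sum to the whole interval.
total = sum((hi - lo for lo, hi in days), timedelta(0))
print(len(days), total == end - start)

# Python's built-in range is half-open too, so its length is end - start.
print(len(range(3, 8)) == 8 - 3)
```

No "+ 1" or "- 1" adjustment appears anywhere, which is the practical benefit the list above describes.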

Personally, I find the second point extremely useful for lots of problems; consider a pretty standard recursive function (in pseudo-Python):

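def foo(start, end):
    # Process the half-open range [start, end).
    if end - start <= 1:
        # base case: zero or one element left
        return
    # Integer midpoint; this form avoids overflow in fixed-width languages.
    middle = start + (end - start) // 2
    foo(start, middle)  # left half:  [start, middle)
    foo(middle, end)    # right half: [middle, end)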

Writing the same function with an inclusive upper bound invites off-by-one errors.
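For comparison, here is a rough side-by-side sketch of the same divide-and-conquer traversal with a half-open range and with two inclusive bounds. The function names and the `out` list are invented for this example; both versions simply collect the indices they reach.

```python
def visit_half_open(start, end, out):
    """Collect every index in the half-open range [start, end)."""
    if end - start <= 0:
        return
    if end - start == 1:
        out.append(start)
        return
    middle = start + (end - start) // 2
    visit_half_open(start, middle, out)
    visit_half_open(middle, end, out)  # middle is reused unchanged

def visit_closed(first, last, out):
    """Collect every index in the closed range [first, last]."""
    if last < first:
        return
    if first == last:
        out.append(first)
        return
    middle = first + (last - first) // 2
    visit_closed(first, middle, out)
    # The + 1 is required: without it, a two-element range never shrinks.
    visit_closed(middle + 1, last, out)

half_open_result = []
visit_half_open(0, 5, half_open_result)

closed_result = []
visit_closed(0, 4, closed_result)

print(half_open_result == closed_result)
```

Both produce the same indices, but the closed version needs an adjustment at the split point that is easy to forget.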

[1] That's the advantage compared to (A, B]: an interval starting from 0 is MUCH more common than an interval ending in MAX_VAL. Note that this also relates to one additional problem: using two inclusive bounds means we can denote a sequence whose length cannot be expressed in an integer of the same size (for example, [0, MAX_VAL] contains MAX_VAL + 1 elements).
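A small sketch of that last problem, simulating unsigned 8-bit arithmetic in Python (whose own integers never overflow). The 8-bit width and the `wrap` helper are assumptions chosen purely for illustration.

```python
BITS = 8                    # hypothetical word size chosen for illustration
MAX_VAL = (1 << BITS) - 1   # 255, the largest unsigned 8-bit value

def wrap(n):
    """Truncate n to BITS bits, as unsigned fixed-width arithmetic would."""
    return n & MAX_VAL

# The closed interval [0, MAX_VAL] holds 256 values, which does not fit in 8 bits.
closed_length = wrap(MAX_VAL - 0 + 1)

# The half-open interval [0, MAX_VAL) holds 255 values, which does fit.
half_open_length = wrap(MAX_VAL - 0)

print(closed_length, half_open_length)
```

With two inclusive bounds the computed length wraps around to 0, while the half-open length is always representable in the same type as its bounds.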