Best practices with saving datetime & timezone info in database when data is dependant on datetime

daneejela picture daneejela · Jul 7, 2017 · Viewed 46.8k times · Source

There were quite a few questions about saving datetime & timezones info in DB but more on the overall level. Here I'd like to address a specific case.

System specs

  • We have an Orders system database
  • It is a multi-tenant system where tenants can use arbitrary timezone (it is arbitrary but single timezone per tenant, saved in Tenants table once and never changes)

Business rule needed to be covered in DB

  • When tenant places an Order into the system, order number gets computed based on their local datetime (its not literally a number but some kind of an identifier like ORDR-13432-Year-Month-Day). Precise calculation is not important for the moment, it's just important that it's dependant on tenants local datetime
  • We also do want to be able to select all Orders, on the system level, placed between some UTC datetimes regardless of the tenant (for general system statistics/reporting)

Our initial idea

  • Our initial idea was to save UTC datetime across whole DB and, of course, keep tenants timezone offset relative to UTC and have application that consumes DB always convert datetimes to UTC so that DB itself always operate with UTC.

Approach 1

  • Saving local tenants datetime would be nice per tenant but then we have problem with queries like:

    SELECT * FROM ORDERS WHERE OrderDateTime BETWEEN UTCDateTime1 AND UTCDateTime2
    

It's problematic because OrderDateTime in this query means different moment in time, based on tenant. Of course, this query might include join to Tenants table to get local datetime offset which would then calculate OrderDateTime on the fly to make adjustments. It's possible, but not sure if it's a good way to do it?

Approach 2

  • On the other hand, when saving UTC datetime, then when we do calculation of OrderNumber since the day/month/year in UTC might differ from the one in local datetime

Let's take extreme example; let's say tenant is 6 hours ahead of UTC and his local datetime is 2017-01-01 02:00. UTC would be 2016-12-31 20:00. Order placed at that moment should get OrderNumber 'ORDR-13432-2017-1-1' but if saving UTC it would get ORDR-13432-2016-12-31.

In this case, at the moment of creating Order in DB, we should get UTC datetime, tenants offset and compile OrderNumber based on recalculated tenants localtime but still save DateTime column in UTC.

Questions

  1. What is the preferred way of handling this kind of situation?
  2. Is there a nice solution with saving UTC datetimes because that one would be pretty nice for us because of the system-level reporting?
  3. If going with saving UTC, is Approach 2) good way to handle those cases or is there some better/recommended way?

[UPDATE]

Based on comments from Gerard Ashton and Hugo:

Initial question was not clear with respect to the detail if tenant can change timezone or not and what happens if political authority changes the timezone properties or some terirory's timezone. Of course that is of an extreme importance, but it is not in the center of this question . We might address that in a separate question.

For the sake of this question, lets assume tenant will not change location. Timezone properties or timezone itself for that location might change and those changes will be handled in the system separately from this question.

Answer

Matt Johnson-Pint picture Matt Johnson-Pint · Jul 7, 2017

Hugo's answer is mostly correct, but I'll add a few key points:

  • When you're storing the customer's time zone, do NOT store a numerical offset. As others have pointed out, the offset from UTC is only for a single point in time, and can easily change for DST and for other reasons. Instead, you should store a time zone identifier, preferably an IANA time zone identifier as a string, such as "America/Los_Angeles". Read more in the timezone tag wiki.

  • Your OrderDateTime field should absolutely represent the time in UTC. However, depending on your database platform, you have several choices for how to store this.

    • For example, if using Microsoft SQL Server, a good approach is to store the local time in a datetimeoffset column, which preserves the offset from UTC. Note that any index you create on that column will be based on the UTC equivalent, so you will get good query performance when doing your range query.

    • If using other database platforms, you may instead wish to store the UTC value in a timestamp field. Some databases also have timestamp with time zone, but understand that it doesn't mean it stores the time zone or offset, it just means that it can do conversions for you implicitly as you store and retrieve values. If you intend to always represent UTC, then often timestamp (without time zone) or just datetime is more appropriate.

  • Since either of the above methods will store a UTC time, you'll also need to consider how to perform operations that need an index of local time values. For example, you might need to create a daily report, based on the day of the user's time zone. For that, you'd need to group by the local date. If you try to compute that at query time from your UTC value, you'll end up scanning the entire table.

    A good approach to deal with this is to create a separate column for the local date (or perhaps even the local datetime depending on your needs, but not a datetimeoffset or timestamp). This could be a completely isolated column that you populate separately, or it could be a computed/calculated column based on your other column. Use this column in an index so you can filter or group by local date.

  • If you go for the computed-column approach, you'll need to know how to convert between time zones in the database. Some databases have a convert_tz function built-in that understands IANA time zone identifiers.

    If you're using Microsoft SQL Server, you can use the new AT TIME ZONE function in SQL 2016 and Azure SQL DB, but that only works with Microsoft time zone identifiers. To use IANA time zone identifiers, you'll need a third party solution, such as my SQL Server Time Zone Support project.

  • At query time, avoid using the BETWEEN statement. It is fully inclusive. It works ok for whole dates, but when you have time involved you're better off doing a half-open range query, such as:

    ... WHERE OrderDateTime >= @t1 AND OrderDateTime < @t2
    

    For example, if @t1 were the start of today, @t2 would be the start of tomorrow.

Regarding the scenario discussed in comments where the user's time zone has changed:

  • If you choose to calculate the local date in the database, the only scenario you need to worry about is if a location or business switches time zones without a "zone split" occurring. A zone split is when a new time zone identifier is introduced which covers the area that changed, including their old and new rules.

    For example, the latest zone added to the IANA tzdb at the time of writing this is America/Punta_Arenas, which was a zone split when the southern part of Chile decided to stay at UTC-3 when the rest of Chile (America/Santiago) went back to UTC-4 at the end of DST.

    However, if a minor locality on the border of two time zones decides to change which side they follow, and a zone split wasn't warranted, then you'd potentially be using the rules of their new time zone against their old data.

  • If you store the local date separately (computed in the application, not the DB), then you'll have no problems. The user changes their time zone to the new one, all old data is still intact, and new data is stored with the new time zone.