Regex for ISO 8601 durations

rink.attendant.6 · Aug 17, 2015

I need a regular expression to validate durations in the ISO 8601 duration format (with the exception of fractional parts which I don't need).

PnYnMnDTnHnMnS

PnW

Here is what I have:

^P(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(\d+H)?(\d+M)?(\d+S)?)?$

The only problem is that the strings P and PT are allowed by this regex, because every part is optional (the "zero or one" quantifier ?).

  • There needs to be at least one component (date or time)
  • If there is a T then there needs to be a time component (H, M, or S)
  • If there is a T then there may or may not be any date components (Y, M, or D)
  • Overflow is allowed (e.g. PT72H is mostly equivalent to P3D)

Acceptable inputs:

P1Y        // date component only
P2MT30M    // date and time components
PT6H       // time component only
P5W        // another date component

Unacceptable inputs:

P         // no components
PT        // no components
P3MT      // T specified but no time components
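
The behaviour described above can be reproduced by running the pattern against both lists. This is only a minimal JavaScript sketch; the names original, acceptable, unacceptable, acceptedOk and wronglyAccepted are illustrative and not part of any existing code.

```javascript
// The pattern from the question, unchanged.
const original = /^P(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(\d+H)?(\d+M)?(\d+S)?)?$/;

const acceptable = ["P1Y", "P2MT30M", "PT6H", "P5W"];
const unacceptable = ["P", "PT", "P3MT"];

// Every acceptable input matches, as intended.
const acceptedOk = acceptable.every((s) => original.test(s));

// The unacceptable inputs match as well, which is the problem being asked about.
const wronglyAccepted = unacceptable.filter((s) => original.test(s));

console.log(acceptedOk, wronglyAccepted);
```

All three unacceptable strings end up in wronglyAccepted, because nothing in the pattern forces a component to be present after P or T.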

Right now the invalid strings pass client-side validation but fail on the server side, where they are passed into DateInterval. I'd like them to fail on the client side if possible. If everyone were using Chrome 40+ I could specify minlength='3' on the input element to help, but unfortunately that isn't the case.

Answer

Wiktor Stribiżew · Aug 17, 2015

If you have almost all parts optional, but you want to make sure there is something else after P or T, you can make use of look-aheads:

^P(?=\d+[YMWD])(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$
  ^^^^^^^^^^^^^                              ^^^^^^^^^^^^

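They require a sequence of digits followed by a letter from the specified set to appear right after the preceding pattern. Note that the look-ahead after P makes a date component mandatory, so a time-only value such as PT6H is rejected by this pattern.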

See demo
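
To see what the two look-aheads accept and reject, the pattern can be tried against the inputs from the question. This is a rough JavaScript sketch; withLookaheads, samples and results are illustrative names only.

```javascript
// First pattern from the answer: one look-ahead after P, another after T.
const withLookaheads =
  /^P(?=\d+[YMWD])(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$/;

// Inputs taken from the question's acceptable and unacceptable lists.
const samples = ["P1Y", "P2MT30M", "P5W", "PT6H", "P", "PT", "P3MT"];

// Pair each input with the result of the match.
const results = samples.map((s) => [s, withLookaheads.test(s)]);

console.log(results);
```

P, PT and P3MT are now rejected, but so is PT6H, because the look-ahead after P insists on a digit sequence followed by Y, M, W or D.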

UPDATE

If the date part after P can be empty (as in PT6H), use

^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d+[HMS])(\d+H)?(\d+M)?(\d+S)?)?$

See another demo. Here, (?!$) ensures that the string is not just P: at least one more character must follow it.

Or, as @UlugbekUmirov suggests, it is enough to just use T(?=\d) (since all the optional parts start with a digit):

^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+S)?)?$
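
For client-side use, the pattern could be wrapped in a small helper. The sketch below assumes plain JavaScript; durationPattern and isValidDuration are hypothetical names, and how the result is wired into the form is left open.

```javascript
// Pattern from the update: (?!$) rejects a bare "P", and T(?=\d) requires
// a digit right after T. Since the only way to consume digits after T is
// through an H, M or S component, this forces a time component.
const durationPattern =
  /^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+S)?)?$/;

// Hypothetical helper: returns true when the value looks like a valid duration.
function isValidDuration(value) {
  return durationPattern.test(value);
}

console.log(["P1Y", "P2MT30M", "PT6H", "P5W"].map(isValidDuration));
console.log(["P", "PT", "P3MT"].map(isValidDuration));
```

With this pattern the four acceptable inputs from the question match and the three unacceptable ones do not. A value such as PT5 is also rejected, since the digits after T are never followed by H, M or S.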

UPDATE 2

If the numbers can be either decimals or integers, add (?:\.\d+)? after each \d+. Here is the last pattern from the first update, adjusted accordingly:

^P(?!$)(\d+(?:\.\d+)?Y)?(\d+(?:\.\d+)?M)?(\d+(?:\.\d+)?W)?(\d+(?:\.\d+)?D)?(T(?=\d)(\d+(?:\.\d+)?H)?(\d+(?:\.\d+)?M)?(\d+(?:\.\d+)?S)?)?$
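
Since the number sub-pattern is repeated seven times, it may be easier to maintain if the regex is assembled from a single fragment. The following is one possible sketch in JavaScript; num and fractionalDuration are illustrative names, and the resulting pattern is meant to be identical to the one above.

```javascript
// One number: digits, optionally followed by a dot and more digits.
const num = String.raw`\d+(?:\.\d+)?`;

// Assemble the same pattern as above from the shared fragment.
// String.raw keeps backslashes literal, so \d does not need double escaping.
const fractionalDuration = new RegExp(
  String.raw`^P(?!$)(${num}Y)?(${num}M)?(${num}W)?(${num}D)?` +
    String.raw`(T(?=\d)(${num}H)?(${num}M)?(${num}S)?)?$`
);

console.log(fractionalDuration.test("PT1.5H"));
console.log(fractionalDuration.test("P1.Y"));
```

Here PT1.5H matches, while P1.Y does not, because a dot must be followed by at least one digit.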