`uuuu` versus `yyyy` in `DateTimeFormatter` formatting pattern codes in Java?

Basil Bourque picture Basil Bourque · Dec 16, 2016 · Viewed 16k times · Source

The DateTimeFormatter class documentation says about its formatting codes for the year:

u year year 2004; 04

y year-of-era year 2004; 04

Year: The count of letters determines the minimum field width below which padding is used. If the count of letters is two, then a reduced two digit form is used. For printing, this outputs the rightmost two digits. For parsing, this will parse using the base value of 2000, resulting in a year within the range 2000 to 2099 inclusive. If the count of letters is less than four (but not two), then the sign is only output for negative years as per SignStyle.NORMAL. Otherwise, the sign is output if the pad width is exceeded, as per SignStyle.EXCEEDS_PAD.

No other mention of “era”.

So what is the difference between these two codes, u versus y, year versus year-of-era?

When should I use something like this pattern uuuu-MM-dd and when yyyy-MM-dd when working with dates in Java?

Seems that example code written by those in the know use uuuu, but why?

Other formatting classes such as the legacy SimpleDateFormat have only yyyy, so I am confused why java.time brings this uuuu for “year of era”.

Answer

Meno Hochschild picture Meno Hochschild · Dec 16, 2016

Within the scope of java.time-package, we can say:

  • It is safer to use "u" instead of "y" because DateTimeFormatter will otherwise insist on having an era in combination with "y" (= year-of-era). So using "u" would avoid some possible unexpected exceptions in strict formatting/parsing. See also this SO-post. Another minor thing which is improved by "u"-symbol compared with "y" is printing/parsing negative gregorian years (in far past).

  • Otherwise we can clearly state that using "u" instead of "y" breaks long-standing habits in Java-programming. It is also not intuitively clear that "u" denotes any kind of year because a) the first letter of the English word "year" is not in agreement with this symbol and b) SimpleDateFormat has used "u" for a different purpose since Java-7 (ISO-day-number-of-week). Confusion is guaranteed - for ever?

  • We should also see that using eras (symbol "G") in context of ISO is in general dangerous if we consider historic dates. If "G" is used with "u" then both fields are unrelated to each other. And if "G" is used with "y" then the formatter is satisfied but still uses proleptic gregorian calendar when the historic date mandates different calendars and date-handling.

Background information:

When developing and integrating the JSR 310 (java.time-packages) the designers decided to use Common Locale Data Repository (CLDR)/LDML-spec as the base of pattern symbols in DateTimeFormatter. The symbol "u" was already defined in CLDR as proleptic gregorian year, so this meaning was adopted to new upcoming JSR-310 (but not to SimpleDateFormat because of backwards compatibility reasons).

However, this decision to follow CLDR was not quite consistent because JSR-310 had also introduced new pattern symbols which didn't and still don't exist in CLDR, see also this old CLDR-ticket. The suggested symbol "I" was changed by CLDR to "VV" and finally overtaken by JSR-310, including new symbols "x" and "X". But "n" and "N" still don't exist in CLDR, and since this old ticket is closed, it is not clear at all if CLDR will ever support it in the sense of JSR-310. Furthermore, the ticket does not mention the symbol "p" (padding instruction in JSR-310, but not defined in CLDR). So we have still no perfect agreement between pattern definitions across different libraries and languages.

And about "y": We should also not overlook the fact that CLDR associates this year-of-era with at least some kind of mixed Julian/Gregorian year and not with the proleptic gregorian year as JSR-310 does (leaving the oddity of negative years aside). So no perfect agreement between CLDR and JSR-310 here, too.