C99 printf formatters vs C++11 user-defined-literals

rubenvb · Aug 8, 2012

This code:

#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(int argc,char **argv)
{
   int64_t val=1234567890;   /* signed, to match the PRId64 conversion */
   printf("%"PRId64"\n",val);
   exit(0);
}

This compiles as C99, C++03, and C++11 with GCC 4.5, but fails to compile as C++11 with GCC 4.7.1. Adding a space between "%" and PRId64 lets GCC 4.7.1 compile it.

Which compiler is correct?

Answer

ecatmur · Aug 8, 2012

GCC 4.7.1 is correct. According to the C++11 standard:

2.2 Phases of translation [lex.phases]

1 - The precedence among the syntax rules of translation is specified by the following phases. [...]
3. The source file is decomposed into preprocessing tokens (2.5) and sequences of white-space characters (including comments). [...]
4. Preprocessing directives are executed, macro invocations are expanded, [...]

And per 2.5 Preprocessing tokens [lex.pptoken], user-defined-string-literal is a preprocessing token production:

2.14.8 User-defined literals [lex.ext]

user-defined-string-literal:
    string-literal ud-suffix
ud-suffix:
    identifier

So the phase-4 macro expansion of PRId64 is irrelevant, because "%"PRId64 has already been parsed as a single user-defined-string-literal preprocessing token consisting of string-literal "%" and ud-suffix PRId64.
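To make the grammar rule concrete, here is a minimal sketch in which a string literal and the identifier immediately after it form a single user-defined-string-literal. The suffix _len and its literal operator are hypothetical names introduced only for this illustration; they are not part of the question or of any library.

```cpp
#include <cstddef>

// Hypothetical literal operator, for illustration only: returns the length
// of the string literal it is attached to (terminating null excluded).
constexpr std::size_t operator""_len(const char*, std::size_t n)
{
    return n;
}

// "%"_len is lexed in phase 3 as ONE preprocessing token
// (string-literal "%" followed by ud-suffix _len); only later is it
// resolved to a call of the operator declared above.
constexpr std::size_t percent_len = "%"_len;

static_assert(percent_len == 1, "\"%\" contains a single character");
```

The same lexing rule is what glues "%" and PRId64 together in the question: because no whitespace separates them, the compiler sees a literal with a suffix named PRId64 rather than a string followed by a macro name.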

Oh, this is going to be awesome; everyone will have to change

printf("%"PRId64"\n", val);

to

printf("%" PRId64"\n", val);     // note extra space

However! GCC and Clang have agreed to treat a string literal followed by a suffix without a leading underscore as two separate tokens (on the grounds that a program using such a suffix would not be well-formed anyway); see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52538. So in future versions of GCC (the 4.8 branch, I think), existing code will work again.
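Code that has to build with every compiler version can avoid the question entirely by putting whitespace on both sides of the macro. The following is a minimal sketch of that approach; the helper name format_i64 is hypothetical and exists only to make the example easy to check.

```cpp
#define __STDC_FORMAT_MACROS  // needed by some older C libraries when compiling as C++
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string>

// Hypothetical helper, for illustration only: formats a signed 64-bit value
// followed by a newline.
std::string format_i64(int64_t val)
{
    char buf[32];  // room for up to 20 characters of int64_t text, '\n', and '\0'
    // Whitespace on both sides of PRId64 keeps it a separate identifier token,
    // so the macro is expanded in phase 4 and the adjacent string literals
    // are concatenated afterwards.
    int n = snprintf(buf, sizeof buf, "%" PRId64 "\n", val);
    return (n > 0) ? std::string(buf, static_cast<std::string::size_type>(n))
                   : std::string();
}
```

Called as format_i64(1234567890), the helper produces the same text the original printf call was meant to print. Written this way, the format string means the same thing in C99, C++03, and C++11, regardless of how a given compiler treats literal suffixes.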