When do extra parentheses have an effect, other than on operator precedence?

TemplateRex picture TemplateRex · Jun 9, 2014 · Viewed 7.4k times · Source

Parentheses in C++ are used in many places: e.g. in function calls and grouping expressions to override operator precedence. Apart from illegal extra parentheses (such as around function call argument lists), a general -but not absolute- rule of C++ is that extra parentheses never hurt:

5.1 Primary expressions [expr.prim]

5.1.1 General [expr.prim.general]

6 A parenthesized expression is a primary expression whose type and value are identical to those of the enclosed expression. The presence of parentheses does not affect whether the expression is an lvalue. The parenthesized expression can be used in exactly the same contexts as those where the enclosed expression can be used, and with the same meaning, except as otherwise indicated.

Question: in which contexts do extra parentheses change the meaning of a C++ program, other than overriding basic operator precedence?

NOTE: I consider the restriction of pointer-to-member syntax to &qualified-id without parentheses to be outside the scope because it restricts syntax rather than allowing two syntaxes with different meanings. Similarly, the use of parentheses inside preprocessor macro definitions also guards against unwanted operator precedence.

Answer

TemplateRex picture TemplateRex · Jun 9, 2014

TL;DR

Extra parentheses change the meaning of a C++ program in the following contexts:

  • preventing argument-dependent name lookup
  • enabling the comma operator in list contexts
  • ambiguity resolution of vexing parses
  • deducing referenceness in decltype expressions
  • preventing preprocessor macro errors

Preventing argument-dependent name lookup

As is detailed in Annex A of the Standard, a post-fix expression of the form (expression) is a primary expression, but not an id-expression, and therefore not an unqualified-id. This means that argument-dependent name lookup is prevented in function calls of the form (fun)(arg) compared to the conventional form fun(arg).

3.4.2 Argument-dependent name lookup [basic.lookup.argdep]

1 When the postfix-expression in a function call (5.2.2) is an unqualified-id, other namespaces not considered during the usual unqualified lookup (3.4.1) may be searched, and in those namespaces, namespace-scope friend function or function template declarations (11.3) not otherwise visible may be found. These modifications to the search depend on the types of the arguments (and for template template arguments, the namespace of the template argument). [ Example:

namespace N {
    struct S { };
    void f(S);
}

void g() {
    N::S s;
    f(s);   // OK: calls N::f
    (f)(s); // error: N::f not considered; parentheses
            // prevent argument-dependent lookup
}

—end example ]

Enabling the comma operator in list contexts

The comma operator has a special meaning in most list-like contexts (function and template arguments, initializer lists etc.). Parentheses of the form a, (b, c), d in such contexts can enable the comma operator compared to the regular form a, b, c, d where the comma operator does not apply.

5.18 Comma operator [expr.comma]

2 In contexts where comma is given a special meaning, [ Example: in lists of arguments to functions (5.2.2) and lists of initializers (8.5) —end example ] the comma operator as described in Clause 5 can appear only in parentheses. [ Example:

f(a, (t=3, t+2), c);

has three arguments, the second of which has the value 5. —end example ]

Ambiguity resolution of vexing parses

Backward compatibility with C and its arcane function declaration syntax can lead to surprising parsing ambiguities, known as vexing parses. Essentially, anything that can be parsed as a declaration will be parsed as one, even though a competing parse would also apply.

6.8 Ambiguity resolution [stmt.ambig]

1 There is an ambiguity in the grammar involving expression-statements and declarations: An expression-statement with a function-style explicit type conversion (5.2.3) as its leftmost subexpression can be indistinguishable from a declaration where the first declarator starts with a (. In those cases the statement is a declaration.

8.2 Ambiguity resolution [dcl.ambig.res]

1 The ambiguity arising from the similarity between a function-style cast and a declaration mentioned in 6.8 can also occur in the context of a declaration. In that context, the choice is between a function declaration with a redundant set of parentheses around a parameter name and an object declaration with a function-style cast as the initializer. Just as for the ambiguities mentioned in 6.8, the resolution is to consider any construct that could possibly be a declaration a declaration. [ Note: A declaration can be explicitly disambiguated by a nonfunction-style cast, by an = to indicate initialization or by removing the redundant parentheses around the parameter name. —end note ] [ Example:

struct S {
    S(int);
};

void foo(double a) {
    S w(int(a));  // function declaration
    S x(int());   // function declaration
    S y((int)a);  // object declaration
    S z = int(a); // object declaration
}

—end example ]

A famous example of this is the Most Vexing Parse, a name popularized by Scott Meyers in Item 6 of his Effective STL book:

ifstream dataFile("ints.dat");
list<int> data(istream_iterator<int>(dataFile), // warning! this doesn't do
               istream_iterator<int>());        // what you think it does

This declares a function, data, whose return type is list<int>. The function data takes two parameters:

  • The first parameter is named dataFile. It's type is istream_iterator<int>. The parentheses around dataFile are superfluous and are ignored.
  • The second parameter has no name. Its type is pointer to function taking nothing and returning an istream_iterator<int>.

Placing extra parentheses around the first function argument (parentheses around the second argument are illegal) will resolve the ambiguity

list<int> data((istream_iterator<int>(dataFile)), // note new parens
                istream_iterator<int>());          // around first argument
                                                  // to list's constructor

C++11 has brace-initializer syntax that allows to side-step such parsing problems in many contexts.

Deducing referenceness in decltype expressions

In contrast to auto type deduction, decltype allows referenceness (lvalue and rvalue references) to be deduced. The rules distinguish between decltype(e) and decltype((e)) expressions:

7.1.6.2 Simple type specifiers [dcl.type.simple]

4 For an expression e, the type denoted by decltype(e) is defined as follows:

— if e is an unparenthesized id-expression or an unparenthesized class member access (5.2.5), decltype(e) is the type of the entity named by e. If there is no such entity, or if e names a set of overloaded functions, the program is ill-formed;

— otherwise, if e is an xvalue, decltype(e) is T&&, where T is the type of e;

— otherwise, if e is an lvalue, decltype(e) is T&, where T is the type of e;

— otherwise, decltype(e) is the type of e.

The operand of the decltype specifier is an unevaluated operand (Clause 5). [ Example:

const int&& foo();
int i;
struct A { double x; };
const A* a = new A();
decltype(foo()) x1 = 0;   // type is const int&&
decltype(i) x2;           // type is int
decltype(a->x) x3;        // type is double
decltype((a->x)) x4 = x3; // type is const double&

—end example ] [ Note: The rules for determining types involving decltype(auto) are specified in 7.1.6.4. —end note ]

The rules for decltype(auto) have a similar meaning for extra parentheses in the RHS of the initializing expression. Here's an example from the C++FAQ and this related Q&A

decltype(auto) look_up_a_string_1() { auto str = lookup1(); return str; }  //A
decltype(auto) look_up_a_string_2() { auto str = lookup1(); return(str); } //B

The first returns string, the second returns string &, which is a reference to the local variable str.

Preventing preprocessor macro related errors

There is a host of subtleties with preprocessor macros in their interaction with the C++ language proper, the most common of which are listed below

  • using parentheses around macro parameters inside the macro definition #define TIMES(A, B) (A) * (B); in order to avoid unwanted operator precedence (e.g. in TIMES(1 + 2, 2 + 1) which yields 9 but would yield 6 without the parentheses around (A) and (B)
  • using parentheses around macro arguments having commas inside: assert((std::is_same<int, int>::value)); which would otherwise not compile
  • using parentheses around a function to protect against macro expansion in included headers: (min)(a, b) (with the unwanted side effect of also disabling ADL)