AldenB

Reputation: 133

How does a C++ compiler expand prefix and postfix operator++()?

Consider:

#include <iostream>

class Example
{
private:
    int m_i;

public:
    Example(int i) : m_i{i} {}

    // Postfix
    Example operator++(int) {m_i++; return *this;}

    // Prefix
    Example& operator++() {m_i++; return *this;}

    void print() const {std::cout << m_i << '\n';}
};

I was experimenting with this to determine how the compiler expanded a call to the prefix and postfix operators.

For example, when I write something like this:

Example e = 1;
e++;

I expected it to expand to something like "e.operator++(int)", or taking it a step further, I expected

e++(2);

to expand to something like "e.operator++(2)", but instead the compiler complains: "no match for call to '(Example) (int)'".

Next I was curious as to how "++e" mysteriously expanded into "e.operator++()", i.e. the one that returns a reference.

Playing around some more, I ended up with:

Example e = 1;
++e++;
e.print();

Which printed 2, and then:

Example e = 1;
(++e)++;
e.print();

Which printed 3.

I understand that (++e) returns a reference to the object which is then post-incremented by one, so this makes sense. I also suspect that "++e++" is giving the postfix operator precedence here (as I read in another post), so this is incrementing the temporary variable returned by the postfix operator. This too makes sense. This led me to wonder about how expressions like

++++e
++e++++
++++e++
++++e++++

are expanded (they all compile and run with expected results).

So really, what the hell is going on on the inside and how does the compiler know which operator++() to call, and how are these expressions expanded (especially in the prefix case)? What is the purpose of the placeholder variable in "operator++(int)"?

Upvotes: 4

Views: 903

Answers (1)

M.M

Reputation: 141628

What is the purpose of the placeholder variable in "operator++(int)"?

The ++ operator has two distinct forms: postfix-++ and prefix-++. When overloading it, each form needs its own function signature, and the int parameter exists only to tell the two apart.

how does the compiler know which operator++() to call,

When your code uses prefix-++ (for example: ++e;), the function with signature operator++() is called. When your code uses postfix-++ (for example: e++;), the function with signature operator++(int) is called, and the compiler supplies a dummy argument, which the standard specifies as zero.

Technically, the implementation of operator++(int) could use the dummy argument value. And you could pass your own value by writing e.operator++(5); instead of e++;. But this would be considered bad coding style -- when overloading operators, it's recommended to retain the semantics of the built-in operators to avoid confusing people reading the code.
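To make the dummy argument visible, here is a minimal sketch. Probe, last_arg and the two helper functions are hypothetical names invented for this illustration; they are not part of the question's code.

```cpp
// Hypothetical type that records what the postfix overload receives.
struct Probe
{
    int last_arg = -1;   // argument most recently passed to postfix ++

    // Prefix form: no parameters.
    Probe& operator++() { return *this; }

    // Postfix form: the int parameter only distinguishes the signature.
    Probe operator++(int dummy)
    {
        Probe old = *this;
        last_arg = dummy;
        return old;
    }
};

// Argument seen when the operator syntax is used.
int arg_from_operator_syntax()
{
    Probe p;
    p++;                 // the compiler calls p.operator++(0)
    return p.last_arg;
}

// Argument seen when the function is called by name.
int arg_from_explicit_call()
{
    Probe p;
    p.operator++(5);     // legal, but surprising to readers
    return p.last_arg;
}
```

Calling arg_from_operator_syntax() should give 0 and arg_from_explicit_call() should give 5.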

Note that your current implementation of postfix-++ does not respect this rule: the normal semantic is that the previous value should be returned; but your code returns the updated value.
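For comparison, a sketch of the conventional pattern, where postfix-++ saves a copy, increments, and returns the saved copy. Counter, value() and the helper functions are names made up for this example.

```cpp
// Hypothetical class showing the usual prefix/postfix pairing.
class Counter
{
    int m_i;

public:
    explicit Counter(int i) : m_i{i} {}

    // Prefix: increment, then return the object itself.
    Counter& operator++() { ++m_i; return *this; }

    // Postfix: save a copy, increment via prefix, return the saved copy.
    Counter operator++(int)
    {
        Counter old = *this;
        ++*this;
        return old;
    }

    int value() const { return m_i; }
};

// Value carried by the object that c++ returns (the previous value).
int value_returned_by_postfix()
{
    Counter c{1};
    return (c++).value();
}

// Value of c itself after c++ has run.
int value_after_postfix()
{
    Counter c{1};
    c++;
    return c.value();
}
```

Starting from 1, the returned object holds 1 while c itself holds 2, which matches how the built-in operator behaves on an int.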

++e++++;

For parsing this statement you need to know about these parsing rules:

  • Tokens are formed by "maximal munch", so the statement is tokenised as ++ e ++ ++ ; (and not as a series of unary-+ operators).
  • The language grammar determines from these tokens which expressions are the operand of which operators. This process can be summarised in a precedence table.
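A small illustration of maximal munch on built-in ints, so that no overloads are involved. MunchResult and munch_demo are hypothetical names used only here.

```cpp
// Holds the two values of interest after the expression has run.
struct MunchResult { int a; int sum; };

MunchResult munch_demo()
{
    int a = 1, b = 2;
    int sum = a+++b;   // tokenised as a ++ + b, i.e. (a++) + b
    return {a, sum};   // a has been incremented; sum used a's old value
}
```

Here sum ends up as 3 and a as 2, which would not be the case if the tokens were a + ++b.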

Consulting the table for that expression tells you: ++(((e++)++)). Using the expansion I mentioned earlier, this can be written in function call notation:

((e.operator++(0)).operator++(0)).operator++();

These functions need to be called left-to-right in this case, because a member function cannot be entered before the expression it's being called on has been evaluated.

So, supposing we had Example e(1); before this statement, the following function calls occur in this order:

  • e.operator++(int) - sets e.m_i to 2 and returns a temporary (I'll call it temp1 as pseudocode) with temp1.m_i as 2.
  • temp1.operator++(int) - sets temp1.m_i to 3, and returns temp2 whose m_i is 3.
  • temp2.operator++() - sets temp2.m_i to 4 and returns a reference to temp2.
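The sequence above can be checked with a sketch like the following. It keeps the operator bodies from the question's Example but swaps print() for a getter; value(), Trace and the two trace functions are additions made for this illustration.

```cpp
// Same operator bodies as in the question; value() is added for inspection.
class Example
{
    int m_i;

public:
    Example(int i) : m_i{i} {}

    Example  operator++(int) { m_i++; return *this; }  // copy of the updated object
    Example& operator++()    { m_i++; return *this; }

    int value() const { return m_i; }
};

// e: value left in the named object; result: value in the final temporary.
struct Trace { int e; int result; };

Trace trace_operator_syntax()
{
    Example e(1);
    int result = (++e++++).value();
    return {e.value(), result};
}

Trace trace_call_syntax()
{
    Example e(1);
    int result = ((e.operator++(0)).operator++(0)).operator++().value();
    return {e.value(), result};
}
```

Both functions should report e as 2 and result as 4: only the first postfix call touches e, and the remaining two increments land on temporaries.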

NB. My answer only talks about the overloaded operator being a member function. It's also possible to overload ++ (both forms) as a non-member. In that case, the behaviour would be unchanged from my description, but the "written in function call notation" expression would take different syntax.
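As a rough sketch of the non-member form, assuming a hypothetical Ticks type (Ticks, NonMemberResult and demo_nonmember are invented for this example):

```cpp
// Hypothetical type with a public counter.
struct Ticks { int n; };

// Prefix as a non-member: one parameter.
Ticks& operator++(Ticks& t) { ++t.n; return t; }

// Postfix as a non-member: the second parameter is the int dummy.
Ticks operator++(Ticks& t, int)
{
    Ticks old = t;
    ++t.n;
    return old;
}

struct NonMemberResult { int current; int old; };

NonMemberResult demo_nonmember()
{
    Ticks t{1};
    ++t;                // same as operator++(t)
    Ticks old = t++;    // same as operator++(t, 0)
    operator++(t);      // explicit prefix call
    operator++(t, 0);   // explicit postfix call
    return {t.n, old.n};
}
```

After the four increments t.n is 5, and old.n is 2. One difference worth noting: with lvalue-reference parameters as written here, chains such as ++e++++ would no longer compile, because a temporary cannot bind to a non-const lvalue reference.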

Upvotes: 4
