erjot

Reputation: 1412

Obfuscated way of accessing a character in a string

Today I found an interesting piece of code:

auto ch = (double(), (float(), int()))["\t\a\r\n\0"]["abcdefghij"];

which works the same as:

char str[] = "abcdefghij";
char ch = str['\t'];

Why is this even possible? In particular, why does the compiler pick the first char from the string and use it as a subscript instead of reporting an error?

Upvotes: 7

Views: 1241

Answers (2)

Tyler McHenry

Reputation: 76660

So first of all, all that double and float stuff is pure misdirection. The comma operator's return value is its right-side argument, so (double(), (float(), int())) boils down to just int(), although it creates and discards a double and a float value along the way. So consider:

 auto ch = int()["\t\a\r\n\0"]["abcdefghij"];

The first part of this that will be evaluated is

 int()["\t\a\r\n\0"]

Now, recognize that int() default-constructs an integer, which gives it the value 0. So the statement is equivalent to:

 0["\t\a\r\n\0"]

It's a fairly well-known trick in C and C++ that a[b] and b[a] are equivalent for the built-in subscript operator, since it is defined as a[b] === *(a + b) and addition is commutative. So this is really the same as:

 "\t\a\r\n\0"[0]

which is of course equal to '\t'. Now the full piece of code is:

 auto ch = '\t'["abcdefghij"];

which for the same reason is equivalent to:

 auto ch = "abcdefghij"['\t'];

This could of course also be written as

char str[] = "abcdefghij";
char ch = str['\t'];

if you give the "abcdefghij" string a name and forgo the C++0x auto keyword when declaring ch.

Finally, note that '\t' is equal to 9 since the tab character has ASCII value 9, so str['\t'] is the same as str[9]. str consists of 10 characters followed by a NUL terminator (\0), which is implicitly appended to the string literal it was initialized with, so index 9 is the last letter.
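The two facts in this paragraph can be checked at compile time. This is only a sketch: it assumes an ASCII-compatible character set, and the helper names char_at_tab and check_terminator are made up for illustration.

```cpp
#include <cassert>

// Assumption: ASCII-compatible execution character set, where tab is 9.
static_assert('\t' == 9, "tab is expected to have the value 9");

// 10 letters plus the implicitly appended '\0' terminator.
constexpr char str[] = "abcdefghij";
static_assert(sizeof(str) == 11, "10 letters + 1 terminator");

// Hypothetical helper: index the array with the tab character.
constexpr char char_at_tab() { return str['\t']; }
static_assert(char_at_tab() == 'j', "index 9 is the last letter");

// Hypothetical helper: runtime check that element 10 is the terminator.
inline void check_terminator() { assert(str[10] == '\0'); }
```

Some compilers warn about using a char as an array subscript here; that warning is expected, since a char index is exactly what the obfuscated code relies on.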

So in both cases the final value of ch is 'j'.
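To see that every stage of the rewrite above yields the same character, the stages can be put side by side. This is a small sketch, not part of the original code; the function names are hypothetical, and an ASCII-compatible character set is assumed so that '\t' is 9.

```cpp
#include <cassert>

// Stage 0: the original obfuscated expression.
inline char original() {
    return (double(), (float(), int()))["\t\a\r\n\0"]["abcdefghij"];
}

// Stage 1: comma operators reduced to their right-most operand.
inline char without_comma() {
    return int()["\t\a\r\n\0"]["abcdefghij"];
}

// Stage 2: int() replaced by 0 and both subscripts swapped around.
inline char swapped() {
    return "abcdefghij"["\t\a\r\n\0"[0]];
}

// Stage 3: the plain version with a named array and a numeric index.
inline char plain() {
    const char str[] = "abcdefghij";
    return str[9];
}

// Hypothetical helper: assert that all stages agree on 'j'.
inline void check_all() {
    assert(original() == 'j');
    assert(without_comma() == 'j');
    assert(swapped() == 'j');
    assert(plain() == 'j');
}
```

A compiler may warn that the left operands of the comma operators have no effect, which is precisely the misdirection described at the start.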

Upvotes: 12

Yakov Galka

Reputation: 72479

I'll explain it as a series of rewrites:

auto ch = (double(), (float(), int()))["\t\a\r\n\0"]["abcdefghij"];

is equivalent to (just evaluate all the double, float and int temporaries joined by the comma operator)

auto ch = (0["\t\a\r\n\0"])["abcdefghij"];

Now the standard says that:

x[y] == *(x + y)

no matter which one is the pointer, so you get:

0["\t\a\r\n\0"] == "\t\a\r\n\0"[0] == '\t';
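The identity can be illustrated by comparing addresses: all four spellings should name the same element. This is a sketch with hypothetical helper names, and it applies only to the built-in subscript on pointers and arrays, not to an overloaded operator[].

```cpp
#include <cassert>

// Hypothetical helper: true if p[i], i[p], *(p + i) and *(i + p)
// all refer to the same object (same address).
inline bool same_element(const char* p, int i) {
    return &p[i] == &i[p]
        && &p[i] == &*(p + i)
        && &p[i] == &*(i + p);
}

// The inner subscript from the question: 0["..."] is "..."[0].
inline char first_char() { return 0["\t\a\r\n\0"]; }

// Hypothetical helper: run the checks on the string from the question.
inline void check_identity() {
    const char* s = "\t\a\r\n\0";
    assert(same_element(s, 0));
    assert(same_element(s, 3));
    assert(first_char() == '\t');
}
```

Applying the same swap to the outer subscript, '\t'["abcdefghij"] becomes "abcdefghij"['\t'].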

Upvotes: 8
