Reputation: 143
This code is from Hacker's Delight. It says this is the shortest self-reproducing program (quine) in C, at 64 characters in length, but I don't understand it:
main(a){printf(a,34,a="main(a){printf(a,34,a=%c%s%c,34);}",34);}
I tried to compile it. It compiles with 3 warnings and no errors.
Upvotes: 13
Views: 1134
Reputation: 106012
This program relies on the assumptions that main and its parameter a are int by default and that the argument a="main(a){printf(a,34,a=%c%s%c,34);}" will be evaluated first. It invokes undefined behavior: the order of evaluation of the arguments of a function is not guaranteed in C.
Nevertheless, the program works as follows:
The assignment expression a="main(a){printf(a,34,a=%c%s%c,34);}" assigns the string "main(a){printf(a,34,a=%c%s%c,34);}" to a, and the value of the assignment expression is also "main(a){printf(a,34,a=%c%s%c,34);}", as per the C standard (C11, 6.5.16):
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment [...]
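The quoted rule can be checked directly. The following is only a minimal sketch: it uses a properly typed const char * instead of the quine's implicit int, and the helper name assignment_yields_stored_value is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the value of the assignment
   expression (a = fmt) is the same value that was stored in a. */
int assignment_yields_stored_value(const char *fmt)
{
    const char *a = NULL;
    const char *value;

    assert(fmt != NULL);      /* precondition for this sketch */
    value = (a = fmt);        /* the assignment is itself an expression */
    return value == a && strcmp(value, fmt) == 0;
}
```

Passing the quine's format string to this helper should return 1, which is the property the program depends on when it uses a=... as a printf argument.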
Keeping in mind the above semantics of the assignment operator, the program effectively expands to
main(a){
    printf("main(a){printf(a,34,a=%c%s%c,34);}",34,a="main(a){printf(a,34,a=%c%s%c,34);}",34);
}
ASCII 34 is ". The specifiers and their corresponding arguments:
%c ---> 34
%s ---> "main(a){printf(a,34,a=%c%s%c,34);}"
%c ---> 34
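The mapping above can be reproduced without relying on evaluation order. This is a sketch under stated assumptions, not the original program: the format string is kept in a typed array, snprintf is used instead of printf so the result can be compared, and the names QUINE_FMT, render_quine and quine_reproduces are hypothetical. It also assumes an ASCII execution character set, as the original does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format string copied from the question. */
static const char QUINE_FMT[] = "main(a){printf(a,34,a=%c%s%c,34);}";

/* Hypothetical helper: writes the text the quine is meant to print into
   out (at most n bytes, including the terminator) and returns snprintf's
   result, i.e. the length of the complete text. */
int render_quine(char *out, size_t n)
{
    assert(out != NULL && n > 0);
    /* %c <- 34 (the double quote), %s <- the format string, %c <- 34 */
    return snprintf(out, n, QUINE_FMT, 34, QUINE_FMT, 34);
}

/* Hypothetical helper: returns 1 if the rendered text equals source. */
int quine_reproduces(const char *source)
{
    char buf[128];
    int len = render_quine(buf, sizeof buf);

    return len >= 0 && (size_t)len < sizeof buf && strcmp(buf, source) == 0;
}
```

Calling quine_reproduces with the 64-character program from the question as its argument should return 1, and render_quine should report a length of 64.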
A better version would be
main(a){a="main(a){a=%c%s%c;printf(a,34,a,34);}";printf(a,34,a,34);}
It is 4 characters longer, but at least it follows K&R C: the assignment is now a separate statement, so it is sequenced before the call to printf.
Upvotes: 8
Reputation: 302942
This works based on lots of quirks that C allows you to do, and some undefined behavior that happens to work in your favor. In order:
main(a) { ...
Types are assumed to be int if unspecified, so this is equivalent to:
int main(int a) { ...
main is supposed to take either 0 or 2 arguments, so this is undefined behavior, but in practice it is allowed and the missing second argument is simply ignored.
Next, the body, which I will space out. Note that a is an int, as per main:
printf(a,
       34,
       a = "main(a){printf(a,34,a=%c%s%c,34);}",
       34);
The order of evaluation of arguments is unspecified, but we're relying on the 3rd argument, the assignment, getting evaluated first. We're also relying on the undefined behavior of being able to assign a char * to an int. Also, note that 34 is the ASCII value of ". Thus, the intended effect of the program is:
int main(int a, char **argv) {
    printf("main(a){printf(a,34,a=%c%s%c,34);}",
           '"',
           "main(a){printf(a,34,a=%c%s%c,34);}",
           '"');
    return 0; // also left off
}
Which, when evaluated, produces:
main(a){printf(a,34,a="main(a){printf(a,34,a=%c%s%c,34);}",34);}
which was the original program. Tada!
Upvotes: 4
Reputation: 138051
It relies on several quirks of the C language and (what I think is) undefined behavior.
First, it defines the main function. In pre-C99 C, it is legal to define a function without a return type or parameter types, and they are presumed to be int. This is why the main(a){ part works.
Then, it calls printf with 4 arguments. Since printf has no prototype in scope, it is implicitly declared as returning int with unspecified parameters (unless your compiler implicitly declares it otherwise, like Clang does).
The first argument is a, which is presumed to be an int and holds argc at the beginning of the program. The second argument is 34 (which is ASCII for the double-quote character). The third argument is an assignment expression that assigns the format string to a and evaluates to it. It relies on a pointer-to-int conversion, which C compilers accept, typically with a warning. The last argument is another quote character in numeric form.
At runtime, the %c format specifiers are substituted with quotes, the %s is substituted with the format string, and you get the original source again.
As far as I know, the order of argument evaluation is unspecified. This quine works because the assignment a="main(a){printf(a,34,a=%c%s%c,34);}" is evaluated before a is passed as the first argument to printf, but as far as I know, there is no rule to enforce it. Additionally, this is unlikely to work on 64-bit platforms where int is 32 bits, because the pointer-to-int conversion will truncate the pointer to a 32-bit value. As a matter of fact, even though I can see how it works on some platforms, it doesn't work on my computer with my compiler.
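The truncation concern can be made concrete with a small check. This is a rough sketch, not part of the original program: it narrows through unsigned int rather than int so that the arithmetic stays well defined, it assumes uintptr_t is available (it is optional in the standard but widely provided), and all three function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if unsigned int has fewer value bits
   than an object pointer converted to uintptr_t on this platform. */
int int_is_narrower_than_pointer(void)
{
    return UINT_MAX < UINTPTR_MAX;
}

/* Hypothetical helper: returns 1 if an address with these bits survives
   being squeezed through an unsigned int, 0 if high bits are lost. */
int address_fits_in_int(uintptr_t bits)
{
    unsigned int narrowed = (unsigned int)bits;   /* keeps only the low bits */

    return (uintptr_t)narrowed == bits;
}

/* Convenience wrapper for a real pointer, such as a string literal. */
int pointer_fits_in_int(const void *p)
{
    assert(p != NULL);
    return address_fits_in_int((uintptr_t)p);
}
```

On a platform where int_is_narrower_than_pointer() returns 1, the quine can only work if pointer_fits_in_int happens to return 1 for the address of its string literal, which depends on where the toolchain places that literal.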
Upvotes: 5
Reputation: 180201
The program is supposed to print its own code. Note the similarity of the string literal to the overall program code. The idea is that the literal will be used as the printf() format string because its value is assigned to variable a (albeit in the argument list) and that it will also be passed as the string to print (because an assignment expression evaluates to the value that was assigned). The 34 is the ASCII code for the double quote character ("); using it avoids a format string containing escaped literal quotation mark characters.
The code relies on unspecified behavior in the form of the order of evaluation of the function arguments. If they are evaluated in argument list order, then the program is likely to fail because the value of a would then be used as a pointer to the format string before the correct value was actually assigned to it.
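The order a particular compiler picks can be observed with a well-defined experiment. The following is a minimal sketch: each argument is a function call that records a tag, so the calls are indeterminately sequenced rather than unsequenced, and the names note, take_three and observed_argument_order are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

static char order_log[4];   /* three tags plus the terminator */
static size_t order_len;

/* Records that the argument labelled tag has just been evaluated. */
static int note(char tag)
{
    assert(order_len < sizeof order_log - 1);
    order_log[order_len++] = tag;
    return tag;
}

/* Stands in for a call with several arguments; the values are unused. */
static int take_three(int first, int second, int third)
{
    (void)first;
    (void)second;
    (void)third;
    return 0;
}

/* Hypothetical helper: returns the order in which this compiler evaluated
   the three arguments, as a string containing the tags 1, 2 and 3. */
const char *observed_argument_order(void)
{
    order_len = 0;
    memset(order_log, 0, sizeof order_log);
    take_three(note('1'), note('2'), note('3'));
    return order_log;
}
```

The returned string is some permutation of the three tags. The standard permits any of them, and the quine only prints itself when the compiler happens to evaluate its third argument before its first.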
Additionally, the type of a defaults to int, and there is no guarantee that int is wide enough to hold an object pointer without truncating it.
Furthermore, the C standard specifies only two permitted signatures for main(), and the signature used is not among them.
Moreover, the type of printf() inferred by the compiler in the absence of a prototype is incorrect. It is by no means guaranteed that the compiler will generate a calling sequence that works for it.
Upvotes: 2