bli00

Reputation: 2827

How does C parse command line arguments?

I'm curious as to what C does exactly to parse command line arguments. For example, assume I have a program named myProgram that takes in two arguments like this

./myProgram arg1 arg2

If I were to call

./myProgram arg1$'\0otherstuff' arg2

arg1 and arg2 would still print if we were to print argv[1] and argv[2], ignoring $'\0otherstuff', but where does it go? Is it stored in memory behind arg1? Could it potentially overwrite any buffer? How is arg2 read if there's a null character before it?

Upvotes: 0

Views: 658

Answers (3)

Eric Postpischil

Reputation: 224596

Experimenting with bash (version 3.2.57(1)-release (x86_64-apple-darwin17)) suggests that the “otherstuff” in your example is not passed to the program. When a program is called with the command line you show, the memory pointed to by argv[1] contains “arg1”, then a null character, then “arg2”. Thus, the null and “otherstuff” in your command line have not been passed to the program.
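
To repeat this kind of experiment, a program can report each argument it receives together with its length. The following is a minimal sketch; the helper name dump_args is made up for illustration and is not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Prints every argument with its length and returns how many were found.
   It walks argv until the terminating null pointer, so argc is not needed. */
int dump_args(char *const argv[])
{
    int count = 0;
    for (; argv[count] != NULL; count++) {
        printf("argv[%d] = \"%s\" (%zu bytes)\n",
               count, argv[count], strlen(argv[count]));
    }
    return count;
}
```

Calling dump_args(argv) from main and running the program with the command line from the question shows whether anything after the null reached the program.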

(Hypothetically: If the shell were to pass it to the program, I would expect it would pass it in the memory continuing from that pointed to by argv[1], and there would be no danger of it overwriting any buffer. If the shell were designed to tolerate an embedded null character in an argument, I expect (based on how we design things) that it would treat the argument as a complete string and provide the necessary space to hold it.)

The fact that the argument prior to “arg2” contains a null character is irrelevant to the handling of “arg2”. After initial processing of the command line, the shell does not treat the line as one string. It has divided it into words or other units and handles them with its own data structures. So the presence of null characters in prior arguments has no effect on later arguments.

Additionally, it may not be possible for the shell to pass an argument containing an embedded null character. The routines typically used to execute a program, such as execl, accept the arguments as null-terminated strings. So the embedded null terminates the string, and the execl routine never passes anything beyond the null character.
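
The truncation can be seen without running another program at all. In this small sketch (the function names are hypothetical), the array stores every byte of the literal, but anything that treats it as a null-terminated string stops at the first null, which is the only view a routine like execl gets of its arguments.

```c
#include <stdio.h>
#include <string.h>

/* Returns how many bytes a string-based routine would see in `arg`. */
size_t bytes_seen_as_string(const char *arg)
{
    return strlen(arg);
}

/* Prints the stored size of a literal next to what a string routine sees. */
void show_truncation(void)
{
    /* The array keeps the bytes after the embedded null, but a pointer
       to it carries no length, so a callee cannot know they are there. */
    static const char raw[] = "arg1\0otherstuff";
    printf("stored: %zu bytes, seen as a string: %zu bytes (\"%s\")\n",
           sizeof raw, bytes_seen_as_string(raw), raw);
}
```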

Upvotes: 0

KamilCuk

Reputation: 142080

Converting ./myProgram arg1 arg2 into a C style int argc, char *argv[] is done by the operating system or by the shell (it depends). C does not parse the arguments, you parse the arguments in C. C is a programming language, not an entity. The form int argc, char *argv[] is used in the C programming language as the arguments passed to the main function, but other programming languages may use a different form, for C see main_function.
On Linux, one may use the execve system call to specify the arguments passed to a program. Parsing from the form ./myProgram arg1 arg2 to execve arguments is done by the shell (e.g. bash), which constructs the argv array and passes it to the execve call.
Your shell is probably ignoring the part $'\0otherstuff', because under POSIX a filename cannot contain the NUL character (assuming your shell is POSIX compatible).
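
As a rough sketch of the execve step described above, assuming a POSIX system: the function below does what a shell does once it has already split the command line into words, namely fork a child and hand it a null-terminated array of null-terminated strings. The name run_program is made up for this example, and a real shell does much more (PATH lookup, redirections, job control).

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ; /* the caller's environment, passed through unchanged */

/* Runs `path` with the given argument vector and waits for it to finish.
   Returns the child's exit status, or -1 if it could not be obtained. */
int run_program(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, environ);
        _exit(127); /* only reached if execve itself failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the command line in the question, the array would be { "./myProgram", "arg1", "arg2", NULL }. Each element is a plain char pointer with no length attached, so there is nowhere to record bytes that follow an embedded NUL.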

Upvotes: 1

Domso

Reputation: 970

When calling an executable, your OS kernel will take the additional arguments (as plain text) and copy them into the program's memory. Before the main function is called, a small piece of startup code is executed, which passes the given arguments to the actual main function in C.
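
To illustrate the idea only (real startup code is platform specific, and on many systems the kernel already lays out the pointer array itself), here is a sketch that assumes the arguments arrive as null-terminated strings stored back to back. The function build_argv is hypothetical; it builds the pointer array that main would receive.

```c
#include <stddef.h>
#include <string.h>

/* Fills argv[0..argc] with pointers into `packed`, which must hold argc
   null-terminated strings stored one after another. The caller provides
   room for argc + 1 pointers. */
void build_argv(char *packed, int argc, char *argv[])
{
    char *p = packed;
    for (int i = 0; i < argc; i++) {
        argv[i] = p;
        p += strlen(p) + 1; /* step over the string and its terminator */
    }
    argv[argc] = NULL; /* the C standard requires argv[argc] to be null */
}
```

With this layout, each null character simply marks where one argument ends and the next begins, which is why the null after arg1 does not get in the way of reading arg2.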

Upvotes: 0
