Reputation: 43
I am writing a C program which will take a list of commands from stdin and exec them. I am having unexpected results from using strcmp after reading in from stdin.
Here is my program, test_execvp.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char const *argv[])
{
    char * line_buffer[100];
    size_t line_len = 0; /* start from a defined value before the first getline call */
    int cmd_count = 0;
    char * cmd_buffer[100][100];

    /* Read one command per line; getline allocates each line buffer. */
    for( line_buffer[cmd_count] = NULL; getline(&line_buffer[cmd_count], &line_len, stdin) > 0; line_buffer[++cmd_count] = NULL)
    {
        /* Strip the trailing line ending. */
        line_buffer[cmd_count][strcspn(line_buffer[cmd_count], "\r\n")] = 0;
        int cmd = 0;
        /* Split the line on spaces into a NULL-terminated token list. */
        while( (cmd_buffer[cmd_count][cmd] = strsep(&line_buffer[cmd_count], " ")) != NULL )
        {
            cmd++;
        }
    }

    printf("cmd_buffer[0][0]: \"%s\"\n", cmd_buffer[0][0]);
    printf("cmd_buffer[0][1]: \"%s\"\n", cmd_buffer[0][1]);
    printf("cmd_buffer[0][2]: \"%s\"\n", cmd_buffer[0][2]);
    printf("strcmp(cmd_buffer[0][1], \"-i\") == %d\n", strcmp(cmd_buffer[0][1], "-i") );
    printf("strcmp(cmd_buffer[0][1], \"-o\") == %d\n", strcmp(cmd_buffer[0][1], "-o") );
    return 0;
}
Now see this output:
Emil@EMIL-HP ~/Emil
$ gcc test_execvp.c -o test_execvp
Emil@EMIL-HP ~/Emil
$ cat cmdfile2
./addone –i add.txt
./addone
./addone –o add.txt
Emil@EMIL-HP ~/Emil
$ ./test_execvp < cmdfile2
cmd_buffer[0][0]: "./addone"
cmd_buffer[0][1]: "–i"
cmd_buffer[0][2]: "add.txt"
strcmp(cmd_buffer[0][1], "-i") == 181
strcmp(cmd_buffer[0][1], "-o") == 181
I don't understand how the line:
printf("strcmp(cmd_buffer[0][1], \"-i\") == %d\n", strcmp(cmd_buffer[0][1], "-i") );
can produce the output:
strcmp(cmd_buffer[0][1], "-i") == 181
if the line:
printf("cmd_buffer[0][1]: \"%s\"\n", cmd_buffer[0][1]);
produces the output:
cmd_buffer[0][1]: "–i"
Upvotes: 3
Views: 219
Reputation: 215517
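Your text file contains a Unicode homoglyph for - rather than an actual ASCII hyphen-minus (0x2d). This is clear since 181 + '-' is 0xe2, the lead byte of a 3-byte UTF-8 sequence. Your strcmp evidently returns the difference between the first pair of mismatching bytes, so the first byte of cmd_buffer[0][1] is 0xe2, not 0x2d.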
Upvotes: 2
Reputation: 241911
If cmd_buffer[0][1] were "-i", then strcmp would return 0. But it's not. Look closely and you will see that it is "–i", which starts with a different character: the dash is visibly longer than - and takes several bytes in UTF-8.
Upvotes: 2