Bill Evans at Mariposa

Reputation: 3858

python: obtaining the OS's argv[0], not sys.argv[0]

(This question was asked here, but the answer was Linux-specific; I'm running on FreeBSD and NetBSD systems which (EDIT: ordinarily) do not have /proc.)

Python seems to dumb down argv[0], so a script doesn't see what was actually passed to the process, the way a C program does. To be fair, sh, bash, and Perl are no better. Is there any way to work around this so that my Python programs can get the original value?

I have administrative privileges on this FreeBSD system and can do things like changing everyone's default PATH environment variable to point to some other directory ahead of the one that contains python2 and python3, but I have no control over creating /proc.

I have a script which illustrates the problem. First, the script's output:

the C child program gets it right: arbitrary-arg0 arbitrary-arg1
the python2 program dumbs it down: ['./something2.py', 'arbitrary-arg1']
the python3 program dumbs it down: ['./something3.py', 'arbitrary-arg1']
the sh script       dumbs it down: ./shscript.sh arbitrary-arg1
the bash script     dumbs it down: ./bashscript.sh arbitrary-arg1
the perl script drops arg0:        ./something.pl arbitrary-arg1

... and now the script:

#!/bin/sh

set -e
rm -rf work
mkdir work
cd work
cat > childc.c << EOD; cc childc.c -o childc
#include <stdio.h>
int main(int    argc,
         char **argv
        )
{
  printf("the C child program gets it right: ");
  printf("%s %s\n",argv[0],argv[1]);
}
EOD
cat > something2.py <<EOD; chmod 700 something2.py
#!/usr/bin/env python2
import sys
print "the python2 program dumbs it down:", sys.argv
EOD
cat > something3.py <<EOD; chmod 700 something3.py
#!/usr/bin/env python3
import sys
print("the python3 program dumbs it down:", sys.argv)
EOD
cat > shscript.sh <<EOD; chmod 700 shscript.sh
#!/bin/sh
echo "the sh script       dumbs it down:" \$0 \$1
EOD
cat > bashscript.sh <<EOD; chmod 700 bashscript.sh
#!/usr/bin/env bash
echo "the bash script     dumbs it down:" \$0 \$1
EOD
cat > something.pl <<EOD; chmod 700 something.pl
#!/usr/bin/env perl
print("the perl script drops arg0:        \$0 \$ARGV[0]\n")
EOD
cat > launch.c << EOD; cc launch.c -o launch; ./launch
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int    argc,
         char **argv,
         char **arge)
{
  int    child_status;
  size_t program_index;
  pid_t  child_pid;

  char  *program_list[]={"./childc",
                         "./something2.py",
                         "./something3.py",
                         "./shscript.sh",
                         "./bashscript.sh",
                         "./something.pl",
                         NULL
                        };

  char  *some_args[]={"arbitrary-arg0","arbitrary-arg1",NULL};

  for(program_index=0;
      program_list[program_index];
      program_index++
     )
  {
    child_pid=fork();

    if(child_pid<0)
    {
      perror("fork()");
      exit(1);
    }
    if(child_pid==0)
    {
      execve(program_list[program_index],some_args,arge);
      perror("execve");
      exit(1);
    }
    wait(&child_status);
  }

  return 0;
}
EOD

Upvotes: 7

Views: 1003

Answers (3)

vz0

Reputation: 32923

Use Python's ctypes module to get the "program name", which by default is set to argv[0]. See Python source code here. For example (under Python 3, where Py_GetProgramName returns a wide-character string, hence c_wchar_p):

import ctypes

GetProgramName = ctypes.pythonapi.Py_GetProgramName
GetProgramName.restype = ctypes.c_wchar_p

def main():
    print(GetProgramName())

if __name__ == '__main__':
    main()

Running the command prints:

$ exec -a hello python3 name.py 
hello
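
The `exec -a` used above is a shell feature. As a rough sketch, a similar launch can be done from Python with `subprocess.run` and its `executable` parameter: on POSIX, `executable` names the file to run, while the first element of the argument list becomes the child's argv[0]. The helper name `run_with_arg0` is made up for illustration.

```python
import subprocess


def run_with_arg0(program, arg0, *args):
    """Run `program`, passing `arg0` as the child's argv[0] (POSIX only).

    `run_with_arg0` is a hypothetical helper, not part of any library.
    """
    # `executable` is the file that actually gets executed; the first
    # element of the argument list is what the child sees as argv[0].
    completed = subprocess.run([arg0, *args], executable=program)
    return completed.returncode
```

For instance, `run_with_arg0("/usr/local/bin/python3", "hello", "name.py")` would roughly correspond to the `exec -a hello python3 name.py` line above, except that it starts a child process instead of replacing the shell. The interpreter path is an assumption about the local install.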

Upvotes: 1

kabanus

Reputation: 25980

I think the path of least resistance here is a bit hacky, but it would probably work on any OS. Basically, you double-wrap your Python calls. First (using Python 3 as an example), the python3 in your PATH is replaced by a small C program, which you know you can trust:

#include<stdlib.h>
#include<string.h>
int main(int argc, char **argv) {
    // The python3 below should be replaced by the full path to the original
    // interpreter. In my tests I named this program wrap_python, so there was
    // no problem, but if you are changing this system-wide (and calling the
    // wrapper python3), you can't leave it as is.
    const char *const program = "python3 wrap_python.py";
    size_t size = strlen(program) + 1; // + 1 for the terminating null character
    for(int count = 0; count < argc; ++count)
        size += strlen(argv[count]) + 1; // + 1 for space

    char *cmd = malloc(size);
    if(!cmd) exit(-1);
    cmd[0] = '\0';
    strcat(cmd, program);
    for(int count = 1; count < argc; ++count) {
        strcat(cmd, " ");
        strcat(cmd, argv[count]);
    }
    strcat(cmd, " ");
    strcat(cmd, argv[0]);
    return system(cmd);
}

You can make this faster, but hey, premature optimization?

Note that we are calling a script called wrap_python.py (you would probably need a full path here). We want to pass the "true" argv, but we need to do some work in the Python context to make it transparent. The true argv[0] is passed as the last argument, and wrap_python.py is:

from sys import argv
argv[0] = argv.pop(-1)
print("Passing:", argv) # Delete me
exit(exec(open(argv[1]).read())) # Different in Python 2. Close the file handle if you're pedantic.

Our small wrapper replaces argv[0] with the one provided by our C wrapper, removing it from the end, and then manually executes the script in the same context. Specifically, __name__ == '__main__' is true.

This would be run as

python3 my_python_script arg1 arg2 etc...

where the python3 found on your PATH is now the C wrapper program. Testing this on

import sys
print(__name__)
print("Got", sys.argv)

yields

__main__
Got ['./wrap_python', 'test.py', 'hello', 'world', 'this', '1', '2', 'sad']

Note I called my program wrap_python - you want to name it python3.
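
The argv shuffling done by wrap_python.py can be isolated in a small pure function, which makes it easy to check by hand. This is only a sketch: `restore_argv` is a name invented here, and it assumes the C wrapper always appends the true argv[0] as the last argument.

```python
def restore_argv(argv):
    """Return a copy of argv with the last element moved to position 0.

    Assumes the layout produced by the C wrapper above:
    [wrapper script, target script, arg1, ..., true argv[0]].
    `restore_argv` is a hypothetical helper name.
    """
    if len(argv) < 2:
        raise ValueError("expected at least the wrapper script and the true argv[0]")
    # Drop the wrapper script's own name and put the true argv[0] in its place.
    return [argv[-1]] + argv[1:-1]


# The kind of list wrap_python.py would receive from the C wrapper.
received = ["wrap_python.py", "test.py", "hello", "world", "./wrap_python"]
print(restore_argv(received))
# → ['./wrap_python', 'test.py', 'hello', 'world']
```

Unlike the in-place `argv[0] = argv.pop(-1)`, this returns a new list, so a caller would still have to assign the result back into `sys.argv[:]` before executing the target script.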

Upvotes: 1

Bill Evans at Mariposa

Reputation: 3858

What follows is a generally useful answer to what I meant to ask.

The answer that kabanus gave is excellent, given the way I phrased the problem, so of course he gets the up-arrow and the checkmark. The transparency is a beautiful plus, in my opinion.

But it turns out that I didn't specify the situation completely. Each python script starts with a shebang, and the shebang feature makes it more complicated to launch a python script with an artificial argv[0].

Also, transparency isn't my goal; backward compatibility is. I would like the normal situation to be that sys.argv works as shipped, right out of the box, without my modifications. Also, I would like any program which launches a python script with an artificial argv[0] not to have to worry about any additional argument manipulation.

Part of the problem is to overcome the "shebang changing argv" problem.

The answer is to write a wrapper in C for each script, and to have the launching program launch that wrapper instead of the actual script. The actual script then looks at the arguments of its parent process (the wrapper).

The cool thing is that this can work for script types other than python. You can download a proof of concept here which demonstrates the solution for python2, python3, sh, bash, and perl. You'll have to change each CRLF to LF, using dos2unix or fromdos. This is how the python3 script handles it:

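import os
import subprocess

def get_arg0():
    # Ask ps for the parent (wrapper) process's argument list; keep its first word.
    return subprocess.run("ps -p %s -o 'args='" % os.getppid(),
                          shell=True,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE
                         ).stdout.decode(encoding='latin1').split(sep=" ")[0]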

The solution does not rely on /proc, so it works on FreeBSD as well as Linux.
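
As a variation on get_arg0, the sketch below passes ps its arguments as a list instead of going through a shell, and keeps the parsing in a separate function. The names `first_word` and `parent_arg0` are illustrative only, and it assumes the local ps accepts `-p` and `-o args=` as in the snippet above. Because ps prints the arguments joined by spaces, an argv[0] that itself contains a space cannot be recovered this way.

```python
import os
import subprocess


def first_word(ps_output):
    """Return the text before the first space in ps's args column."""
    return ps_output.strip().split(" ")[0]


def parent_arg0():
    """Return argv[0] of the parent process as reported by ps.

    Hypothetical helper; assumes ps supports -p PID and -o args=.
    """
    result = subprocess.run(
        ["ps", "-p", str(os.getppid()), "-o", "args="],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # latin1 maps every byte to a character, so decoding never fails.
    return first_word(result.stdout.decode("latin1"))
```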

Upvotes: 2
