python152

Reputation: 1951

Limit number of arguments passed in

I am using argparse to take a list of input files:

import argparse
p = argparse.ArgumentParser()
p.add_argument("infile", nargs='+', type=argparse.FileType('r'),  help="copy from")
p.add_argument("outfile", help="copy to")
args = p.parse_args()

However, this lets a user run prog /path/to/* outfile, where the source directory could contain millions of files, and the shell expansion could overwhelm the parser. My questions are:

  1. Is there a way to disable the shell expansion (*) from within the script?

  2. If not, is there a way to cap the number of input files before they are assembled into a list?

Upvotes: 0

Views: 1769

Answers (2)

hpaulj

Reputation: 231510

If you are concerned about too many infile values, don't use FileType.

p.add_argument("infile", nargs='+', help="copy from")

Just accept a list of file names. That's not going to cost you much. Then you can open and process just as many of the files as you want.

FileType opens the file when the name is parsed. That is OK for a few files that you will use right away in a small script. But usually you don't want, or need, to have all those files open at once. In modern Python you are encouraged to open files in a with context, so they get closed right away (instead of hanging around until the script is done).
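As a rough sketch of that approach, the following takes plain file names, caps how many are processed, and opens each input in its own with block. The names MAX_INFILES and copy_files are made up for this illustration, and the cap of 1000 is an arbitrary assumption.

```python
import argparse

MAX_INFILES = 1000  # hypothetical cap, pick whatever suits the tool


def build_parser():
    p = argparse.ArgumentParser()
    # Plain strings: nothing is opened while parsing.
    p.add_argument("infile", nargs="+", help="copy from")
    p.add_argument("outfile", help="copy to")
    return p


def copy_files(names, outfile, limit=MAX_INFILES):
    """Write the contents of at most `limit` input files to outfile."""
    copied = 0
    with open(outfile, "w") as dst:
        for name in names[:limit]:
            # Only one input file is open at any moment.
            with open(name) as src:
                dst.write(src.read())
            copied += 1
    return copied


# Parsing an explicit list here; a real script would call parse_args()
# with no arguments and then copy_files(args.infile, args.outfile).
args = build_parser().parse_args(["a.txt", "b.txt", "out.txt"])
```

Because the names are only strings, parsing stays cheap even for a long list, and the script decides how many files to touch.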

FileType handles the special '-' value, which stands for stdin. And it will issue a nice error report if it fails to open a file. But is that what you want? Or would you rather process each file, skipping over the bad names?
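If skipping bad names is the behaviour you want, it could look something like this. The function name read_all is invented for the example, and treating '-' as stdin simply imitates what FileType does for input files.

```python
import sys


def read_all(names):
    """Return (contents, skipped): text of readable inputs and the names that failed."""
    contents, skipped = [], []
    for name in names:
        if name == "-":
            # Mimic FileType's convention: "-" means standard input.
            contents.append(sys.stdin.read())
            continue
        try:
            with open(name) as src:
                contents.append(src.read())
        except OSError as exc:
            # Report and move on instead of aborting the whole run.
            print(f"skipping {name}: {exc}", file=sys.stderr)
            skipped.append(name)
    return contents, skipped
```

With FileType, the first unreadable name would end the run during parsing; here the decision is left to your own code.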

Overall FileType is a convenience, but generally a poor choice in serious applications.

Something else to be worried about: outfile is the last of a (potentially) long list of files, the '+' input ones and 1 more. argparse accepts that, but it could cause problems. For example, what if the user forgets to provide an 'outfile'? Then the last of the input files will be used as the outfile. That error could result in unintentionally overwriting a file. It may be safer to use '-o', '--outfile', making the user explicitly mark the outfile. And the user could give it first, so he doesn't forget.
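A minimal sketch of that layout, assuming you want the output file to be mandatory (required=True is my addition, not something the question asked for):

```python
import argparse


def build_parser():
    p = argparse.ArgumentParser()
    # required=True makes argparse reject a command line with no -o/--outfile.
    p.add_argument("-o", "--outfile", required=True, help="copy to")
    p.add_argument("infile", nargs="+", help="copy from")
    return p


# The output file is marked explicitly, so no input name can be mistaken for it.
args = build_parser().parse_args(["-o", "out.txt", "a.txt", "b.txt"])
```

Leaving out -o now produces a usage error instead of silently treating the last input as the output.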

In general '+' and '*' positionals are safest when used last.

Upvotes: 0

Klaus D.

Reputation: 14369

(1) No, the expansion is done by the shell. By the time Python runs, the command line has already been expanded. Quoting the pattern as "*" or '*' suppresses the expansion, but that too happens in the shell.

(2) Yes, get the length of sys.argv early in your code and exit if it is too long.
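For example, something along these lines could go at the very top of the script, before argparse is involved. MAX_ARGS and check_arg_count are names made up for this sketch, and the limit of 1000 is arbitrary.

```python
import sys

MAX_ARGS = 1000  # hypothetical limit


def check_arg_count(argv, limit=MAX_ARGS):
    """Return an error message if argv has too many entries, else None."""
    # argv[0] is the program name, so the real arguments are argv[1:].
    count = len(argv) - 1
    if count > limit:
        return f"too many arguments: {count} (limit {limit})"
    return None


error = check_arg_count(sys.argv)
if error:
    # sys.exit with a string prints it to stderr and exits with status 1.
    sys.exit(error)
```

Note that sys.argv is already a complete list at this point, so this check avoids the parsing work rather than the cost of building the list.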

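Also, the operating system usually caps the total size of the argument list (ARG_MAX on POSIX systems), so a very large expansion fails with "Argument list too long" before Python even starts.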

Upvotes: 1
