Reputation: 49
I spent a few hours reading tutorials about argparse and managed to learn how to use normal parameters. The official documentation is not very readable to me, as I'm new to Python. I'm trying to write a program that can be invoked in the following ways:
cat inFile | program [options] > outFile
-- If no inFile or outfile is specified, read from stdin and output to stdout.
program [options] inFile outFile
program [options] inFile > outFile
-- If only one file is specified it is input and output should go to stdout.
cat inFile | program [options] - outFile
-- If '-' is given in place of inFile, read from stdin.
program [options] /path/to/folder outFile
-- Process all files from /path/to/folder and its subdirectories.
I want it to behave like a regular CLI program under GNU/Linux.
It would also be nice if the program could be invoked as:
program [options] inFile0 inFile1 ... inFileN outFile
-- first path/file always interpreted as input, last one always interpreted as output. Any additional ones interpreted as inputs.
I could probably write dirty code that would accomplish this but this is going to be used, so someone will end up maintaining it (and he will know where I live...).
Any help/suggestions are much appreciated.
Combining the answers with some more knowledge from the Internet, I've managed to write this (it does not accept multiple inputs, but this is enough):
import sys, argparse, os.path, glob

def inputFile(path):
    # '-' means stdin; a file is returned as-is; a directory is searched recursively for *.dat
    if path == "-":
        return [sys.stdin]
    elif os.path.exists(path):
        if os.path.isfile(path):
            return [path]
        else:
            return [y for x in os.walk(path) for y in glob.glob(os.path.join(x[0], '*.dat'))]
    else:
        sys.exit(2)

def main(argv):
    cmdArgsParser = argparse.ArgumentParser()
    cmdArgsParser.add_argument('inFile', nargs='?', default='-', type=inputFile)
    cmdArgsParser.add_argument('outFile', nargs='?', default='-', type=argparse.FileType('w'))
    # parse the list that was passed in instead of silently falling back to sys.argv
    cmdArgs = cmdArgsParser.parse_args(argv)
    print(cmdArgs.inFile)
    print(cmdArgs.outFile)

if __name__ == "__main__":
    main(sys.argv[1:])
Thank you!
Upvotes: 4
Views: 765
Reputation: 231665
I'll give you a starter script to play with. It uses optionals rather than positionals, and only one input file. But it should give a taste of what you can do.
import argparse

parser = argparse.ArgumentParser()
inarg = parser.add_argument('-i', '--infile', type=argparse.FileType('r'), default='-')
outarg = parser.add_argument('-o', '--outfile', type=argparse.FileType('w'), default='-')
args = parser.parse_args()
print(args)
cnt = 0
for line in args.infile:
    print(cnt, line)
    args.outfile.write(line)
    cnt += 1
When called without arguments, it just echoes your input (after ^D). I'm a little bothered that it doesn't exit until I issue another ^D.
FileType is convenient, but it has one major fault: it opens the files, but you have to close them yourself or let Python do so when exiting. There's also the complication that you don't want to close stdin/stdout.
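One way around the closing problem is to take plain strings from argparse and open them yourself. The following is only a minimal sketch: open_or_std is a hypothetical helper name, not something argparse provides, and it assumes '-' is the only special name you care about.

```python
import contextlib
import sys


@contextlib.contextmanager
def open_or_std(name, mode="r"):
    """Yield an open file for `name`, or stdin/stdout when name is '-'.

    Hypothetical helper for illustration; not part of argparse.
    """
    if name == "-":
        # The standard streams are only borrowed, so they are never closed here.
        yield sys.stdin if "r" in mode else sys.stdout
    else:
        with open(name, mode) as handle:
            yield handle  # closed automatically when the with-block ends
```

With string-valued arguments you could then write something like `with open_or_std(args.infile) as fin, open_or_std(args.outfile, "w") as fout:` and copy lines inside the block; real files are closed on exit and stdin/stdout are left alone.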
The best argparse questions include a basic script and specific questions on how to correct or improve it. Your specs are reasonably clear, but it would be nice if you gave us more to work with.
To handle the subdirectories option, I would skip the FileType bit. Use argparse to get 2 lists of strings (or a list and a name), and then do the necessary chdir and/or glob to find and iterate over files. Don't expect argparse to do the actual work; use it to parse the command-line strings. Here is a sketch of such a script, leaving details for you to fill in.
import argparse
import glob
import os
import sys  # for stdin/stdout

def open_output(outfile):
    # open a file for writing; '-' means stdout
    if outfile == '-':
        return sys.stdout
    return open(outfile, 'w')

def glob_dir(adir):
    # return a sorted list of the regular files directly inside adir
    names = glob.glob(os.path.join(adir, '*'))
    return sorted(n for n in names if os.path.isfile(n))

def open_forread(afilename):
    # open a file for reading; '-' means stdin
    if afilename == '-':
        return sys.stdin
    return open(afilename)

def walkdirs(alist):
    # expand directories into file names; keep files and '-' as they are
    outlist = []
    for name in alist:
        if name == '-' or os.path.isfile(name):
            outlist.append(name)
        elif os.path.isdir(name):
            outlist.extend(glob_dir(name))
        else:
            sys.exit('no such file or directory: %s' % name)
    return outlist

def cat(infile, outfile):
    # do your thing here; this placeholder just copies lines
    for line in infile:
        outfile.write(line)

def main(args):
    # handle args options
    filelist = walkdirs(args.inlist)
    fout = open_output(args.outfile)
    for name in filelist:
        fin = open_forread(name)
        cat(fin, fout)
        if fin is not sys.stdin:
            fin.close()
    if fout is not sys.stdout:
        fout.close()

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('inlist', nargs='*')
    parser.add_argument('outfile')
    # add options
    args = parser.parse_args()
    main(args)
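Since the question asks for subdirectories as well, the glob_dir step could be swapped for a walk over the whole tree. This is a rough sketch, not a drop-in: collect_files and its pattern parameter are names I made up for illustration, and it assumes you want every matching regular file under the top directory.

```python
import fnmatch
import os


def collect_files(top, pattern="*"):
    """Return a sorted list of files under `top`, including subdirectories.

    Hypothetical helper; `pattern` is a shell-style pattern such as '*.dat'.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(top):
        # fnmatch.filter keeps only the names matching the pattern
        for filename in fnmatch.filter(filenames, pattern):
            found.append(os.path.join(dirpath, filename))
    return sorted(found)
```

Calling `collect_files('/path/to/folder', '*.dat')` would give one flat list of paths that the main loop can iterate over.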
The parser in the script above requires you to give it an outfile name, even if it is '-'. I could define it with nargs='?' to make it optional, but that does not play nicely with the '*' of inlist.
Consider
myprog one two three
Is that
namespace(inlist=['one','two','three'], outfile=default)
or
namespace(inlist=['one','two'], outfile='three')
With both a * and a ? positional, the identity of the last string is ambiguous: is it the last entry for inlist, or the optional entry for outfile? argparse chooses the former, and never assigns the value to outfile.
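You can check this allocation yourself with a small experiment. The build_parser function below is just an illustrative wrapper of mine; the add_argument calls are the ones discussed above, with '-' assumed as the default for outfile.

```python
import argparse


def build_parser(optional_outfile):
    """Build the inlist/outfile parser, with outfile optional or required."""
    parser = argparse.ArgumentParser()
    parser.add_argument("inlist", nargs="*")
    if optional_outfile:
        # '?' makes outfile optional, but the greedy '*' before it takes everything
        parser.add_argument("outfile", nargs="?", default="-")
    else:
        parser.add_argument("outfile")
    return parser


words = ["one", "two", "three"]
greedy = build_parser(optional_outfile=True).parse_args(words)
strict = build_parser(optional_outfile=False).parse_args(words)
print(greedy)
print(strict)
```

In the first case all three strings land in inlist and outfile keeps its default; in the second, outfile is required, so the last string goes to it and inlist gets the first two.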
With --infile and --outfile definitions, the allocation of these strings is clear.
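As a quick illustration of the flagged version, here is a sketch that lets -i take several names. The nargs='+' and the ['-'] default are my assumptions for the multiple-input case, not something from the starter script.

```python
import argparse


def build_parser():
    """Flag-based variant: every string is tied to the flag that precedes it."""
    parser = argparse.ArgumentParser()
    # nargs='+' lets -i collect one or more names; '-' still stands for stdin
    parser.add_argument("-i", "--infile", nargs="+", default=["-"])
    parser.add_argument("-o", "--outfile", default="-")
    return parser
```

Given `-i one two -o three`, infile is ['one', 'two'] and outfile is 'three'; given `-i one two three`, all three are inputs and outfile stays at '-'. There is nothing left for the parser to guess.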
In a sense this problem is too complex for argparse: there's nothing in it to handle things like directories. In another sense it is too simple: you could just as easily split sys.argv[1:] between inlist and outfile without the help of argparse.
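A middle road is to let argparse handle the options and collect every positional into one list, then do the split by hand. This is only a sketch of the rules as I read them from the question; split_paths, parse, and the -v option are hypothetical names, and it assumes options come before the paths.

```python
import argparse


def split_paths(paths):
    """Split positional paths into (inputs, output).

    No paths: stdin to stdout. One path: it is the input, output is stdout.
    Two or more: the last is the output, the rest are inputs.
    """
    if not paths:
        return ["-"], "-"
    if len(paths) == 1:
        return list(paths), "-"
    return list(paths[:-1]), paths[-1]


def parse(argv=None):
    """Parse options with argparse, then split the positionals manually."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true")  # hypothetical option
    parser.add_argument("paths", nargs="*")
    args = parser.parse_args(argv)
    args.inputs, args.output = split_paths(args.paths)
    return args
```

For example, `parse(['-v', 'a', 'b', 'c'])` would give inputs ['a', 'b'] and output 'c', while `parse([])` falls back to '-' for both.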
Upvotes: 0
Reputation: 16146
You need positional arguments (names not starting with a dash) that are optional (nargs='?') and have a default value (default='-'). Additionally, argparse.FileType is a convenience factory that returns sys.stdin or sys.stdout if - is passed (depending on the mode).
All together:
#!/usr/bin/env python
import argparse

# the first argument is prog; it defaults to the basename of sys.argv[0]
parser = argparse.ArgumentParser('foo')
parser.add_argument('in_file', nargs='?', default='-', type=argparse.FileType('r'))
parser.add_argument('out_file', nargs='?', default='-', type=argparse.FileType('w'))

def main():
    # with no argument, parse_args() uses sys.argv[1:]
    # note: FileType('r') opens the file, so 'bar' must exist for these calls to succeed
    args = parser.parse_args(['bar', 'baz'])
    print(args)
    args = parser.parse_args(['bar', '-'])
    print(args)
    args = parser.parse_args(['bar'])
    print(args)
    args = parser.parse_args(['-', 'baz'])
    print(args)
    args = parser.parse_args(['-', '-'])
    print(args)
    args = parser.parse_args(['-'])
    print(args)
    args = parser.parse_args([])
    print(args)

if __name__ == '__main__':
    main()
Upvotes: 2