Reputation: 57
I would like to run my Python code by passing some arguments using ArgumentParser. The parser code looks like this:
def parse_args(argv):
    global Settings, COST_PP, COST_BP, COST_NP, COST_PN, COST_BN, COST_NN
    desc = "..."
    parser = argparse.ArgumentParser(description=desc)
    parser.add_argument("infile", action="store")
    parser.add_argument("-o", "--outfile", action="store", dest="outfile")
    args = parser.parse_args(argv)

def main():
    global Settings
    parse_args(sys.argv[1:])
    print("\t".join(sys.argv[1:]))
    logging.info("SETTINGS:")
    for k, v in Settings.items():
        logging.info("\t\t" + str(k) + ":\t" + str(v))
    ...

if __name__ == '__main__':
    main()
But I got an error like this:
usage: PythonShell.py [-h] [-o OUTFILE] [-alg ALGORITHM] [-cL CLASS_LIST]
                      [-n RUNS] [-tf TRAIN_FRAC] [-cs COST_SET]
                      [-ms MULT_STRAT] [--log LOG_FILE] [-d]
                      infile
PythonShell.py: error: unrecognized arguments: 0 50000 1916
a91f477cb4de44dfa5d1f3dd01f8f606 2.2.0
To exit: use 'exit', 'quit', or Ctrl-D.
How can I run this code correctly in a Databricks notebook? I'd appreciate any help!
Upvotes: 1
Views: 4752
Reputation: 11
The only way I found to pass parameters to a Databricks Python script/job from Azure Data Factory was to use shlex:
import argparse
import shlex
import sys
parser = argparse.ArgumentParser()
parser.add_argument('--arg1', type=str, required=True)
parser.add_argument('--arg2', type=str, required=True)
args = parser.parse_args(shlex.split(" ".join(sys.argv[1:])))
Databricks passes the arguments as ["script", "--arg1 val", "--arg2 val"], while argparse expects each argument name and value as separate items: ["script", "--arg1", "val", "--arg2", "val"].
We first join all the arguments into one string and then split it with shlex to get the correct list to pass to parse_args.
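The join-then-split step can be tried outside Databricks with a hand-built list. This is only a minimal sketch: the `normalize_argv` helper and the `fused` sample values are hypothetical names made up for illustration, and the fused input simply mimics the shape described above rather than real Databricks output.

```python
import argparse
import shlex


def normalize_argv(raw_args):
    """Join the raw arguments into one string, then re-split it shell-style."""
    return shlex.split(" ".join(raw_args))


parser = argparse.ArgumentParser()
parser.add_argument("--arg1", type=str, required=True)
parser.add_argument("--arg2", type=str, required=True)

# Hypothetical input: each flag and its value fused into a single item.
fused = ["--arg1 foo", "--arg2 bar"]

tokens = normalize_argv(fused)
args = parser.parse_args(tokens)

print(tokens)
print(args.arg1, args.arg2)
```

One caveat with this approach: because everything is re-split on whitespace, a value that contains spaces has to be quoted inside the string (for example `'--arg1 "hello world"'`), otherwise it ends up as several separate tokens.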
Upvotes: 1
Reputation: 1
I had a similar issue, even when I tried to run the most basic example from the documentation. But you can run an argument parser in Databricks in two ways:
Pass the arguments in an array:
import argparse
parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')
## normally you can now do: parser.parse_args(), instead do this:
parser.parse_args(['23', '35'])
Place the code in a separate notebook and run that notebook from your current notebook with the %run command.
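Applied to a parser like the one in the question, the first option could look roughly like the sketch below. It assumes the parsing function takes the argument list as a parameter, so a notebook cell can hand it an explicit list instead of relying on sys.argv. The file names "data.csv" and "out.csv" are placeholders, and only the two arguments shown in the question are reproduced.

```python
import argparse


def parse_args(argv=None):
    """Parse an explicit argument list; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="...")
    parser.add_argument("infile", action="store")
    parser.add_argument("-o", "--outfile", action="store", dest="outfile")
    return parser.parse_args(argv)


# In a notebook cell, pass the arguments explicitly (placeholder file names).
args = parse_args(["data.csv", "-o", "out.csv"])
print(args.infile, args.outfile)
```

Returning the parsed namespace, rather than storing results in globals, also makes the function easier to call from either a script entry point or a notebook.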
Upvotes: 0