I'm running a Python script within a Jupyter notebook and attempting to loop it over a list. However, the command-line arguments are not being recognised as variables; they are being taken as literal strings. Two other questions seem to be similar to what I want, but I have no experience with argparse, so I do not know where to start.
My code:
import got
retailers = ["handle1", "handle2"]
for retailer in retailers:
    string = "keyword " + "@" + retailer
    file_name = "keyword_" + retailer
    %run Exporter.py --querysearch string --since 2018-01-01 --maxtweets 50 --output file_name
What it looks like when running from command line:
python Exporter.py --querysearch "keyword @retailer" --since 2018-01-01 --maxtweets 50 --output "keyword_retailer"
The problem is that the script Exporter.py is searching for the literal term "string" and not what I actually want, which is "keyword @retailer". The same goes for the output file, which is being saved as "file_name" and not "keyword_retailer".
Any ideas on how I can solve this?
For context, if it is needed, I am using the GetOldTweets-python package.
EDIT:
I have added this to my code; however, I get the error listed below. I've also attached the module Exporter.py as I can't seem to fix this error.
import argparse
import sys
import Exporter

def main(args):
    # parse arguments using optparse or argparse or what have you
    parser = argparse.ArgumentParser(description="Do something.")
    parser.add_argument("--querysearch", type=str, default= 2, required=True)
    parser.add_argument("--maxtweets", type=int, default= 4, required=True)
    parser.add_argument("--output", type=str, default= 4, required=True)
    parser.add_argument("--since", type=int, default= 4, required=True)

if __name__ == '__main__':
    import sys
    main(sys.argv[1:])

for retailer in retailers:
    string = "palm oil " + "@"+ retailer
    file_name = "palm_oil_" + retailer
    #print string
    #print file_name
    Exporter.main([string,"2018-01-01", 50, file_name])
Error message:
UnboundLocalError Traceback (most recent call last)
<ipython-input-35-4731f5aa548f> in <module>()
4 #print string
5 #print file_name
----> 6 Exporter.main([string,"2018-01-01", "50", file_name])
/Users/jamesozden/GetOldTweets-python-master/Exporter.pyc in main(argv)
70 got.manager.TweetManager.getTweets(tweetCriteria, receiveBuffer)
71
---> 72
73 finally:
74 outputFile.close()
UnboundLocalError: local variable 'arg' referenced before assignment
I have also tried this style of solution, using {} to mark variables rather than strings, as per the answer to another question, but with no success:
!python training.py --cuda --emsize 1500 --nhid 1500 --dropout {d} --epochs {e}
Reputation: 231698
Your description is unclear about when you are running the script from a shell and when from IPython (or a notebook) using %run, so I'll focus on the argparse problems:
First, this needs a parse_args call:
import argparse

def main(argv):
    # parse arguments using optparse or argparse or what have you
    parser = argparse.ArgumentParser(description="Do something.")
    parser.add_argument("--querysearch", type=str, default=2, required=True)
    parser.add_argument("--maxtweets", type=int, default=4, required=True)
    parser.add_argument("--output", type=str, default=4, required=True)
    # --since takes a date such as 2018-01-01, so it is parsed as str, not int
    parser.add_argument("--since", type=str, default=4, required=True)
    args = parser.parse_args(argv)  # parse_args is a method of the parser
    print(args)  # a good debugging step
    return args  # or do something with them
The argv parameter will need to look like what the script would get via sys.argv[1:], that is, a list of strings like:
['--querysearch', "keyword @retailer", '--since', '2018-01-01', ...]
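As a rough sketch of how that fits the loop in the question, the following builds an argv-style list for each retailer and hands it to a parser. The names build_parser and make_argv are made up for illustration, and the parser only mirrors the four options from the question; it is not Exporter.py's actual code.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the four options from the question.
    parser = argparse.ArgumentParser(description="Do something.")
    parser.add_argument("--querysearch", type=str, required=True)
    parser.add_argument("--since", type=str, required=True)
    parser.add_argument("--maxtweets", type=int, required=True)
    parser.add_argument("--output", type=str, required=True)
    return parser


def make_argv(retailer):
    # Every element is a string, exactly as sys.argv[1:] would deliver it.
    return [
        "--querysearch", "keyword @" + retailer,
        "--since", "2018-01-01",
        "--maxtweets", "50",
        "--output", "keyword_" + retailer,
    ]


parser = build_parser()
parsed = [parser.parse_args(make_argv(r)) for r in ["handle1", "handle2"]]
for args in parsed:
    print(args.querysearch, args.maxtweets, args.output)
```

Because each list element is passed as-is, the space inside "keyword @handle1" needs no quoting, and the type=int conversion turns "50" into a number.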
I was going to use split() on
'--querysearch "keyword @retailer" --since 2018-01-01 --maxtweets 50 --output "keyword_retailer"'
but it won't handle the embedded space after 'keyword' (shlex.split can).
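To make the difference concrete, here is a small comparison of str.split with the standard library's shlex.split on that same command string. The variable names are only for illustration.

```python
import shlex

cmd = '--querysearch "keyword @retailer" --since 2018-01-01 --maxtweets 50 --output "keyword_retailer"'

naive = cmd.split()      # splits on every space, so the quoted phrase is broken in two
argv = shlex.split(cmd)  # honours shell-style quotes and strips them

print(naive[:3])
print(argv)
```

The list from shlex.split has the same shape as sys.argv[1:] would have for that command line, so it can be passed straight to parse_args.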
If you make all those arguments required, there's no point in providing default parameters. Conversely, provide the default and drop the required. And defaults like 4 for arguments with type=str are not a good idea. They work, but could mess up further processing (is args.output a string or a number?).
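A minimal sketch of the two styles, with made-up default values (50 and "output.csv" are assumptions, not anything Exporter.py defines). The second parser shows why a numeric default on a type=str argument is risky: argparse only applies type to string defaults, so the 4 comes through as an int.

```python
import argparse

parser = argparse.ArgumentParser()
# Required: must be supplied, so a default would never be used.
parser.add_argument("--querysearch", type=str, required=True)
# Optional: fall back to defaults that match the declared type.
parser.add_argument("--maxtweets", type=int, default=50)
parser.add_argument("--output", type=str, default="output.csv")

args = parser.parse_args(["--querysearch", "keyword @retailer"])
print(args)

# A non-string default is stored untouched, regardless of type=str.
loose = argparse.ArgumentParser()
loose.add_argument("--output", type=str, default=4)
loose_args = loose.parse_args([])
print(type(loose_args.output))
```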
Another way to 'bypass' the parser is to define a Namespace object:
args = argparse.Namespace(querysearch='foo', maxtweets=4, output='afile', since=4)
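As a sketch of how that could be used in a loop, the following builds one Namespace per retailer and passes it to a function. The run function is a hypothetical stand-in for whatever code would consume the options; it assumes that code accepts an args object rather than a raw argv list.

```python
import argparse


def run(args):
    # Hypothetical stand-in for the code that would consume the parsed options.
    return "{} -> {} (max {})".format(args.querysearch, args.output, args.maxtweets)


results = []
for retailer in ["handle1", "handle2"]:
    # Attributes are set directly, so no string parsing or quoting is involved.
    args = argparse.Namespace(
        querysearch="keyword @" + retailer,
        since="2018-01-01",
        maxtweets=50,
        output="keyword_" + retailer,
    )
    results.append(run(args))

print(results)
```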