Reputation: 425
I have basic R script that performs a GLM on a MySQL dataset. This runs fine using Rscript in bash. However I would like to call it within a python script so I can add it to a loop, I can create the sql statement but I can't seem to pass it to R using rpy2;
for word in words:
sql_scores = "select a.article_id, response, score from scores as a join profile as b on a.article_id = b.article_id where response in (1,0) and keyword = '%s';" % (word[0])
robjects.r("library(RMySQL)")
robjects.r("mydb = dbConnect(MySQL(), user='me', password='xxxx', host='aws.host', dbname='mydb')")
robjects.r("results = fetch(dbSendQuery(mydb, '%s'))") % (sql_scores)
robjects.r("model <- glm(response ~ score , data=results, family=binomial)")
robjects.r("summary(model)")
If I print sql_scores I can run this fine directly in MySQL. However Python produces this error;
Loading required package: DBI
Traceback (most recent call last):
File "keyword_searcher.py", line 30, in <module>
robjects.r("results = fetch(dbSendQuery(mydb, '%s'))") % (sql_scores)
File "/usr/local/lib/python2.7/dist-packages/rpy2/robjects/__init__.py", line 268, in __call__
p = rinterface.parse(string)
ValueError: Error while parsing the string.
I can't figure out the proper syntax for:
robjects.r("results = fetch(dbSendQuery(mydb, %s))") % (sql_scores)
Upvotes: 1
Views: 4395
Reputation: 497
You can access variables in the R environment with robjects.globalenv['varname']
.
So an alternative way is:
robjects.globalenv['sql_scores'] = sql_scores
robjects.r("results = fetch(dbSendQuery(mydb, sql_scores))")
Upvotes: 2
Reputation: 1267
Use double quotes around the "%s"
and single quotes around the robjects.r string:
robjects.r('results = fetch(dbSendQuery(mydb, "%s"))') % (sql_scores)
or use the format()
method:
robjects.r('fetch(dbSendQuery(mydb, {0}))'.format(sql_scores))
Upvotes: 3