Reputation: 601
I want to run an R script in Python using rpy2. I already know how to do this when the data is defined inside the R script itself.
The R code is:
dataR = data.frame( Ingresos = c(23,45,24,23,54),
Bonos = c(23,45,12,67,54),
Deuda = c(23,4,1,6,3),
row.names = c("Nathy", "Tomas", "Joe", "Emily", "Javi") )
dataR
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing
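As a sanity check on what the script should return, here is a minimal sketch of the same logic in plain Python. It only mirrors the Ingresos column from the R code above; the lowercase names `ingresos` and `max_ing` are my own and not part of the script.

```python
from statistics import mean

# Same values as the Ingresos column in the R data frame above.
ingresos = [23, 45, 24, 23, 54]

# Equivalent of: promedio_ingresos = mean(dataR$Ingresos)
promedio_ingresos = mean(ingresos)

# Equivalent of: sort(dataR$Ingresos[dataR$Ingresos > promedio_ingresos])
max_ing = sorted(x for x in ingresos if x > promedio_ingresos)

print(promedio_ingresos)
print(max_ing)
```

Whatever rpy2 returns for Max_Ing on this data should match the values printed here.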
To run this R script in Python I use:
import rpy2
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
r = robjects.r
output = r.source("R_script_run_in_python.R")
output
And output gets the last value from my R code.
Now I want to run the same code, but using data that I define in Python, for example:
import numpy as np
import pandas as pd
df = pd.DataFrame( np.random.randn(5,3),
columns = ["Ingresos","Bonos","Deuda"],
index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )
So the R code I want to run now is just:
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
Max_Ing
But with dataR being df. How can I do that?
Upvotes: 2
Views: 1715
Reputation: 601
I tried this and it worked:
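# Imports
import numpy as np
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
# Data
# Pandas dataframe
df = pd.DataFrame( np.random.randn(5,3),
columns = ["Ingresos","Bonos","Deuda"],
index = ["Max", "Nathy", "Tom", "Joe", "Kathy"] )
# rpy2 dataframe (py2ri is the rpy2 2.x name; rpy2 3.x renamed it to py2rpy)
dataR = pandas2ri.py2ri(df)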
# R code
robjects.globalenv["dataR"] = dataR
robjects.r('''
promedio_ingresos = mean(dataR$Ingresos)
Max_Ing = sort(dataR$Ingresos[dataR$Ingresos>promedio_ingresos])
''')
print(robjects.globalenv["dataR"])
print(robjects.globalenv["promedio_ingresos"])
print(robjects.globalenv["Max_Ing"])
Upvotes: 2