Reputation: 287
Caue:
I'm creating dataframes programmatically in Python using globals().
In the code below, I'm creating 5 datasets whose names start with 'PREFIX' in caps, followed by a letter, and end with a suffix.
R
library(reticulate)
repl_python()
Python
import os
import pandas as pd
letters = ('a','b','c','d','e')
df_names = []
for ele in letters:
    # Build the variable name once, then bind an empty dataframe to it
    name = 'PREFIX_{}_suffix'.format(ele)
    globals()[name] = pd.DataFrame(columns=['col_a', 'col_b']).astype(str)
    df_names.append(name)
print(df_names)
['PREFIX_a_suffix', 'PREFIX_b_suffix', 'PREFIX_c_suffix', 'PREFIX_d_suffix', 'PREFIX_e_suffix']
Request:
I would like to select dataframes whose names start with a prefix (ideally with the regular expression ^PREFIX) and move those specific dataframes from reticulate's Python environment to the R environment programmatically.
For the sake of the task, I have added the dataframes' variable names to df_names. However, using regex is highly encouraged.
I know the variables are stored in the py object and can be accessed with $, but I'm not sure how to select the dataframes iteratively and move them from Python's environment to R's environment programmatically, all at once.
In R, I usually use ls(pattern=<regex>) to select objects in the R environment.
In Python, you can list the variables using locals(), see this thread.
This thread discusses passing Python functions from R to Python.
Upvotes: 3
Views: 431
Reputation: 287
Here is my solution using regex:
In Python:
Use the output of dir(), which lists the variables defined in your Python environment.
import os
import re
r = re.compile("^PREFIX")
py_dfs = list(filter(r.match, dir())) # fetch defined variables from python's env
print(py_dfs)
['PREFIX_a_suffix', 'PREFIX_b_suffix', 'PREFIX_c_suffix', 'PREFIX_d_suffix', 'PREFIX_e_suffix']
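As a variation on the name filter above, here is a minimal sketch that also checks each object's type, so a non-dataframe variable that happens to share the prefix is skipped. The helper select_by_pattern and the demo_namespace dictionary are hypothetical names introduced only for illustration; plain lists stand in for dataframes so the snippet runs without pandas. In the reticulate session one would presumably pass globals() and kind=pd.DataFrame instead.

```python
import re


def select_by_pattern(namespace, pattern, kind=object):
    """Return {name: value} for entries whose name matches `pattern`
    (anchored at the start, as with re.match) and whose value is a `kind`."""
    rx = re.compile(pattern)
    return {
        name: value
        for name, value in namespace.items()
        if rx.match(name) and isinstance(value, kind)
    }


# Hypothetical stand-in for globals(); lists play the role of dataframes.
demo_namespace = {
    "PREFIX_a_suffix": [1, 2],
    "PREFIX_b_suffix": [3, 4],
    "PREFIX_note": "shares the prefix but is not a list",
    "other": [5],
}

selected = select_by_pattern(demo_namespace, r"^PREFIX", kind=list)
print(sorted(selected))  # ['PREFIX_a_suffix', 'PREFIX_b_suffix']
```

Collecting the matches in a single dictionary keeps names and objects together, so the R side would only need to read one Python variable rather than evaluate each name separately.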
In R:
Use reticulate::py_eval to evaluate each Python object by name, and reticulate::py_to_r to convert it to an R object.
Use assign to create variables dynamically, with the same names as the variables (dataframes) in Python.
for (df in py$py_dfs){
  name = df
  r_df = py_to_r(py_eval(df))
  assign(paste0(name), r_df)
}
> ls(pattern="^PREFIX")
[1] "PREFIX_a_suffix" "PREFIX_b_suffix" "PREFIX_c_suffix" "PREFIX_d_suffix" "PREFIX_e_suffix"
> dim(PREFIX_a_suffix)
[1] 0 2
> class(PREFIX_a_suffix)
[1] "data.frame"
Upvotes: 1