Reputation: 715
I want to use the Auto data from the R package ISLR (library(ISLR)) in Python.
I ran some tests based on the Introduction to rpy2, as follows:
from rpy2 import robjects
from rpy2.robjects.packages import importr, data
from rpy2.robjects import pandas2ri
pandas2ri.activate()
datasets = importr('datasets') # data(mtcars) in library(datasets)
mtcars = data(datasets).fetch('mtcars')['mtcars']
ISLR = importr('ISLR') # data(Auto) in library(ISLR)
Auto = data(ISLR).fetch('Auto')['Auto']
#r_df = mtcars # success!!!
r_df = Auto # fail???
df = pandas2ri.ri2py(robjects.DataFrame(r_df))
df.info()
With this code, data(mtcars) from library(datasets) converts successfully, while data(Auto) from library(ISLR) fails with the error
Parameter 'categories' must be list-like
How can I fix this issue?
Upvotes: 0
Views: 752
Reputation: 133
What version of rpy2 are you using? I'm using rpy2 3.3.6, installed with pip in a Conda environment with R 4.0.3 and Python 3.6.11 (both from conda-forge), and I'm able to read both mtcars from datasets and Auto from ISLR. Please check the results I get below.
I think the error you are seeing might be either a bug or a side effect of your configuration or dependencies. You should upgrade rpy2 to a more recent version (>= 3.3.0) and check the dependencies carefully.
Please also check this post on how the conversion functions have changed across rpy2 versions: "Pandas - how to convert r dataframe back to pandas?"
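To see whether an environment meets the >= 3.3.0 suggestion, something like the following could be used. This is only a sketch: it assumes Python 3.8 or newer (for importlib.metadata), and the helper names parse_version, meets_minimum and installed_rpy2_version are made up for illustration. On an older interpreter, printing rpy2.__version__ should give the same information.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (3, 3, 0)  # threshold suggested in this answer


def parse_version(text):
    """Turn a string such as '3.3.6' into the tuple (3, 3, 6)."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum=MINIMUM):
    """Compare numerically, so that '3.10.0' counts as newer than '3.3.0'."""
    return parse_version(installed) >= minimum


def installed_rpy2_version():
    """Return the installed rpy2 version string, or None if it is missing."""
    try:
        return version("rpy2")
    except PackageNotFoundError:
        return None


found = installed_rpy2_version()
if found is None:
    print("rpy2 is not installed in this environment")
elif meets_minimum(found):
    print("rpy2", found, "meets the suggested minimum")
else:
    print("rpy2", found, "is older than 3.3.0; consider upgrading")
```

Comparing tuples of integers rather than raw strings avoids ranking "3.10.0" below "3.3.0".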
Here is the entire sequence from my command line:
Python 3.6.11 | packaged by conda-forge | (default, Aug 5 2020, 20:09:42)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Importing relevant libraries
>>> import rpy2.robjects as ro
>>> import rpy2.robjects.packages as rpackages
>>> from rpy2.robjects.vectors import StrVector
>>> from rpy2.robjects.packages import importr, data
Importing packages and reading in the data
>>> datasets = importr('datasets')
>>> mtcars = data(datasets).fetch('mtcars')['mtcars']
>>> ISLR = importr('ISLR')
>>> Auto = data(ISLR).fetch('Auto')['Auto']
>>> r_df_mtcars = mtcars  # using labels to clarify origin of data
>>> r_df_Auto = Auto
Converting R Data frames into Pandas Data frames
Note: the function conversion.rpy2py is the rpy2 3.x name for the conversion that older releases called ri2py (I tested it with rpy2 3.3.6).
>>> pd_df_mtcars = ro.conversion.rpy2py(r_df_mtcars)
>>> pd_df_Auto = ro.conversion.rpy2py(r_df_Auto)
Examine both data frames using the pandas head() method
>>> pd_df_mtcars.head()
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 4.0 4.0
Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 4.0 4.0
Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 4.0 1.0
Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 3.0 1.0
Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 3.0 2.0
>>> pd_df_Auto.head()
mpg cylinders displacement horsepower weight acceleration year origin name
1 18.0 8.0 307.0 130.0 3504.0 12.0 70.0 1.0 chevrolet chevelle malibu
2 15.0 8.0 350.0 165.0 3693.0 11.5 70.0 1.0 buick skylark 320
3 18.0 8.0 318.0 150.0 3436.0 11.0 70.0 1.0 plymouth satellite
4 16.0 8.0 304.0 150.0 3433.0 12.0 70.0 1.0 amc rebel sst
5 17.0 8.0 302.0 140.0 3449.0 10.5 70.0 1.0 ford torino
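The transcript above does not show how the pandas converter was enabled, although the head() output looks like pandas. One way to make that step explicit is rpy2's localconverter context manager, sketched below. It assumes rpy2 3.3.x (newer releases may prefer a different accessor for the active converter), with pandas, R and the R package in question installed. The helper name r_dataset_to_pandas is hypothetical, and the imports sit inside the function only so that the definition loads where rpy2 is absent.

```python
def r_dataset_to_pandas(package_name, dataset_name):
    """Fetch a dataset from an R package and convert it to a pandas DataFrame.

    Hypothetical helper, written against rpy2 3.3.x.
    """
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter
    from rpy2.robjects.packages import importr, data

    # Load the R package and pull the named dataset out of it,
    # exactly as in the transcript above.
    r_package = importr(package_name)
    r_df = data(r_package).fetch(dataset_name)[dataset_name]

    # Enable the pandas conversion rules only for this block,
    # instead of switching them on globally with pandas2ri.activate().
    with localconverter(ro.default_converter + pandas2ri.converter):
        return ro.conversion.rpy2py(r_df)
```

Under those assumptions, r_dataset_to_pandas('ISLR', 'Auto') should return the Auto data as a pandas DataFrame, and r_dataset_to_pandas('datasets', 'mtcars') the mtcars data.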
To convert Pandas df to R df you can use:
>>> r_mtcars_df = ro.conversion.py2rpy(pd_df_mtcars)
>>> r_Auto_df = ro.conversion.py2rpy(pd_df_Auto)
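The reverse direction can be wrapped the same way. This is again only a sketch, assuming rpy2 3.3.x with pandas installed; the helper name pandas_to_r is hypothetical, and the imports are kept inside the function so that the definition loads even without rpy2.

```python
def pandas_to_r(pd_df):
    """Convert a pandas DataFrame into an R data frame.

    Hypothetical helper, written against rpy2 3.3.x.
    """
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # The pandas conversion rules apply only inside this block.
    with localconverter(ro.default_converter + pandas2ri.converter):
        return ro.conversion.py2rpy(pd_df)
```

With the names from the transcript, pandas_to_r(pd_df_Auto) would be expected to give an R data frame equivalent to r_Auto_df.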
Upvotes: 2