Reputation: 551
I want to run some Python functions on data from an '.RData' file. I am using the 'pyreadr' Python package to read it.
Here is an example of the R code:
library(data.table)
# Creating demo data frame
data <- data.table(x_time = c(Sys.time(),Sys.time()+1,Sys.time()+2))
data_missing <- data.table(x_time = c(Sys.time(),NA,NA))
# checking the classes
sapply(data,class)
sapply(data_missing,class)
# Storing the data in RData file
save(data, file = "test_data.RData")
save(data_missing, file = "test_missing_data.RData")
I store the data in separate files because 'test_data.RData' loads successfully in Python, whereas 'test_missing_data.RData' gives an error.
Here is the Python code:
## Working demo
# import pyreadr
# result=pyreadr.read_r('test_data.RData')
# data=result['data']
# data.dtypes
# print(data)
### Error in the code below
import pyreadr
result=pyreadr.read_r('test_missing_data.RData') # Error
data=result['data']
data.dtypes
print(data)
The error message is as below:
C:\Users\Pawan\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\pandas\core\tools\datetimes.py:530: RuntimeWarning: invalid value encountered in multiply arr, tz_parsed = tslib.array_with_unit_to_datetime(arg, unit, errors=errors)
The error occurs when there are NA values in the data frame. Is there another way to load RData files in Python?
Thank you for your time and help.
Upvotes: 0
Views: 282
Reputation: 3417
This is a warning, not an error, so it is probably not affecting your results. After running your R code, I can read both RData files without issue. Note, however, that your Python code uses the wrong name for the data frame:
import pyreadr
result=pyreadr.read_r('test_missing_data.RData') # No error, just warning
# Your data frame is called data_missing, not data, because that is the name you saved it under in your R code.
# I think this is what you are doing wrong.
# If you are not sure, check result.keys() to see what the file contains.
data=result['data_missing']
data.dtypes
#x_time datetime64[ns]
#dtype: object
print(data)
# x_time
#0 2022-08-03 09:37:55.963370752
#1 NaT
#2 NaT
# Looks correct to me
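To illustrate the difference between a warning and an error, here is a minimal sketch that uses only the standard library. The function `load_with_warning` is a hypothetical stand-in for `pyreadr.read_r` (it is not part of pyreadr); it emits a `RuntimeWarning` like the one in the question and then still returns its data. The lookup at the end shows one way to check the available names before indexing.

```python
import warnings


def load_with_warning():
    """Hypothetical stand-in for pyreadr.read_r: warns, then still returns data."""
    warnings.warn("invalid value encountered in multiply", RuntimeWarning)
    # Assumed shape: a dict keyed by the object name used in R's save().
    return {"data_missing": ["2022-08-03 09:37:55", None, None]}


# Record warnings instead of printing them, to show that execution continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = load_with_warning()

print(list(result.keys()))                    # names stored in the file
print([w.category.__name__ for w in caught])  # warning categories that were raised

# Check that the name exists before indexing, and list the alternatives if not.
name = "data_missing"
if name not in result:
    raise KeyError(f"{name!r} not found; available: {list(result)}")
data = result[name]
print(data)
```

With the real package, the same pattern should apply: wrap the `pyreadr.read_r(...)` call in `warnings.catch_warnings()` if you want to inspect or silence the warning, and look at `result.keys()` before picking a data frame.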
Upvotes: 1