Integer columns with missing values change to object dtype in Python after importing an RData file with the pyreadr package

Question

I want to run some Python functions on data stored in an '.RData' file. I am using the 'pyreadr' Python package to load it.

Here is an example of the R code:

library(data.table)

# Example 
data <- data.table(x_num=c(1,1.5,2),
                   x_int=c(1,2,3))
data$x_int <- as.integer(data$x_int) # Making sure the data is in integer type


data_missing <- data.table(x_num=c(1.5,2,NA,5,6),
                           x_int=c(1,2,3,NA,5))
data_missing$x_int <- as.integer(data_missing$x_int) # Making sure the data is in integer type

# checking the classes
sapply(data,class)
sapply(data_missing,class)

# Storing the data in RData file 
save(data, file = "test_data.RData")
save(data_missing, file = "test_missing_data.RData")

I store the two tables in separate files because 'test_data.RData' loads correctly in Python, whereas in 'test_missing_data.RData' the integer column that contains NA is read as object dtype instead of an integer dtype.

Here is the Python code:

# Working example
import pyreadr
result=pyreadr.read_r('test_data.RData')
data=result['data']
data.dtypes
# Output
# x_num    float64
# x_int      int32

# Example where the integer column with NA values is read as object dtype
import pyreadr
result=pyreadr.read_r('test_missing_data.RData') # No error is raised here

data=result['data_missing']
data.dtypes
# Output
# x_num    float64
# x_int     object

No error message is raised; however, I need the column to keep an integer dtype even when it contains missing (NA) values.
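
One possible workaround, sketched here under assumptions rather than taken from pyreadr's documentation: NumPy's int32 cannot represent a missing value, so the column could be converted to pandas' nullable 'Int64' extension dtype after loading. The DataFrame below is a hand-built stand-in for result['data_missing'] (it assumes the missing entry arrives as NaN inside an object column), so the snippet runs without the RData file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for result['data_missing'] as described above:
# x_int is an object column whose missing entry is assumed to be NaN.
data_missing = pd.DataFrame({
    "x_num": [1.5, 2.0, np.nan, 5.0, 6.0],
    "x_int": pd.Series([1, 2, 3, np.nan, 5], dtype=object),
})

# Parse the object column as numbers (NaN stays NaN), then cast to the
# nullable integer dtype "Int64" (capital I), which keeps missing values as <NA>.
fixed = data_missing.copy()
fixed["x_int"] = pd.to_numeric(fixed["x_int"]).astype("Int64")

print(fixed.dtypes)
```

Note that 'Int64' is a pandas extension dtype, not NumPy's int64, so code that passes the column straight to NumPy may still need to handle the missing entries explicitly.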

Thank you for your time and help.
