user1708646

Reputation: 111

rpy2 and R debugging

After some trouble I successfully installed rpy2.

My aim is to build GAM models (gam, from Simon Wood's mgcv library) and use the predict function: pass a pandas DataFrame from Python through rpy2 to a fitted gam model and retrieve the prediction.

I tested the R script by loading the text file and processing it through the same R functions that the Python/rpy2 script calls, and it works fine. In the Python script I start from a pickled version of the text file (as in my final code, which starts from a pandas DataFrame).

I can also trigger other errors in the R script that do make sense: passing an empty dataframe, or a dataframe missing a column needed for the prediction, each triggers the same error it would in R. So I do actually get into the gam function with the input data intact.

I am close to the finish, but I keep getting this error:

Error in ExtractData(object, data, NULL) : 'names' attribute [1] must be the same length as the vector [0]

I don't know of any way to get more feedback from R in my Python script. How can I debug this? Can anybody point out what the problem in R might be? Or is this something about the ".convert_to_r_dataframe()" function that I do not fully grasp?

R-code:

f_clean_data <- function(df) {
        t = df
        # ... some preprocessing
        t
}

tc <- f_clean_data(t) 


f_py_gam_predict <- function(gam, df) {
        dfc = f_clean_data(df)
        result <- predict(gam, dfc)
        result
}

bc_gam = gam(BC ~   
                +s()
                .... some gam model

        , data=tc, method="REML"
        )
summary(bc_gam)


testfile = 'a_test_file.txt'

ttest <- read.table(file=testfile ,sep='\t',header=TRUE);

result = f_py_gam_predict(bc_gam, ttest)

The f_py_gam_predict function is available in the Python script.

Thanks, Luc

Upvotes: 4

Views: 1651

Answers (2)

heroxbd

Reputation: 800

The usual R debug tools are usable from within RPy, such as

ro.r("debug(glm)")

or ro.r("options(error=recover)")

Upvotes: 1

Jon Olav Vik

Reputation: 1461

Check the data type that you feed to s(). I also got Error in ExtractData(object, data, NULL) : 'names' attribute [1] must be the same length as the vector [0] when I was using a datetime explanatory variable. I worked around this by converting it to the number of days since the start.

> library(lubridate)
> library(mgcv)
> df <- data.frame(x=today() + 1:20, y=1:20)
> gam(y~s(x), data=df)
Error in ExtractData(object, data, knots) : 
  'names' attribute [1] must be the same length as the vector [0]
> df$xnum <- (df$x - df$x[1])/ddays(1)
> str(df)
'data.frame':   20 obs. of  3 variables:
 $ x   : Date, format: "2013-04-09" "2013-04-10" "2013-04-11" "2013-04-12" ...
 $ y   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ xnum: num  0 1 2 3 4 5 6 7 8 9 ...
> gam(y~s(xnum), data=df)

The last call works okay.
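
The same conversion can be done on the Python side before the data reaches rpy2. The following is only a minimal sketch, not the asker's code: days_since_start is a hypothetical helper name, and it uses only the standard library so that it does not depend on how the pandas-to-R conversion treats date columns.

```python
from datetime import date


def days_since_start(values):
    """Convert dates (or datetimes) to float days elapsed since the first value.

    Hypothetical helper mirroring the R workaround (df$x - df$x[1]) / ddays(1).
    """
    values = list(values)
    if not values:
        return []
    start = values[0]
    # Subtracting two dates gives a timedelta; express it in days as a float.
    return [(v - start).total_seconds() / 86400.0 for v in values]


if __name__ == "__main__":
    x = [date(2013, 4, 9), date(2013, 4, 10), date(2013, 4, 11)]
    print(days_since_start(x))
```

With a pandas DataFrame, the returned list could presumably be stored as a new numeric column and that column used in s() instead of the date column.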

As for debugging, I often call save.image() from rpy2, then load the .RData file into a plain R session for further scrutiny.
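
A minimal sketch of that workflow, assuming rpy2 is importable as rpy2.robjects. The helper names and the debug.RData file name are made up for illustration. The wrapper runs a callable that talks to R and, if it raises, dumps the embedded R workspace before re-raising, so the state can be inspected with load("debug.RData") in a plain R session.

```python
def r_save_image_call(path):
    """Build the R source for save.image(file=...), escaping the path."""
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    return 'save.image(file="%s")' % escaped


def call_with_workspace_dump(func, *args, path="debug.RData"):
    """Call func(*args); if it raises, save the R workspace and re-raise.

    func is expected to be a callable that runs R code through rpy2
    (hypothetical usage). rpy2 is imported lazily, only on failure.
    """
    try:
        return func(*args)
    except Exception:
        import rpy2.robjects as ro  # assumption: rpy2 is installed
        ro.r(r_save_image_call(path))
        raise
```

Note that save.image() only stores objects in R's global environment, so a data frame passed in from Python has to be assigned there first (for example through ro.globalenv) if it should show up in the .RData file.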

Upvotes: 2
