Slow glm calculation when using rpy2

Question

I want to estimate logistic regression parameters with R's glm function. I work in Python and call it through rpy2. For some reason, running glm directly in R is much faster than running it through rpy2. Does anyone know why the computation through rpy2 is so much slower? I'm using R 2.13.1 and rpy2 2.0.8. Here is the code I'm using:

import numpy
from rpy2 import robjects as ro
import rpy2.rlike.container as rlc

def train(self, x_values, y_values, weights):
    # Convert each predictor column to an R numeric vector.
    x_float_vector = [ro.FloatVector(x) for x in numpy.array(x_values).transpose()]
    y_float_vector = ro.FloatVector(y_values)
    weights_float_vector = ro.FloatVector(weights)
    # Name the predictors v0, v1, ... and assemble the R data frame.
    names = ['v' + str(i) for i in xrange(len(x_float_vector))]
    d = rlc.TaggedList(x_float_vector + [y_float_vector], names + ['y'])
    data = ro.RDataFrame(d)
    # Build the formula string, e.g. 'y ~ v0 + v1'.
    formula = 'y ~ ' + ' + '.join(names)
    # Fit the weighted logistic regression.
    fit_res = ro.r.glm(formula=ro.r(formula), data=data, weights=weights_float_vector, family=ro.r('binomial(link="logit")'))
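One way to narrow down where the time goes is to time each stage separately (vector conversion, data frame construction, the glm call itself). The following is only a minimal sketch: the helpers `build_formula` and `timed` are hypothetical names, not part of rpy2, and the workload is a pure-Python stand-in so the snippet runs without rpy2 installed. It does not identify the cause of the slowdown; it only shows how the stages could be measured.

```python
from timeit import default_timer


def build_formula(names, response='y'):
    """Return an R formula string such as 'y ~ v0 + v1' (hypothetical helper)."""
    return response + ' ~ ' + ' + '.join(names)


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = default_timer()
    result = func(*args, **kwargs)
    elapsed = default_timer() - start
    print('%s: %.4f s' % (label, elapsed))
    return result, elapsed


# Stand-in workload: transpose rows into columns, as train() does with numpy.
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
columns, seconds = timed('transpose', lambda r: list(zip(*r)), rows)

# One predictor name per column, then the formula string.
formula = build_formula(['v%d' % i for i in range(len(columns))])
```

Inside `train`, the same `timed` wrapper could be placed around the `ro.FloatVector` conversions, the `ro.RDataFrame` construction, and the `ro.r.glm` call, to see whether the extra time is spent moving data into R or in the fit itself.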


Answers (1)
