Reputation: 1
I am trying to use rpy2 to call the R package MatchIt. I am having difficulty retrieving the outcomes of the matched pairs from $match.matrix. Here is the R code I am trying to execute in Python:
matched <- cbind(lalonde[row.names(foo$match.matrix),"re78"],lalonde[foo$match.matrix,"re78"])
Here is my python code:
import readline
import rpy2.robjects
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
from rpy2 import robjects as ro
import numpy as np
from scipy.stats import ttest_ind
import pandas as pd
from pandas import Series,DataFrame
pandas2ri.activate()
R = ro.r
MatchIt = importr('MatchIt')
base = importr('base')
df = R('lalonde')
lalonde = pandas2ri.py2ri(df)
formula = 'treat ~ age + educ + black + hispan + married + nodegree + re74 + re75'
foo = MatchIt.matchit(formula = R(formula),
data = lalonde,
method = R('"nearest"'),
ratio = 1)
matched = \
base.cbind(lalonde.rx[base.row_names(foo.rx2('match.matrix')),"re78"],
lalonde.rx[foo.rx2('match.matrix'),"re78"])
This chunk runs:
lalonde.rx(base.row_names(foo.rx2('match.matrix')),
"re78")
but this chunk
lalonde.rx[foo.rx2('match.matrix'),"re78"]
returns an error of:
ValueError: The first parameter must be a tuple.
The output of
cbind(lalonde[row.names(foo$match.matrix),"re78"], lalonde[foo$match.matrix,"re78"])
should be a data frame that pairs the row names and cell values of foo$match.matrix with the corresponding "re78" values in the lalonde data frame.
Upvotes: 0
Views: 1054
Reputation: 11545
Here lalonde is defined elsewhere (but thanks to @Parfait's question we know that this is a data frame). Now you'll have to break down the one-liner that triggers the error to pinpoint the exact place of trouble (and we can't do that for you: the point of self-contained and reproducible examples is that they help us help you).
matched = \
base.cbind(lalonde[base.row_names(foo.rx2('match.matrix')),"re78"],
lalonde[foo.rx2('match.matrix'),"re78"])
Is this breaking with the first subset of lalonde?
lalonde[base.row_names(foo.rx2('match.matrix')),"re78"]
Since type(lalonde) is rpy2.robjects.vectors.DataFrame, this is an R/rpy2 data frame. Extracting a subset the way one would in R can be achieved with .rx (as in R-style extraction, see http://rpy2.readthedocs.io/en/version_2.8.x/vector.html#extracting-r-style).
lalonde.rx(base.row_names(foo.rx2('match.matrix')),
"re78")
It is important to understand what is happening with this call. By default the elements to extract in each direction of the data structure (here rows and columns of the data frame respectively) must be R vectors (vector of names, or vector of one-offset index integers) or a Python data structure that the conversion mechanism can translate into an R vector (of names or integers). base.row_names will return the row names (and that's a vector of names) but foo.rx2('match.matrix') might be something else.
Here type(foo.rx2('match.matrix')) indicates that this is a matrix. A matrix can be used to cherry-pick cells in an R array, but in that case there can only be one parameter for the extraction... and we presently have two (the second is "re78").
Since the first column of that match.matrix contains the indices (row numbers) in lalonde, the following should be what you want:
matched = \
base.cbind(lalonde.rx(base.row_names(foo.rx2('match.matrix')),"re78"),
lalonde.rx(foo.rx2('match.matrix').rx(True, 1),"re78"))
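If it helps to check the result on the Python side, the same pairing can be sketched in plain pandas. This is only an illustration on made-up toy data, not output from MatchIt: the frames lalonde_pd and match_matrix, the helper pair_outcomes, and all row labels and re78 values are hypothetical stand-ins. It also assumes the match matrix entries are row labels of the data; if they are one-offset row numbers, the control lookup would need positional indexing instead.

```python
import pandas as pd

# Toy stand-in for lalonde: row labels and re78 values are made up.
lalonde_pd = pd.DataFrame(
    {"re78": [100.0, 200.0, 300.0, 400.0]},
    index=["NSW1", "NSW2", "PSID1", "PSID2"],
)

# Toy stand-in for foo$match.matrix: the index holds the treated units,
# the single column holds the label of each matched control (assumption).
match_matrix = pd.DataFrame({"1": ["PSID2", "PSID1"]}, index=["NSW1", "NSW2"])


def pair_outcomes(data, matches, outcome="re78"):
    """Return one row per matched pair: treated outcome next to control outcome."""
    treated_labels = matches.index.tolist()
    control_labels = matches.iloc[:, 0].tolist()  # first column of the match matrix
    treated = data.loc[treated_labels, outcome].to_numpy()
    control = data.loc[control_labels, outcome].to_numpy()
    return pd.DataFrame(
        {"treated": treated, "control": control}, index=matches.index
    )


matched = pair_outcomes(lalonde_pd, match_matrix)
print(matched)
```

Keeping the treated labels as the index makes it easy to compare each row against the corresponding row of the match matrix.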
Upvotes: 2