Reputation: 19375
Consider the following dataframe:
import pandas as pd

df = pd.DataFrame({'group1': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
                   'group2': ['C', 'C', 'C', 'C', 'C', 'E', 'E', 'E'],
                   'time': [-6, -5, -4, -3, -2, -6, -3, -4],
                   'col': [1, 2, 3, 4, 5, 6, 7, 8]})
df
Out[36]:
   col group1 group2  time
0    1      A      C    -6
1    2      A      C    -5
2    3      A      C    -4
3    4      A      C    -3
4    5      A      C    -2
5    6      A      E    -6
6    7      A      E    -3
7    8      A      E    -4
My objective is to create a column that contains, for each group in ['group1','group2'], the ratio of col evaluated at time = -6 divided by col evaluated at time = -4.
That is, for group ['A','C'], I expect this column to be equal to 1/3; for group ['A','E'] it is 6/8. Both group1 and group2 take on many different values in the data.
How can I get that in Pandas?
Something like
df.groupby(['group1','group2']).transform(lambda x: x.ix[x['time'] == -6,'col'] / x.ix[x['time'] == -4,'col'])
does not work. Any ideas?
Thanks!
Upvotes: 2
Views: 764
Reputation: 29711
Another way, using groupby with a custom function:
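def time_selection(row):
    # `row` is the sub-DataFrame for one (group1, group2) pair
    N_r = row.loc[row['time'] == -6, 'col'].squeeze()  # numerator: col at time -6
    D_r = row.loc[row['time'] == -4, 'col'].squeeze()  # denominator: col at time -4
    return N_r / D_r

pd.merge(df, df.groupby(['group1','group2']).apply(time_selection).reset_index(name='div'))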
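If some groups might be missing one of the two time points, the same idea can be made a bit more defensive. The following is only a sketch under assumptions: the helper name time_ratio and its num_time / den_time parameters are made up for illustration, and it assumes at most one row per (group1, group2, time) combination. Groups without exactly one match for either time get NaN instead of raising.

```python
import pandas as pd

df = pd.DataFrame({'group1': ['A'] * 8,
                   'group2': ['C', 'C', 'C', 'C', 'C', 'E', 'E', 'E'],
                   'time': [-6, -5, -4, -3, -2, -6, -3, -4],
                   'col': [1, 2, 3, 4, 5, 6, 7, 8]})

def time_ratio(group, num_time=-6, den_time=-4):
    # Hypothetical helper: col at num_time divided by col at den_time
    num = group.loc[group['time'] == num_time, 'col']
    den = group.loc[group['time'] == den_time, 'col']
    # Return NaN unless each time point matches exactly one row
    if len(num) != 1 or len(den) != 1:
        return float('nan')
    return num.iloc[0] / den.iloc[0]

# One ratio per (group1, group2); only the columns the helper needs are passed in
ratios = (df.groupby(['group1', 'group2'])[['time', 'col']]
            .apply(time_ratio)
            .reset_index(name='div'))

# Broadcast the per-group ratio back onto every row of the original frame
result = df.merge(ratios, on=['group1', 'group2'], how='left')
```

With the sample data, every row of group A,C gets 1/3 and every row of group A,E gets 0.75.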
Upvotes: 1
Reputation: 7903
Here is your solution as a ridiculously long list comprehension (the most Pythonic way, btw). Also, your question makes sense, but the ratio for group A,C that you listed as 1/4 is actually 1/3.
summary = [(name,group[group.time == -6].col.values[0],group[group.time == -4].col.values[0]) for name,group in df.groupby(['group1','group2'])]
pd.DataFrame(summary, columns=['group', 'numerator', 'denominator'])
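The summary above holds the numerator and denominator but stops short of the ratio column the question asks for. One possible way to finish it in the same comprehension style is sketched below; the names keys, ratio_by_group and the column name ratio are illustrative choices, and the sketch assumes every group has exactly one row at time -6 and one at time -4 (otherwise .values[0] raises an IndexError).

```python
import pandas as pd

df = pd.DataFrame({'group1': ['A'] * 8,
                   'group2': ['C', 'C', 'C', 'C', 'C', 'E', 'E', 'E'],
                   'time': [-6, -5, -4, -3, -2, -6, -3, -4],
                   'col': [1, 2, 3, 4, 5, 6, 7, 8]})

keys = ['group1', 'group2']

# Map each (group1, group2) tuple to col at time -6 divided by col at time -4
ratio_by_group = {
    name: group.loc[group['time'] == -6, 'col'].values[0]
          / group.loc[group['time'] == -4, 'col'].values[0]
    for name, group in df.groupby(keys)
}

# Look up the ratio of each row's group and store it as a new column
df['ratio'] = [ratio_by_group[key] for key in zip(df['group1'], df['group2'])]
```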
Upvotes: 1
Reputation: 3212
You could do it without groupby like this:
dfm = pd.merge(df[df.time == -4],df[df.time == -6],on=["group1","group2"])
dfm['Div'] = dfm.col_y.div(dfm.col_x)
df = pd.merge(df,dfm[['group1','group2','Div']],on=["group1","group2"])
Output:
   col group1 group2  time       Div
0    1      A      C    -6  0.333333
1    2      A      C    -5  0.333333
2    3      A      C    -4  0.333333
3    4      A      C    -3  0.333333
4    5      A      C    -2  0.333333
5    6      A      E    -6  0.750000
6    7      A      E    -3  0.750000
7    8      A      E    -4  0.750000
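A related variant, also without groupby, relies on index alignment instead of a merge, which avoids having to remember which of col_x / col_y is the numerator. This is a sketch rather than a drop-in replacement: the names keys, num, den and ratio are illustrative, and it assumes each (group1, group2) pair has at most one row per time value, since duplicate index entries would make the alignment ambiguous.

```python
import pandas as pd

df = pd.DataFrame({'group1': ['A'] * 8,
                   'group2': ['C', 'C', 'C', 'C', 'C', 'E', 'E', 'E'],
                   'time': [-6, -5, -4, -3, -2, -6, -3, -4],
                   'col': [1, 2, 3, 4, 5, 6, 7, 8]})

keys = ['group1', 'group2']

# col at each time point, indexed by the group keys
num = df.loc[df['time'] == -6].set_index(keys)['col']
den = df.loc[df['time'] == -4].set_index(keys)['col']

# Division aligns on the (group1, group2) index; unmatched groups become NaN
ratio = (num / den).rename('Div')

# Attach the per-group ratio to every row, keeping the original row order
result = df.join(ratio, on=keys)
```

Groups that lack either time point end up with NaN in Div rather than being dropped, which differs from the inner merge above.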
Upvotes: 3