Reputation: 879
I have a dictionary of single column dataframes (Di_G). I would like to use the index of each dataframe in Di_G to look up a value from another dictionary (Di_A). I would then like to divide each value of each single column dataframe in Di_G by its corresponding value from Di_A.
import pandas as pd
# Data
df_1 = pd.DataFrame({'Box' : [1006,1006,1006,1006,1006,1006,1007,1007,1007,1007,1008,1008,1008,1009,1009,1010,1011,1011,1012,1013],
'Item': [ 40, 41, 42, 43, 44, 45, 40, 43, 44, 45, 43, 44, 45, 40, 41, 40, 44, 45, 44, 45]})
df_2 = pd.DataFrame({'Box' : [1006,1007,1008,1009,1010,1011,1012,1013,1014],
'Type': [ 103, 101, 102, 102, 102, 103, 103, 103, 103]})
# Join
# 'outer' was being passed to set_index() rather than join(), so this was effectively a left join
df_J = df_1.set_index('Box').join(df_2.set_index('Box'), how='left')
# Count how many Boxes contain each Item - Count Boxes ( Item )
df_G = df_J.groupby('Item').size()
Di_A = df_G.to_dict()
# Group the Boxes by Type
Ma_G = df_J .groupby('Type')
Di_1 = {}
for name, group in Ma_G:
    Di_1[str(name)] = group
# Count how many Boxes of each Type contain each Item - Count Boxes ( Item │ Type )
Di_G = {}
for k in Di_1:
    Di_G[k] = Di_1[k].groupby('Item').size()
I tried doing it like this:
# Pr ( Type │ Item ) = Count Boxes ( Item │ Type ) / Count Boxes ( Item )
for k in Di_G:
    Di_G[k]['Pr'] = Di_G[k]['0'] / Di_G[k]['Index'].map(Di_A)
I got "KeyError: '0'".
I tried changing the column names in Di_G & Di_A but I had difficulty doing this.
Upvotes: 0
Views: 23
Reputation: 323226
I think you just need groupby with transform:
df_J.groupby(['Item','Type']).Item.transform('count')/df_J.groupby('Item').Item.transform('count')
Out[298]:
Box
1006 0.250000
1006 0.500000
1006 1.000000
1006 0.333333
1006 0.600000
1006 0.600000
1007 0.250000
1007 0.333333
1007 0.200000
1007 0.200000
1008 0.333333
1008 0.200000
1008 0.200000
1009 0.500000
1009 0.500000
1010 0.500000
1011 0.600000
1011 0.600000
1012 0.600000
1013 0.600000
Name: Item, dtype: float64
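Because transform returns a result with the same index as df_J, the ratio can be stored next to the rows it describes. A minimal sketch, assuming df_J is built as in the question (effectively a left join on Box); the column name Pr is borrowed from the question's attempt and df_P is just an illustrative name:

```python
import pandas as pd

# Same data as in the question
df_1 = pd.DataFrame({'Box':  [1006,1006,1006,1006,1006,1006,1007,1007,1007,1007,
                              1008,1008,1008,1009,1009,1010,1011,1011,1012,1013],
                     'Item': [  40,  41,  42,  43,  44,  45,  40,  43,  44,  45,
                                43,  44,  45,  40,  41,  40,  44,  45,  44,  45]})
df_2 = pd.DataFrame({'Box':  [1006,1007,1008,1009,1010,1011,1012,1013,1014],
                     'Type': [ 103, 101, 102, 102, 102, 103, 103, 103, 103]})
df_J = df_1.set_index('Box').join(df_2.set_index('Box'))

# Count Boxes ( Item | Type ), repeated on every row of df_J
per_item_type = df_J.groupby(['Item', 'Type'])['Item'].transform('count')
# Count Boxes ( Item ), repeated on every row of df_J
per_item = df_J.groupby('Item')['Item'].transform('count')

# Both results share df_J's index, so the ratio lines up row by row
df_P = df_J.assign(Pr=per_item_type / per_item)
print(df_P.head())
```

Each row of df_P then carries Pr ( Type │ Item ) for its own Item and Type.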
Also, to better match your expected output:
G=df_J.groupby(['Item','Type']).size()
# sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
G.div(G.groupby(level=0).sum(), level=0)
Out[303]:
Item Type
40 101 0.250000
102 0.500000
103 0.250000
41 102 0.500000
103 0.500000
42 103 1.000000
43 101 0.333333
102 0.333333
103 0.333333
44 101 0.200000
102 0.200000
103 0.600000
45 101 0.200000
102 0.200000
103 0.600000
dtype: float64
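If you would rather keep the dictionary layout from the question: the KeyError arises because groupby('Item').size() returns a Series, not a DataFrame, so there is no '0' column and no 'Index' column to select. A sketch of the same division done per dictionary entry, assuming df_J is built as in the question (effectively a left join on Box); Di_P is a hypothetical name for the result:

```python
import pandas as pd

# Same data as in the question
df_1 = pd.DataFrame({'Box':  [1006,1006,1006,1006,1006,1006,1007,1007,1007,1007,
                              1008,1008,1008,1009,1009,1010,1011,1011,1012,1013],
                     'Item': [  40,  41,  42,  43,  44,  45,  40,  43,  44,  45,
                                43,  44,  45,  40,  41,  40,  44,  45,  44,  45]})
df_2 = pd.DataFrame({'Box':  [1006,1007,1008,1009,1010,1011,1012,1013,1014],
                     'Type': [ 103, 101, 102, 102, 102, 103, 103, 103, 103]})
df_J = df_1.set_index('Box').join(df_2.set_index('Box'))

# Count Boxes ( Item ) as a plain dict: {Item: count}
Di_A = df_J.groupby('Item').size().to_dict()

# Count Boxes ( Item | Type ): one Series per Type, indexed by Item
Di_G = {str(t): g.groupby('Item').size() for t, g in df_J.groupby('Type')}

# Each value is a Series, so look the totals up through its index
Di_P = {}
for k, s in Di_G.items():
    totals = s.index.map(Di_A).to_numpy()  # per-Item totals, in the same order as s
    Di_P[k] = s / totals                   # Pr ( Type | Item ) for this Type

for k, s in Di_P.items():
    print(k)
    print(s)
```

The values agree with the G.div(...) result; they are just split into one Series per Type.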
Upvotes: 1