Reputation: 273
Hi, I am trying to make some contingency tables. I want this in a function so I can reuse it for various columns, dataframes, and combinations.
Currently I have a dataframe that looks like this:
import pandas as pd

df = pd.DataFrame(data={'group': ['A', 'A', 'B', 'B', 'C', 'D'],
                        'class': ['g1', 'g2', 'g2', 'g3', 'g1', 'g2'],
                        'total': ['0-10', '20-30', '0-10', '30-40', '50-60', '20-30'],
                        'sub': ['1-4', '5-9', '10-14', '15-19', '1-4', '15-19'],
                        'n': [3, 14, 12, 11, 21, 9]})
and a function that looks like this
def cts(tabs, df):
    out = []
    for col in df.loc[:, df.columns != tabs]:
        a = pd.crosstab([df[tabs]], df[col])
        out.append(a)
    return out

cts('group', df)
which works for cross tabulating one column against the rest. But I want to add two (or more!) levels to the grouping, e.g.
pd.crosstab([df['group'], df['class']], df['total'])
where total is cross tabulated against both group and class.
I think the 'tabs' argument of the function should be a list of column names, but when I try to make it a list I get invalid syntax errors. I hope this makes sense. Thank you!
Upvotes: 0
Views: 154
Reputation: 4229
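Try passing tabs as a list of column names and building the list of grouping Series inside the function:
def cts(tabs, df):
    # tabs must be a list of column names, e.g. ['group', 'class']
    out = []
    # cross tabulate against every column that is not a grouping column
    cols = [col for col in df.columns if col not in tabs]
    for col in df.loc[:, cols]:
        a = pd.crosstab([df[tab] for tab in tabs], df[col])
        out.append(a)
    return out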
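For reference, here is a minimal usage sketch against the dataframe from the question. The isinstance check that also accepts a single column name is my own addition (an assumption about what is convenient, not part of the function above), and the variable name tables is just illustrative.

```python
import pandas as pd

df = pd.DataFrame(data={
    'group': ['A', 'A', 'B', 'B', 'C', 'D'],
    'class': ['g1', 'g2', 'g2', 'g3', 'g1', 'g2'],
    'total': ['0-10', '20-30', '0-10', '30-40', '50-60', '20-30'],
    'sub': ['1-4', '5-9', '10-14', '15-19', '1-4', '15-19'],
    'n': [3, 14, 12, 11, 21, 9],
})

def cts(tabs, df):
    # Assumption: wrap a bare string so cts('group', df) keeps working
    if isinstance(tabs, str):
        tabs = [tabs]
    out = []
    for col in df.columns:
        if col in tabs:
            continue  # skip the grouping columns themselves
        # one crosstab per remaining column, indexed by all grouping columns
        out.append(pd.crosstab([df[tab] for tab in tabs], df[col]))
    return out

# Two grouping levels: one table each for 'total', 'sub' and 'n'
tables = cts(['group', 'class'], df)
print(tables[0])
```

Each table in the returned list has a row index built from group and class, and its columns come from one of the remaining columns, in the order they appear in df.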
Upvotes: 1