Reputation: 99
Let's say i have a dataframe:
df = pd.DataFrame(np.random.randint(0,5, size=(5,6)), columns=list('ABCDEF'))
Crossing variables with pd.crosstab
is simple enough:
table = pd.crosstab(index=df['A'], columns=df['B'])
Yields:
B 1 2 3 4
A
0 1 0 0 0
1 0 0 0 1
2 0 1 1 0
3 0 1 0 0
Where I would for example want a table like this:
B (1+2+3) 1 2 3 4
A
0 1 1 0 0 0
1 0 0 0 0 1
2 2 0 1 1 0
3 1 0 1 0 0
Can anyone set me on the right track here?
Upvotes: 2
Views: 3045
Reputation: 862741
Use sum
with subset, but if use small random df there can be problem you get always another values so values of columns will be different. If use np.random.seed(100)
get same test output as my answer.
table['(1+2+3)'] = table[[1,2,3]].sum(axis=1)
Sample:
np.random.seed(100)
df = pd.DataFrame(np.random.randint(0,5, size=(5,6)), columns=list('ABCDEF'))
table = pd.crosstab(index=df['A'], columns=df['B'])
table['(1+2+3)'] = table[[1,2,3]].sum(axis=1)
print (table)
B 0 1 2 3 4 (1+2+3)
A
0 1 0 0 0 1 0
1 0 0 0 1 0 1
2 0 0 1 0 0 1
3 0 1 0 0 0 1
Upvotes: 2