Mykola Zotko

Reputation: 17834

Crosstab with the same variables in rows and columns

I have the following dataframe:

       A      B     C
0   True  False  True
1  False   True  True
2   True   True  True
3   True  False  True

For every pair of columns among 'A', 'B' and 'C', I want to count the rows in which both columns are True. For example, 'A' and 'C' are both True in the first, third and fourth rows, so the count for that pair is 3.

Expected output:

   A  B  C
A  3  1  3
B  1  2  2
C  3  2  4

I have no idea how to achieve this with pandas. Can you also tell me whether this kind of crosstab has a special name?

Upvotes: 1

Views: 361

Answers (2)

Georgina Skibinski

Reputation: 13387

To add to @Andy L.'s answer: you don't have to convert the DataFrame to a NumPy array, because pandas supports the matrix product directly:

df = df.astype(int)  # True/False -> 1/0
res = df.T @ df      # entry (i, j) counts the rows where columns i and j are both True

Outputs:

   A  B  C
A  3  1  3
B  1  2  2
C  3  2  4
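
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same product. The names `counts` and `co_occurrence` are only illustrative, and the frame is assumed to contain nothing but boolean columns.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [True, False, True, True],
    "B": [False, True, True, False],
    "C": [True, True, True, True],
})

# Encode True/False as 1/0 so that products and sums act as counts
counts = df.astype(int)

# (columns x rows) @ (rows x columns) -> (columns x columns)
co_occurrence = counts.T @ counts
print(co_occurrence)
```

The diagonal holds the number of True values in each column, since a column always "co-occurs" with itself.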

Upvotes: 4

Andy L.

Reputation: 25249

Try NumPy. What you want is np.inner applied to the transposed data, i.e. the inner product of every pair of columns:

import numpy as np
arr = df.astype(int).T.to_numpy()  # each row of arr is one original column as 1/0
np.inner(arr, arr)

Out[1158]:
array([[3, 1, 3],
       [1, 2, 2],
       [3, 2, 4]])

import pandas as pd
df_final = pd.DataFrame(np.inner(arr, arr), columns=df.columns, index=df.columns)

Out[1160]:
   A  B  C
A  3  1  3
B  1  2  2
C  3  2  4
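
As for why this works: entry (i, j) of the inner product is the sum over rows of column i times column j, and with 1/0 values that product is 1 only where both are True. A table like this is commonly called a co-occurrence matrix. The sketch below checks the result against an explicit pairwise count; the name `pairwise` is just for illustration, and the sample frame is rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [True, False, True, True],
    "B": [False, True, True, False],
    "C": [True, True, True, True],
})

# Each row of arr is one original column encoded as 1/0
arr = df.astype(int).T.to_numpy()
inner = np.inner(arr, arr)

# Explicit version: for every pair of columns, count rows where both are True
pairwise = np.array(
    [[(df[a] & df[b]).sum() for b in df.columns] for a in df.columns]
)

# Both approaches should agree entry by entry
assert (inner == pairwise).all()
```

The explicit loop is slower on wide frames, but it makes the meaning of each cell easy to verify.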

Upvotes: 4
