Simd

Reputation: 21223

How to draw a graphical count table in pandas

I have a dataframe df with two string-valued columns, customer1 and customer2. I would like to make a square graphical representation of the count for each pair of values from those two columns.

I can do

df[['customer1', 'customer2']].value_counts()

which gives me the counts. But how can I make something that looks a little like this from the result?

[image]

I can't provide my real dataset, but here is a toy example with three labels in CSV format.

customer1,customer2
a,b
a,c
a,c
b,a
b,c
b,c
c,c
a,a
b,c
b,c

Upvotes: 3

Views: 1341

Answers (2)

MaxU - stand with Ukraine

Reputation: 210832

UPDATE:

Is it possible to sort the rows/columns so the highest-count rows are at the top? In this case the order would be b, a, c.

IIUC, you can do it this way:

In [80]: x = df.pivot_table(index='customer1',columns='customer2',aggfunc='size',fill_value=0)

In [81]: idx = x.max(axis=1).sort_values(ascending=False).index

In [82]: idx
Out[82]: Index(['b', 'a', 'c'], dtype='object', name='customer1')

In [87]: sns.heatmap(x[idx].reindex(idx), annot=True)
Out[87]: <matplotlib.axes._subplots.AxesSubplot at 0x9ee3f98>
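The reordering step above can be tried without any plotting. The following is a minimal sketch using the toy data from the question; the names `rows`, `counts`, `order`, and `sorted_counts` are made up for illustration. It ranks the labels by each row's largest count, as in the session above, and then applies the same order to both rows and columns.

```python
import pandas as pd

# Toy data from the question.
rows = [("a", "b"), ("a", "c"), ("a", "c"), ("b", "a"), ("b", "c"),
        ("b", "c"), ("c", "c"), ("a", "a"), ("b", "c"), ("b", "c")]
df = pd.DataFrame(rows, columns=["customer1", "customer2"])

# Count each (customer1, customer2) pair; absent pairs become 0.
counts = df.pivot_table(index="customer1", columns="customer2",
                        aggfunc="size", fill_value=0)

# Rank the row labels by their largest count, highest first.
order = counts.max(axis=1).sort_values(ascending=False).index

# Apply the same label order to rows and columns.
# This assumes every row label also exists as a column label.
sorted_counts = counts.loc[order, order]
print(sorted_counts)
```

If ranking by the row total fits your data better than the row maximum, `counts.sum(axis=1)` can be used in place of `counts.max(axis=1)`; for this toy data both give the same order.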


OLD answer:

You can use the heatmap() function from the seaborn module:

In [42]: import seaborn as sns

In [43]: df
Out[43]:
  customer1 customer2
0         a         b
1         a         c
2         a         c
3         b         a
4         b         c
5         b         c
6         c         c
7         a         a
8         b         c
9         b         c

In [44]: x = df.pivot_table(index='customer1',columns='customer2',aggfunc='size',fill_value=0)

In [45]: x
Out[45]:
customer2  a  b  c
customer1
a          1  1  2
b          1  0  4
c          0  0  1

In [46]: sns.heatmap(x)
Out[46]: <matplotlib.axes._subplots.AxesSubplot at 0xb150b70>


or with annotations:

In [48]: sns.heatmap(x, annot=True)
Out[48]: <matplotlib.axes._subplots.AxesSubplot at 0xc596d68>
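One caveat: the pivoted table for the toy data is square only because every label occurs in both columns. With real data that may not hold. The sketch below is one possible way to force a square table, assuming you want every label on both axes. The small dataframe is hypothetical (the label "d" appears only in customer2), and `labels` and `square` are illustrative names.

```python
import pandas as pd

# Hypothetical data: label "d" occurs only in customer2.
df = pd.DataFrame({
    "customer1": ["a", "a", "b"],
    "customer2": ["b", "d", "a"],
})

# Pair counts from value_counts, reshaped into a 2-D table.
counts = df[["customer1", "customer2"]].value_counts().unstack(fill_value=0)

# Every label seen in either column, in a fixed order.
labels = sorted(set(df["customer1"]) | set(df["customer2"]))

# Reindex both axes to the full label set; missing rows/columns become 0.
square = counts.reindex(index=labels, columns=labels, fill_value=0)
print(square)
```

The resulting `square` table can be passed to `sns.heatmap` in the same way as `x` above.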


Upvotes: 2

ode2k

Reputation: 2723

As @MaxU mentioned, seaborn.heatmap should work. It accepts a pandas DataFrame directly as input and uses the index and column labels for the axis ticks.

seaborn.heatmap(data, vmin=None, vmax=None, cmap=None, center=None, robust=False, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', cbar=True, cbar_kws=None, cbar_ax=None, square=False, ax=None, xticklabels=True, yticklabels=True, mask=None, **kwargs)

https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.heatmap.html#seaborn.heatmap
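As a rough illustration of a few parameters from that signature, here is a small sketch on the question's toy data. The particular choices (`cmap="Blues"`, `linewidths=0.5`, the title) are arbitrary, and the cast to `int` is only there so that `fmt="d"` is safe regardless of the dtype the pivot returns.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data from the question.
rows = [("a", "b"), ("a", "c"), ("a", "c"), ("b", "a"), ("b", "c"),
        ("b", "c"), ("c", "c"), ("a", "a"), ("b", "c"), ("b", "c")]
df = pd.DataFrame(rows, columns=["customer1", "customer2"])

# Pair counts as a 2-D table of integers.
counts = df.pivot_table(index="customer1", columns="customer2",
                        aggfunc="size", fill_value=0).astype(int)

fig, ax = plt.subplots()
sns.heatmap(
    counts,
    annot=True,      # write the count inside each cell
    fmt="d",         # format annotations as integers
    square=True,     # make each cell square
    cmap="Blues",    # colormap (arbitrary choice)
    linewidths=0.5,  # thin lines between cells
    cbar=True,       # show the colorbar
    ax=ax,
)
ax.set_title("Pair counts")
```

Call `plt.show()` to display the figure interactively, or `fig.savefig("counts.png")` to write it to a file.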

Upvotes: 0
