simplymaxim

Reputation: 39

How to plot value counts for each subset in matplotlib/seaborn?

I am relatively new to matplotlib, so there is probably a better way to approach this problem. I tried sns.countplot(), which does not seem to have a sorting option, so I tried to do it with a bar plot, using pandas for the counting:

my_data = pd.DataFrame({'actions': ['buy','buy','buy','observe','consult'] , 'places':['NY','AR','AR','NY','AR']})
fig, axs = plt.subplots(1, 2, figsize = (5,7))
axs = axs.ravel()
for place in my_data['places']: 
    x = 0 
    temp_df = my_data[my_data['places'] == place]
    axs[x] = sns.barplot(y=temp_df.actions.value_counts().index, x=temp_df.actions.value_counts().values, color="#43B8E7",orient = 'h')
    axs[x].set_title(place)
    x=+1

where data look like

   actions places
0      buy     NY
1      buy     AR
2      buy     AR
3  observe     NY
4  consult     AR

and the code produces a figure in which NY is missing. I need to plot NY as well, but because of the subsetting, or something I missed in the loop, it does not work. How can I fix that? I feel this should be an easy one, but I cannot find the mistake.

Upvotes: 1

Views: 1873

Answers (2)

Paul H

Reputation: 68256

I would use a FacetGrid (via seaborn.catplot) since you're already using seaborn:

import pandas
import seaborn

axgrid = pandas.DataFrame({
    'actions': ['buy','buy','buy','observe','consult'] ,
    'places':['NY','AR','AR','NY','AR']
}).pipe((seaborn.catplot, 'data'), 
        y="actions", col="places",
        order=['buy', 'consult', 'observe'],
        kind="count"
)

And you get one count plot per place, with the actions listed in the same order in each panel.
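
Since the question asks about sorting, here is a minimal sketch of deriving the `order` list from the data instead of hard-coding it. It assumes the same toy `my_data` frame as the question; the variable names `order` and `grid` are only illustrative, and the `Agg` backend line is there just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

my_data = pd.DataFrame({
    "actions": ["buy", "buy", "buy", "observe", "consult"],
    "places": ["NY", "AR", "AR", "NY", "AR"],
})

# value_counts() sorts by frequency, so the most common action comes first
order = my_data["actions"].value_counts().index.tolist()

# One panel per place; every panel uses the same, count-sorted order
grid = sns.catplot(data=my_data, y="actions", col="places",
                   order=order, kind="count")
```

Note that the order here is computed over the whole frame, so each panel shares it; a per-place sort would need a separate order for each subplot.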

Upvotes: 1

Quang Hoang

Reputation: 150815

Are you looking for:

(my_data.groupby('places')['actions']
    .value_counts().unstack('places')
    .plot.bar(subplots=True)
)

Or similarly:

(pd.crosstab(my_data['actions'], my_data['places'])
    .plot.bar(subplots=True)
)
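
To see what is actually being plotted, it may help to look at the intermediate table first. This is a small sketch on the question's toy data (the names `counts` and `alt` are illustrative only): both snippets build an actions-by-places table, and `subplots=True` then draws one subplot per column.

```python
import pandas as pd

my_data = pd.DataFrame({
    "actions": ["buy", "buy", "buy", "observe", "consult"],
    "places": ["NY", "AR", "AR", "NY", "AR"],
})

# Rows are actions, columns are places, cells are counts
counts = pd.crosstab(my_data["actions"], my_data["places"])
print(counts)

# The groupby variant builds the same table, except that combinations
# which never occur show up as NaN rather than 0
alt = my_data.groupby("places")["actions"].value_counts().unstack("places")
print(alt)
```

Because every column shares the same index, the bars for each action line up across the subplots.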


If you want horizontal bars:

(pd.crosstab(my_data['actions'], my_data['places'])
    .plot.barh(subplots=True, layout=[1,2])
)


Or we can fix your code:

fig, axs = plt.subplots(1, 2, figsize = (5,7))
axs = axs.ravel()
for ax,place in zip(axs,my_data['places'].unique()): 
    temp_df = my_data[my_data['places'] == place].actions.value_counts()
    sns.barplot(y=temp_df.index, x=temp_df, 
                color="#43B8E7", ax=ax, orient = 'h')
    ax.set_title(place)

The resulting subplots aren't very well aligned, IMHO, because each one only contains the actions that occur in that place.
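
For reference, the original loop went wrong in several places: `x = 0` sits inside the loop, so it is reset on every pass; `x=+1` assigns positive one instead of incrementing (`x += 1`); the loop runs over every row's place rather than the unique places; and `sns.barplot` without `ax=` draws on the current axes, so assigning its return value to `axs[x]` does not choose the target subplot. If alignment matters, one possible variation on the fixed loop is to reindex each subset's counts to a shared order. This is only a sketch; `order` and `counts` are names introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

my_data = pd.DataFrame({
    "actions": ["buy", "buy", "buy", "observe", "consult"],
    "places": ["NY", "AR", "AR", "NY", "AR"],
})

# Shared category order across all subplots, most frequent action first
order = my_data["actions"].value_counts().index

places = my_data["places"].unique()
fig, axs = plt.subplots(1, len(places), figsize=(7, 3))
for ax, place in zip(axs, places):
    # Count actions for this place; actions absent here get a count of 0
    counts = (my_data.loc[my_data["places"] == place, "actions"]
              .value_counts()
              .reindex(order, fill_value=0))
    sns.barplot(x=counts.values, y=counts.index,
                color="#43B8E7", orient="h", ax=ax)
    ax.set_title(place)
fig.tight_layout()
```

Every subplot now has one row per action, in the same order, so the bars line up across places.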

Upvotes: 2
