Reputation: 65
I'm looking to plot my dataframe, which contains many columns each with a "TRUE" or "FALSE" label (imported from Excel).
A small example of something similar would be this:
import pandas as pd

df = pd.DataFrame({"a": ["TRUE", "FALSE", "FALSE", "TRUE", "FALSE"],
                   "b": ["TRUE", "TRUE", "FALSE", "TRUE", "TRUE"],
                   "c": ["TRUE", "FALSE", "FALSE", "FALSE", "TRUE"],
                   "d": ["FALSE", "FALSE", "TRUE", "TRUE", "FALSE"]})
I'm looking for a way to concisely summarize how the TRUE and FALSE values are distributed across the columns. Ideally, a graph like the one below would be created:
but I'm unsure how to create this. I've tried a list comprehension, such as
sns.barplot([list(df[i].value_counts()) for i in df.columns])
but I get something entirely different. I don't even need to know how to make the legend; I only included it in the example to better convey what I'm after.
Upvotes: 1
Views: 56
Reputation: 51165
You already had the more efficient approach in your attempt: computing value_counts on each series is what you want if this needs to scale to larger frames. You just need to change how the result is plotted.
f = pd.concat(
    [df[s].value_counts() for s in df], axis=1, sort=False)
f.plot(kind='bar')
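One caveat worth hedging: on pandas 2.0 and later, value_counts() names its result "count", so the concatenated columns may all come out labelled "count" instead of a, b, c, d. A minimal sketch that sidesteps this by keying the concat with a dict; the names counts and plot_counts are illustrative only and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["TRUE", "FALSE", "FALSE", "TRUE", "FALSE"],
    "b": ["TRUE", "TRUE", "FALSE", "TRUE", "TRUE"],
    "c": ["TRUE", "FALSE", "FALSE", "FALSE", "TRUE"],
    "d": ["FALSE", "FALSE", "TRUE", "TRUE", "FALSE"],
})

# Passing a dict makes the keys the column labels of the result,
# whatever name value_counts() assigns to each Series.
counts = pd.concat({col: df[col].value_counts() for col in df}, axis=1)

# A label missing from a column would show up as NaN; treat it as zero.
counts = counts.fillna(0).astype(int)


def plot_counts(table):
    """Grouped bar chart: one x-axis group per row of `table` (needs matplotlib)."""
    return table.plot(kind="bar", rot=0)


# plot_counts(counts)    # x-axis: TRUE / FALSE, one bar per column
# plot_counts(counts.T)  # x-axis: a, b, c, d, one bar per label
```

Transposing with counts.T is only a guess at the layout the question is after; use whichever orientation matches the intended chart.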
Upvotes: 2
Reputation: 150725
Here you go:
df.stack().groupby(level=1).value_counts().unstack(0).plot.bar()
Output:
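To see the table behind that chart, the one-liner can be split into its steps. This is just a sketch of the same chain with intermediate names (stacked, per_column, table, by_column) that I made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["TRUE", "FALSE", "FALSE", "TRUE", "FALSE"],
    "b": ["TRUE", "TRUE", "FALSE", "TRUE", "TRUE"],
    "c": ["TRUE", "FALSE", "FALSE", "FALSE", "TRUE"],
    "d": ["FALSE", "FALSE", "TRUE", "TRUE", "FALSE"],
})

# 1. Long format: a Series indexed by (row number, column name).
stacked = df.stack()

# 2. Count each label within each original column;
#    the result is indexed by (column name, label).
per_column = stacked.groupby(level=1).value_counts()

# 3. Move the column names (index level 0) into the columns,
#    leaving one row per label.
table = per_column.unstack(0)

# Optional: transpose so that a, b, c, d become the rows, which puts them
# on the x-axis when plotted with by_column.plot.bar().
by_column = table.T
```

Calling table.plot.bar() reproduces the one-liner's chart; by_column.plot.bar() groups the bars by column instead, assuming that is the layout wanted.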
Upvotes: 2