GDelavy

Reputation: 1

Pandas create graph with groupby

I have a dataframe containing sentences taken from a chapter of a book, each one annotated with an emotion (anger, sadness, etc.). The result is something like this:

import pandas as pd

d = {'text': ["aaa", "aaa", "bbb", "aaa", "bbb", "bbb"],
    'start': [0, 1, 0, 2, 1, 0],
    'end': [250, 500, 501, 251, 249, 499],
    'label': ["anger", "sadness", "sadness", "sadness", "anger", "anger"],
    'annotator': [0, 1, 1, 1, 0, 0],
    'original_data': ["aaaaaa", "bbbbbb", "aaaaaa", "bbbbbb", "aaaaaa", "bbbbbb"],
    'speaker': ["Achiles", "Hektor", "Achiles", "Achiles", "Hektor", "Hektor"],
    'rounded_length': [110, 250, 250, 110, 110, 250]}

df = pd.DataFrame(data=d)

Here's a picture of the actual dataframe if you'd like a better idea of what it looks like.

I am trying to create a bar plot where each bar represents the number of times an emotion (label) occurs in a paragraph (original_data): one bar for sadness, one bar for anger, and so on.

Here's what I'd like it to look like

The problem is that the line I am using doesn't seem to work:

graph = df.groupby(['original_data', 'rounded_length']).plot(y='label', x='rounded_length')

I'd appreciate any form of help, thank you!

Upvotes: 0

Views: 438

Answers (2)

Tranbi

Reputation: 12701

Based on your picture, it seems that what you need is a histogram. The pandas hist method accepts a by keyword argument for this purpose:

df.hist(column='label', by='original_data', sharex=True, sharey=True)

Edit: if you want all histograms on a shared axis, you can use seaborn:

import seaborn as sns
sns.countplot(data=df, x='original_data', hue='label')
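
To check the numbers such a count plot would draw without rendering anything, the same tally can be computed as a table. This is only a minimal sketch, assuming the sample data from the question reduced to the two columns involved; pd.crosstab is swapped in here as a convenient way to count, and the name tally is made up for illustration.

```python
import pandas as pd

# Sample data from the question, reduced to the two columns being tallied.
df = pd.DataFrame({
    "label": ["anger", "sadness", "sadness", "sadness", "anger", "anger"],
    "original_data": ["aaaaaa", "bbbbbb", "aaaaaa", "bbbbbb", "aaaaaa", "bbbbbb"],
})

# One row per paragraph, one column per emotion; each cell is an occurrence count.
# "tally" is an illustrative name, not something from the answer.
tally = pd.crosstab(df["original_data"], df["label"])
print(tally)
```

Each row of the table corresponds to one group on the x axis (a paragraph), and each column to one hue (an emotion).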

Upvotes: 1

creanion

Reputation: 2743

You could do it with pandas plots in this way:

(df.groupby(["label", "original_data"])
   # We just need the count, so take text to count entries in there
   .text.count()
   # Unstack to make columns out of this
   .unstack("original_data")
   .plot.bar());

Barplot

The idea here is to first look at the result of the groupby operation and make sure you get a dataframe that will plot well. In this case I used unstack to turn the facet that I wanted in different colors (hues) into columns.
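
To make that inspection step concrete, here is a small sketch that stops before plotting and prints the intermediate table. It assumes the sample data from the question, trimmed to the columns the chain touches; the names counts and wide are made up for illustration.

```python
import pandas as pd

# Sample data from the question, trimmed to the columns used below.
df = pd.DataFrame({
    "text": ["aaa", "aaa", "bbb", "aaa", "bbb", "bbb"],
    "label": ["anger", "sadness", "sadness", "sadness", "anger", "anger"],
    "original_data": ["aaaaaa", "bbbbbb", "aaaaaa", "bbbbbb", "aaaaaa", "bbbbbb"],
})

# Long form: one count per (label, original_data) pair.
counts = df.groupby(["label", "original_data"])["text"].count()

# Wide form: labels stay in the index, paragraphs become columns.
wide = counts.unstack("original_data")
print(wide)
```

With the frame in this shape, calling wide.plot.bar() (which needs matplotlib installed) draws one group of bars per label and one colored bar per paragraph.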

Personally I think these kinds of plots are easier to create in seaborn.

Upvotes: 0