Reputation: 503
I want a grouped bar chart, but the default plot doesn't have the groupings the way I'd like, and I'm struggling to get them rearranged properly.
The dataframe looks like this:
    user  year  cat1  cat2  cat3  cat4  cat5
0   Brad  2014   309   186   119   702    73
1   Brad  2015   280   177   100   625    75
2   Brad  2016   306   148   127   671    74
3  Brian  2014   298   182   131   702    73
4  Brian  2015   295   125   117   607    76
5  Brian  2016   298   137    97   596    75
6  Chris  2014   309   171   111   654    72
7  Chris  2015   251   146   105   559    76
8  Chris  2016   231   130   105   526    75
etc
Elsewhere, the code produces two variables, user1 and user2. I want to produce a bar chart that compares the numbers for those two users over time in cat1, cat2, and cat3. So for example if user1 and user2 were Brian and Chris, I would want a chart that looks something like this:
On an aesthetic note: I'd prefer the year labels be vertical text or a font size that fits on a single line, but it's really the dataframe pivot that's confusing me at the moment.
Upvotes: 1
Views: 598
Reputation: 29711
Select the subset of users you want to compare, then use pivot_table to reshape the DataFrame into the layout needed for plotting by transposing and unstacking it.
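import matplotlib.pyplot as plt

def select_user_plot(user_1, user_2, cats, frame, idx, col):
    # Keep only the rows for the two users being compared.
    frame = frame[(frame[idx[0]] == user_1) | (frame[idx[0]] == user_2)]
    # Pivot to one row per user, then transpose and unstack so that rows are
    # categories and columns are (user, year) pairs.
    frame_pivot = frame.pivot_table(index=idx, columns=col, values=cats).T.unstack()
    frame_pivot.plot.bar(legend=True, cmap=plt.get_cmap('RdYlGn'), figsize=(8, 8), rot=0)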
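To see what the reshaping step produces before anything is drawn, here is a minimal sketch that runs the same pivot_table / transpose / unstack chain on the Brian and Chris rows copied from the question. The variable names (pivoted, reshaped) are only for illustration and are not part of the answer's function.

```python
import pandas as pd

# Rows for Brian and Chris, copied from the question's dataframe.
df = pd.DataFrame({
    "user": ["Brian"] * 3 + ["Chris"] * 3,
    "year": [2014, 2015, 2016] * 2,
    "cat1": [298, 295, 298, 309, 251, 231],
    "cat2": [182, 125, 137, 171, 146, 130],
    "cat3": [131, 117, 97, 111, 105, 105],
})

cats = ["cat1", "cat2", "cat3"]

# One row per user; columns become a (category, year) MultiIndex.
pivoted = df.pivot_table(index=["user"], columns=["year"], values=cats)

# Transpose, then move the year level back into the columns:
# rows are categories, columns are (user, year) pairs.
reshaped = pivoted.T.unstack()
print(reshaped)
```

Each row of reshaped becomes one group on the x-axis and each (user, year) column becomes one bar in that group, which is why the years end up in the legend. Note that pivot_table aggregates with the mean by default, so the values come back as floats.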
Finally, choose the users and categories to include in the bar plot:
user_1 = 'Brian'
user_2 = 'Chris'
cats = ['cat1', 'cat2', 'cat3']
select_user_plot(user_1, user_2, cats, frame=df, idx=['user'], col=['year'])
Note: This gives a plot close to the one the OP posted, except that the years appear in the legend instead of as tick labels.
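If the years should appear on the tick labels rather than in the legend, one possible variation is to skip the unstack step so that each (category, year) pair becomes its own group with one bar per user. The sketch below is an assumption about the desired layout, since the OP's target image is not reproduced here; plot_users_by_year is a hypothetical helper name, and rot=90 is used to get the vertical labels the OP mentioned.

```python
import pandas as pd

def plot_users_by_year(frame, users, cats):
    """Hypothetical helper: one tick per (category, year), one bar per user."""
    subset = frame[frame["user"].isin(users)]
    # Rows: (category, year) pairs; columns: users.
    by_cat_year = subset.pivot_table(index="user", columns="year", values=cats).T
    # Flatten the MultiIndex into single-line labels such as "cat1 2014".
    by_cat_year.index = [f"{cat} {year}" for cat, year in by_cat_year.index]
    # rot=90 draws the tick labels vertically.
    ax = by_cat_year.plot.bar(rot=90, figsize=(8, 8))
    return by_cat_year, ax

# Rows for Brian and Chris, copied from the question's dataframe.
df = pd.DataFrame({
    "user": ["Brian"] * 3 + ["Chris"] * 3,
    "year": [2014, 2015, 2016] * 2,
    "cat1": [298, 295, 298, 309, 251, 231],
    "cat2": [182, 125, 137, 171, 146, 130],
    "cat3": [131, 117, 97, 111, 105, 105],
})

table, ax = plot_users_by_year(df, ["Brian", "Chris"], ["cat1", "cat2", "cat3"])
```

Call matplotlib.pyplot.show() afterwards to display the figure when running outside a notebook.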
Upvotes: 2