Harvey

Reputation: 339

Matplotlib bar chart from a two-valued column - Pandas data manipulation (mostly)

NOTE: Solution posted below.

I have a time-indexed table with a column BLIP that takes only two values, "XX" and "YY". The goal is to plot a count of "XX" and "YY" per period, with the "YY" bars drawn below the x axis. I'm trying to build the right data structure from a pandas table using code from Wes McKinney's book on data analysis (pg 26 I think):

df = base_df.drop(columns=dropcols).set_index('Created')
group = ['f2','BLIP']
df0 = df_minus.groupby(group)
agg_counts = df0.size().unstack().fillna(0)
indexer = agg_counts.sum(1).argsort()
count_subset = agg_counts.take(indexer).copy()
table = count_subset.groupby('BLIP').resample('MS').count().unstack('BLIP')['BLIP']
chart = table.plot.bar(title = chart_title, x=None, color = ['green', 'red', 'grey']);

The line

agg_counts = df0.size().unstack().fillna(0) 

results in the following error:

TypeError: 'numpy.int32' object is not callable

I found this gem of a snippet here, but can't find the documentation to decipher it.

data['values'].plot(kind='bar', color=data.positive.map({True: 'g', False: 'r'}))
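For reference, one reading of that snippet, sketched with made-up numbers: `Series.map` with a dict looks up each element, so a boolean column becomes a column of color codes, one per bar. The frame `data` and its `values` and `positive` columns are taken from the snippet; the sample values and the `.tolist()` call are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical frame shaped like the one the snippet expects.
data = pd.DataFrame({"values": [3, -2, 5, -1]})
data["positive"] = data["values"] > 0

# Series.map with a dict replaces each element by its lookup: True -> 'g', False -> 'r'.
colors = data["positive"].map({True: "g", False: "r"})

# Passing a list with one color per row colors each bar individually.
ax = data["values"].plot(kind="bar", color=colors.tolist())
```

Note that this colors bars by sign; it still needs a numeric column (such as counts) to plot, which the raw BLIP strings are not.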

This seems like it would be very simple, but I'm quite wrapped around the axle on this.

Target Image

The pandas table format is something like

create_date f1 f2 f3 BLIP f5...
dt_stamp    X  Y  Z  XX   K1
dt_stamp    S  R  Y  YY   K3
dt_stamp    P  P  T  XX   K1

and so on.

Per Jesse's suggestion, I tried:

df_plus =df[df['BLIP']=='XX']
df_minus=df[df['BLIP']=='YY']

ax = plt.axes()
ax.bar(df_plus.index, df_plus['BLIP'], width=0.4, color='g')
ax.bar(df_minus.index, df_minus['BLIP'], width=0.4, color='r')
ax.autoscale()
plt.show()

This resulted in

ValueError: shape mismatch: objects cannot be broadcast to a single shape

Solution in its entirety

import matplotlib.pyplot as plt

# base_df must have a DatetimeIndex (e.g. via set_index('Created')) for resample to work
df = base_df
chart_title = 'YTD CR Trend Summary'
width, height = 8, 6  # figure size in inches

df_plus = df[df['BLIP'] == 'XX']
df_minus = df[df['BLIP'] == 'YY']

# Monthly counts ('MS' = month start); 'YY' counts are negated so they plot below the x axis
p = df_plus.resample('MS').count()['BLIP']
n = -df_minus.resample('MS').count()['BLIP']
print(chart_title, len(df), p.sum(), n.sum())

plt.clf()
fig = plt.figure()
fig.set_size_inches(width, height)
ax = plt.axes(label=chart_title)  # label suppresses warning
if p.sum() != 0:
    ax.bar(p.index, p, width=10, color='g')  # width is in days on a date axis
if n.sum() != 0:
    ax.bar(n.index, n, width=10, color='r')
plt.suptitle(chart_title, fontsize=11)
# graph_images_dir is a directory path (ending in a separator) defined elsewhere
filename = f'{graph_images_dir}{chart_title}.png'
print(f'Saving {filename}')
plt.savefig(filename, bbox_inches='tight', pad_inches=0.5, dpi=200)
plt.show()

Upvotes: 0

Views: 1198

Answers (1)

Jesse Bakker

Reputation: 2623

You can plot it manually using matplotlib:

import matplotlib.pyplot as plt

ax = plt.axes()
ax.bar(table.index, table['XX'], width=0.4, color='g')
ax.bar(table.index, table['YY'], width=0.4, color='r')
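
The two `ax.bar` calls above assume a `table` indexed by date with one numeric column per BLIP value. A minimal sketch of one way to build it, using made-up rows; the sample dates, the use of `pd.Grouper`, and the bar width are assumptions for illustration and not taken from the question:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the real, datetime-indexed data.
df = pd.DataFrame(
    {"BLIP": ["XX", "YY", "XX", "XX", "YY", "YY"]},
    index=pd.to_datetime(
        ["2019-01-03", "2019-01-15", "2019-01-20",
         "2019-02-02", "2019-02-10", "2019-02-25"]
    ),
)

# Count rows per month start ('MS') and per BLIP value, one column per value.
table = (
    df.groupby([pd.Grouper(freq="MS"), "BLIP"])
    .size()
    .unstack(fill_value=0)
)
table["YY"] = -table["YY"]  # negate so the YY bars hang below the x axis

fig, ax = plt.subplots()
# On a datetime axis the bar width is measured in days.
ax.bar(table.index, table["XX"], width=10, color="g")
ax.bar(table.index, table["YY"], width=10, color="r")
```

Because both columns share the same monthly index, the green and red bars line up on the same x positions, one above and one below zero.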

Upvotes: 1
