Max Payne

Reputation: 389

How to create a custom Python chart using Matplotlib

I am trying to build a highly customized chart using Matplotlib (I am open to using any other library).

The data I have looks like this:

ItemID | ItemPhase | ItemStatus | ItemOutcome |  Date      
  1        Phase1     Complete       In         01-02-2011
  2        Phase2       WIP          WIP        01-03-2014
  3        Phase1     Complete       Out        05-02-2010
  4        Phase3       WIP          WIP        01-04-2015
  5        Phase2     Complete       In         01-05-2012
  6        Phase2       WIP          WIP        01-02-2013
  7        Phase3     Complete       In         01-06-2015
  8        Phase2     Complete       Out        01-07-2013

The idea of the chart is to show, for each Phase, progress against the Items that have been completed. Every time an item is completed, an outcome is determined; if the item hasn't been completed, there is no outcome.

The Date is only used to determine the ItemPhase.

I would like the chart to look like this: [image: desired chart layout]

As you can see from the image, the Item Outcomes section is derived from the results in the Items Status section.

I have struggled to get started or even to work out how to build this, so any help would be much appreciated.

Thanks for your support!

Upvotes: 3

Views: 417

Answers (1)

filippo

Reputation: 5294

Here are some pointers to get you started.

Let's say you have your dataframe as df and you want to plot the ItemStatus cells for Phase2:

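import matplotlib.pyplot as plt

df = df[df['ItemPhase'] == 'Phase2']  # keep only the Phase2 rows
total = df['ItemStatus'].count()      # number of items in this phase
fig, ax = plt.subplots()              # the axes used by the snippets below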
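If you want to try this out without your real data, here is a minimal sketch that rebuilds the table from the question and applies the same Phase2 filter. The variable names `items` and `phase2` are only illustrative, and the dates are kept as plain strings because the question does not say whether they are day-first or month-first.

```python
import pandas as pd

# Rows copied from the table in the question. Dates stay as strings since
# their format (day-first or month-first) is not stated.
items = pd.DataFrame({
    "ItemID": [1, 2, 3, 4, 5, 6, 7, 8],
    "ItemPhase": ["Phase1", "Phase2", "Phase1", "Phase3",
                  "Phase2", "Phase2", "Phase3", "Phase2"],
    "ItemStatus": ["Complete", "WIP", "Complete", "WIP",
                   "Complete", "WIP", "Complete", "Complete"],
    "ItemOutcome": ["In", "WIP", "Out", "WIP", "In", "WIP", "In", "Out"],
    "Date": ["01-02-2011", "01-03-2014", "05-02-2010", "01-04-2015",
             "01-05-2012", "01-02-2013", "01-06-2015", "01-07-2013"],
})

# Same filter as above: keep the Phase2 rows and count them.
phase2 = items[items["ItemPhase"] == "Phase2"]
total = phase2["ItemStatus"].count()
print(total)
```

With the sample rows, the Phase2 subset contains items 2, 5, 6 and 8.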

The first thing you can plot is the dotted bar that holds the item count. If each status cell is 1.0 high (to be split between the Complete and WIP fractions), we can allocate about 40% more (height=1.4) to leave room for the label.

ax.bar(0, 1.4, width=1, edgecolor='black', lw=1, ls='dotted', color="white")

Now let's plot the main part. You want a first bar that goes from 0 to the frequency of one status (say ItemStatus == WIP) and a second bar that starts at that frequency and goes up to 1. You can get each status count with value_counts and divide by total to get the fractions.

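# Example colors (an assumption: the snippet needs bg and fg but never defines them)
bg = ['tab:blue', 'white']   # bar fill colors
fg = ['white', 'black']      # matching text colors

bottom = 0
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for i, (label, count) in enumerate(df['ItemStatus'].value_counts().items()):
    freq = count / float(total)

    # Stack this bar on top of the previous one
    r, = ax.bar(0, freq, width=1, bottom=bottom, color=bg[i], edgecolor='black', lw=3)

    # Center the label inside the bar
    ax.text(r.get_x() + r.get_width()/2.,
            r.get_y() + r.get_height()/2.,
            '{}% ({}) Items {}'.format(int(freq * 100), count, label),
            ha="center", va="center", color=fg[i])

    bottom += freq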
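As a side note, the fractions and the bar bottoms can also be computed up front. The sketch below uses `value_counts(normalize=True)` and a cumulative sum on a small hand-written Series standing in for the Phase2 statuses from the question. Since `value_counts` orders by count, the stacking order can differ from phase to phase; `sort_index()` is one way to pin it down.

```python
import pandas as pd

# Statuses of the Phase2 rows in the question (items 2, 5, 6 and 8).
status = pd.Series(["WIP", "Complete", "WIP", "Complete"], name="ItemStatus")

# normalize=True returns fractions that sum to 1 instead of raw counts.
# sort_index() fixes the order alphabetically ("Complete" before "WIP").
freqs = status.value_counts(normalize=True).sort_index()

# Each bar starts where the bars below it end: running total minus own height.
bottoms = freqs.cumsum() - freqs

for label in freqs.index:
    print(label, freqs[label], bottoms[label])
```

`freqs` and `bottoms` can then be passed as the height and `bottom` arguments of `ax.bar`.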

Now you just need the "n Items" label. You can use the coordinates of the last bar plotted, r, to center it properly.

ax.text(r.get_x() + r.get_width()/2., 1.2,
        '{} Items'.format(total),
        ha="center", va='center', color='black')

And here's what you get: [image: resulting status cell for Phase2]

Now you need to:

  • find some nice pandas way to iterate through all the cells
  • use date information for the x axis
  • add the extra labels like Phase x, Item Status, etc
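For the first and third points, one possible approach is to loop over `df.groupby('ItemPhase')` and draw one cell per phase at consecutive x positions. This is only a sketch: the function name `plot_status_cells` and the color dictionaries are made up for illustration, it assumes the only statuses are Complete and WIP, and it places phases by index rather than by date because the question does not say how dates map to positions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical colors, keyed by status so every phase uses the same ones.
BG = {"Complete": "tab:blue", "WIP": "white"}
FG = {"Complete": "white", "WIP": "black"}


def plot_status_cells(df, ax):
    """Draw one stacked status cell per ItemPhase, side by side."""
    # groupby sorts the phase names, so Phase1 lands at x=0, Phase2 at x=1, ...
    for x, (phase, group) in enumerate(df.groupby("ItemPhase")):
        total = len(group)
        # Dotted outline, 40% taller than the cell, to hold the item count.
        ax.bar(x, 1.4, width=1, edgecolor="black", lw=1, ls="dotted", color="white")
        counts = group["ItemStatus"].value_counts().sort_index()
        bottom = 0.0
        for status, count in counts.items():
            freq = count / total
            ax.bar(x, freq, width=1, bottom=bottom,
                   color=BG[status], edgecolor="black", lw=3)
            ax.text(x, bottom + freq / 2,
                    "{}% ({}) Items {}".format(int(round(freq * 100)), count, status),
                    ha="center", va="center", color=FG[status])
            bottom += freq
        ax.text(x, 1.2, "{} Items".format(total), ha="center", va="center")
        ax.text(x, -0.1, phase, ha="center", va="center")  # phase label below the cell
    ax.set_ylim(-0.2, 1.5)  # leave room for the phase labels
    ax.set_axis_off()
    return ax


# The ItemPhase and ItemStatus columns from the question.
df = pd.DataFrame({
    "ItemPhase": ["Phase1", "Phase2", "Phase1", "Phase3",
                  "Phase2", "Phase2", "Phase3", "Phase2"],
    "ItemStatus": ["Complete", "WIP", "Complete", "WIP",
                   "Complete", "WIP", "Complete", "Complete"],
})
fig, ax = plt.subplots(figsize=(9, 4))
plot_status_cells(df, ax)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. The Item Outcomes row could probably be drawn the same way by grouping on ItemOutcome for the completed items and shifting the bars on the y axis.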

Upvotes: 2
