SSS
SSS

Reputation: 671

Smart way of creating multiple graphs using matplotlib

I have an excel worksheet, let us say its name is 'ws_actual'. The data looks as below.

Project Name    Date Paid   Actuals Item Amount Cumulative Sum
A   2016-04-10 00:00:00 124.2   124.2
A   2016-04-27 00:00:00 2727.5  2851.7
A   2016-05-11 00:00:00 2123.58 4975.28
A   2016-05-24 00:00:00 2500    7475.28
A   2016-07-07 00:00:00 38374.6 45849.88
A   2016-08-12 00:00:00 2988.14 48838.02
A   2016-09-02 00:00:00 23068   71906.02
A   2016-10-31 00:00:00 570.78  72476.8
A   2016-11-09 00:00:00 10885.75    83362.55
A   2016-12-08 00:00:00 28302.95    111665.5
A   2017-01-19 00:00:00 4354.3  116019.8
A   2017-02-28 00:00:00 3469.77 119489.57
A   2017-03-29 00:00:00 267.75  119757.32
B   2015-04-27 00:00:00 2969.93 2969.93
B   2015-06-02 00:00:00 118.8   3088.73
B   2015-06-18 00:00:00 2640    5728.73
B   2015-06-26 00:00:00 105.6   5834.33
B   2015-09-03 00:00:00 11879.7 17714.03
B   2015-10-22 00:00:00 5303.44 23017.47
B   2015-11-08 00:00:00 52000   75017.47
B   2015-11-25 00:00:00 2704.13 77721.6
B   2016-03-09 00:00:00 59752.85    137474.45
B   2016-03-13 00:00:00 512.73  137987.18
.
.
.

Let us say there are many many more projects including A and B with Date Paid and Amount information. I would like to create a plot by project where x axis is 'Date Paid' and y axis is 'Cumulative Sum', but when I just implement the following code, it just combines every project and plot every 'Cumulative Sum' at one graph. I wonder if I need to divide the table by project, save each, and then bring one by one to plot the graph. It is a lot of work, so I am wondering if there is a smarter way to do so. Please help me, genius.

import pandas as pd
import matplotlib.pyplot as plt

ws_actual = pd.read_excel(actual_file[0], sheet_name=0)
ax = ws_actual.plot(x='Date Paid', y='Cumulative Sum', color='g')

Upvotes: 0

Views: 61

Answers (2)

ALollz
ALollz

Reputation: 59579

Right now you are connecting all of the points, regardless of group. A simple loop will work here allowing you to group the DataFrame and then plot each group as a separate curve. If you want you can define your own colorcycle if you have a lot of groups, so that colors do not repeat.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8,8))
for id, gp in ws_actual.groupby('Project Name'):
    gp.plot(x='Date Paid', y='Cumulative Sum', ax=ax, label=id)

plt.show()

enter image description here

Upvotes: 2

it's-yer-boy-chet
it's-yer-boy-chet

Reputation: 2017

You could just iterate the projects:

for proj in ws_actual['Project'].unique():
    ws_actual[ws_actual['Project'] == proj].plot(x='Date Paid', y='Cumulative Sum', color='g')
    plt.show()

Or check out seaborn for an easy way to make a facet grid for which you can set a rows variable. Something along the lines of:

import seaborn as sns

g = sns.FacetGrid(ws_actual, row="Project")
g = g.map(plt.scatter, "Date Paid", "Cumulative Sum", edgecolor="w")

Upvotes: 2

Related Questions