Reputation: 3577
I have a df like so:
Year Grass Crop Forest Ecoregion CRP
1993 30.41268857 68.45446692 0.255632102 46e 0508common
2001 47.29988968 47.68577796 0.509939614 46e 0508common
2006 71.37357063 20.40485399 0.908684114 46e 0508common
1993 27.17246635 71.97582809 0.12611897 46k 0508common
2001 65.74087991 30.61323084 0.1229253 46k 0508common
2006 81.763099 12.4386173 0.180860941 46k 0508common
1993 30.83567893 68.14034747 0.737649228 46e 05f08
2001 59.45355722 35.68378142 0.354265748 46e 05f08
2006 64.98592643 28.61787829 0.339706881 46e 05f08
1993 28.38187702 71.40776699 0.080906149 46k 05f08
2001 81.90938511 15.4368932 0.118662352 46k 05f08
2006 86.3214671 9.207119741 0.172599784 46k 05f08
1993 18.46387279 80.77686402 0.270081631 46e 05f97
2001 41.23923454 53.1703113 0.605111585 46e 05f97
2006 65.30004066 25.45626696 0.989918731 46e 05f97
1993 20.34764075 78.68863002 0.218653535 46k 05f97
2001 55.42761042 39.96085063 0.191151874 46k 05f97
2006 76.34526161 16.53176535 0.246221691 46k 05f97
and I want to create graphs based on a groupby on Ecoregion. Then, within each Ecoregion, I want to graph based on unique CRP values. So each unique Ecoregion will get its own PDF file, and within that file there will be one graph per CRP. In this case Ecoregion 46e will have three graphs (0508common, 05f08 and 05f97) and Ecoregion 46k will also have three graphs.
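To make the intended layout concrete, here is a minimal sketch of the nested grouping on its own, with no plotting. The helper name `plan_pages` is hypothetical, and the frame only carries the 1993 rows of the three columns that matter for the grouping.

```python
import pandas as pd

# Stand-in for the data above: only the grouping columns, 1993 rows only
rows = [
    (1993, "46e", "0508common"),
    (1993, "46k", "0508common"),
    (1993, "46e", "05f08"),
    (1993, "46k", "05f08"),
    (1993, "46e", "05f97"),
    (1993, "46k", "05f97"),
]
df = pd.DataFrame(rows, columns=["Year", "Ecoregion", "CRP"])


def plan_pages(frame):
    """Map each Ecoregion (one PDF file) to the CRP values that become its graphs."""
    plan = {}
    # Outer loop: one entry per Ecoregion; grouping by a plain string gives string keys
    for ecoregion, outer in frame.groupby("Ecoregion"):
        # Inner loop: one graph per unique CRP inside this Ecoregion
        plan[ecoregion] = [crp for crp, _ in outer.groupby("CRP")]
    return plan


pages = plan_pages(df)
print(pages)
```

Each key would become one PDF file name and each list entry one graph inside that file.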
I am trying the following code:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import os
df=pd.read_csv(r'C:\pathway_to_file.csv')
group=df.groupby(['Ecoregion'])
pdf_files = {}
out=r'C:\output_location'
for ecoregion, outer_group in df.groupby(['Ecoregion']):
    with PdfPages(os.path.join(out,ecoregion + '.pdf')) as pdf:
        for crp, inner_group in outer_group.groupby(['CRP']):
            title=crp + '_' + ecoregion
            lu_colors=(['g','y','b','r', 'k'])
            plot=group.plot(x=['Year'], y=['Grass', 'Crop', 'Forest'],kind='bar', colors=lu_colors, title=title).get_figure()
            plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
            plt.xticks(rotation=70)
            plt.set_xlabel('Year')
            plt.set_ylabel('Percent')
            pdf.savefig(plot)
            plt.close(plot)
But this doesn't work properly; the graphs aren't even bar graphs like I want them to be.
I can get one individual graph to look the way I want with the code below, but it doesn't use the groupby:
with PdfPages(r'G:\graphs.pdf') as pdf:
    lu_colors=(['g','y','b','r', 'k'])
    ax=df.set_index('Year').plot(title='0508common_46e', kind='bar', colors=lu_colors)
    plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
    plt.xticks(rotation=70)
    ax.set_xlabel('Year')
    ax.set_ylabel('Percent')
    fig=plt.gcf()
    pdf.savefig(fig)
    plt.close(fig)
In this case the df would be:
Year Grass Crop Forest Ecoregion CRP
1993 30.41268857 68.45446692 0.255632102 46e 0508common
2001 47.29988968 47.68577796 0.509939614 46e 0508common
2006 71.37357063 20.40485399 0.908684114 46e 0508common
Upvotes: 1
Views: 102
Reputation: 36715
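The mistake is in the plot call: you call plot on group (the groupby of the whole frame) instead of on the inner group (igr in the code below). Also, pyplot has no set_xlabel or set_ylabel, so the axis labels have to be set on the Axes object from plt.gca(). I tidied up your code slightly: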
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import os

lu_colors = ['g', 'y', 'b', 'r', 'k']
df = pd.read_csv('1.csv', header=0, usecols=[0, 1, 2, 3, 4, 5])

# One PDF per Ecoregion; grouping by a plain string keeps each key a string
for ecor, ogr in df.groupby('Ecoregion'):
    # The ./pdf directory must already exist
    with PdfPages(os.path.join("./pdf", ecor.strip() + '.pdf')) as pdf:
        # One page per CRP within this Ecoregion
        for crp, igr in ogr.groupby('CRP'):
            title = crp.strip() + '_' + ecor.strip()
            # Plot the inner group; newer pandas versions expect color= rather than colors=
            plot = igr.plot(x='Year', y=['Grass', 'Crop', 'Forest'], kind='bar',
                            color=lu_colors, title=title).get_figure()
            plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
            plt.xticks(rotation=70)
            ax = plt.gca()
            ax.set_xlabel('Year')
            ax.set_ylabel('Percent')
            pdf.savefig(plot, bbox_inches='tight', pad_inches=0)
            plt.close(plot)
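If you want to try this without a CSV file on disk, the following is a self-contained sketch of the same idea. It rests on a few assumptions that are not part of your setup: the function name `write_ecoregion_pdfs` is made up, the data is a rounded subset of your table (two CRP values per Ecoregion), the non-interactive Agg backend is selected, and the PDFs go to a temporary directory. It reports how many pages each PDF received.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Rounded subset of the question's table (two CRP values per Ecoregion)
rows = [
    (1993, 30.41, 68.45, 0.26, "46e", "0508common"),
    (2001, 47.30, 47.69, 0.51, "46e", "0508common"),
    (2006, 71.37, 20.40, 0.91, "46e", "0508common"),
    (1993, 27.17, 71.98, 0.13, "46k", "0508common"),
    (2001, 65.74, 30.61, 0.12, "46k", "0508common"),
    (2006, 81.76, 12.44, 0.18, "46k", "0508common"),
    (1993, 30.84, 68.14, 0.74, "46e", "05f08"),
    (2001, 59.45, 35.68, 0.35, "46e", "05f08"),
    (2006, 64.99, 28.62, 0.34, "46e", "05f08"),
    (1993, 28.38, 71.41, 0.08, "46k", "05f08"),
    (2001, 81.91, 15.44, 0.12, "46k", "05f08"),
    (2006, 86.32, 9.21, 0.17, "46k", "05f08"),
]
df = pd.DataFrame(
    rows, columns=["Year", "Grass", "Crop", "Forest", "Ecoregion", "CRP"]
)


def write_ecoregion_pdfs(frame, out_dir):
    """Write one PDF per Ecoregion, one bar chart per CRP; return {path: page count}."""
    os.makedirs(out_dir, exist_ok=True)
    written = {}
    for ecoregion, outer in frame.groupby("Ecoregion"):
        path = os.path.join(out_dir, f"{ecoregion}.pdf")
        with PdfPages(path) as pdf:
            for crp, inner in outer.groupby("CRP"):
                # Plot only the rows of this Ecoregion/CRP combination
                ax = inner.plot(
                    x="Year", y=["Grass", "Crop", "Forest"], kind="bar",
                    color=["g", "y", "b"], rot=70, title=f"{crp}_{ecoregion}",
                )
                ax.set_xlabel("Year")
                ax.set_ylabel("Percent")
                ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
                fig = ax.get_figure()
                pdf.savefig(fig, bbox_inches="tight")
                plt.close(fig)
            # Count pages while the PDF is still open
            written[path] = pdf.get_pagecount()
    return written


with tempfile.TemporaryDirectory() as tmp:
    result = write_ecoregion_pdfs(df, tmp)
    summary = {os.path.basename(p): n for p, n in result.items()}
print(summary)
```

Working on the Axes returned by plot (instead of plt.gca() and plt.legend) keeps every call tied to the figure that is about to be saved, which is a little safer inside nested loops.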
Upvotes: 2