Reputation: 3577
I have a dataframe like this:
NDVI Value Allotment Date
0 0 0.208430 Arnstson 19840517
1 0 0.211430 Arnstson 19840517
2 0 0.214430 Arnstson 19840517
3 2 0.217430 Arnstson 19840517
4 4 0.220430 Arnstson 19840517
5 1 0.223430 Arnstson 19840517
6 6 0.226430 Arnstson 19840517
7 1 0.229430 Arnstson 19840517
8 11 0.232430 Arnstson 19840517
9 13 0.235430 Arnstson 19840517
10 17 0.238430 Arnstson 19840517
11 9 0.241430 Arnstson 19840517
12 9 0.244430 Arnstson 19840517
13 7 0.247430 Arnstson 19840517
14 22 0.250430 Woodlot 19840517
15 17 0.253430 Woodlot 19840517
16 14 0.256430 Woodlot 19840517
17 5 0.259430 Woodlot 19840517
18 14 0.262430 Woodlot 19840517
19 19 0.265430 Woodlot 19840517
20 10 0.268430 Woodlot 19840517
21 11 0.271430 Arnstson 19840518
22 10 0.274430 Arnstson 19840518
23 9 0.277430 Arnstson 19840518
24 9 0.280430 Arnstson 19840518
25 5 0.283430 Woodlot 19840518
26 7 0.286430 Woodlot 19840518
27 1 0.289430 Woodlot 19840518
28 11 0.292430 Woodlot 19840518
29 6 0.295430 Woodlot 19840518
and I want to create plots based on the Allotment that are sent to different PDF files. All plots for a given Allotment name should go to one file, with NDVI plotted vs. Value for each respective Date. I can do this easily for an individual Allotment with this code:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Select the rows for a single allotment
grouped = df.groupby('Allotment')
Arnstson = grouped.get_group('Arnstson')

with PdfPages(r'C:\delete.pdf') as pdf:
    # One page per (Allotment, Date) combination
    for i, group in Arnstson.groupby(['Allotment', 'Date']):
        plot = group.plot(x='Value', y='NDVI', title=str(i)).get_figure()
        pdf.savefig(plot)
        plt.close(plot)
but I have 53 unique names in Allotment and would prefer not to have to select all of them individually.
Upvotes: 0
Views: 175
Reputation: 2803
One strategy is to open all of the PDF files, write to the appropriate ones, and then close them. Here I use a dictionary to track the file handles, with Allotment as the key. I write all the plots first, then close all the file handles in a separate step.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

pdf_files = {}
for group_name, group in df.groupby(['Allotment', 'Date']):
    allotment, date = group_name
    # Open each allotment's PDF the first time that allotment is seen
    if allotment not in pdf_files:
        pdf_files[allotment] = PdfPages('C:\\' + allotment + '.pdf')
    plot = group.plot(x='Value', y='NDVI', title=str(group_name)).get_figure()
    pdf_files[allotment].savefig(plot)
    plt.close(plot)

# Close every PDF once all pages have been written
for key in pdf_files:
    pdf_files[key].close()
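One caveat with the dictionary approach: if a plot raises before the closing loop runs, the open PDFs are never closed. A possible way to guard against that is the standard library's contextlib.ExitStack, which closes everything registered with it when the with block exits. The sketch below is only an illustration under assumptions: the helper name write_pdfs_with_exitstack, the out_dir parameter, and the small sample frame are made up for the example and are not part of the original code.

```python
import contextlib
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Small made-up sample with the same columns as the question's dataframe
sample = pd.DataFrame({
    "NDVI": [0, 2, 4, 22, 17, 11, 10, 5, 7],
    "Value": [0.20843, 0.21743, 0.22043, 0.25043, 0.25343,
              0.27143, 0.27443, 0.28343, 0.28643],
    "Allotment": ["Arnstson"] * 3 + ["Woodlot"] * 2
                 + ["Arnstson"] * 2 + ["Woodlot"] * 2,
    "Date": [19840517] * 5 + [19840518] * 4,
})


def write_pdfs_with_exitstack(df, out_dir):
    """Hypothetical helper: one PDF per Allotment, one page per Date.

    Returns a dict mapping each allotment name to the path of its PDF.
    """
    paths = {}
    with contextlib.ExitStack() as stack:
        pdf_files = {}
        for (allotment, date), group in df.groupby(["Allotment", "Date"]):
            if allotment not in pdf_files:
                paths[allotment] = os.path.join(out_dir, f"{allotment}.pdf")
                # enter_context registers the PdfPages object so it is
                # closed when the ExitStack exits, even after an exception
                pdf_files[allotment] = stack.enter_context(
                    PdfPages(paths[allotment])
                )
            fig = group.plot(x="Value", y="NDVI",
                             title=f"{allotment} {date}").get_figure()
            pdf_files[allotment].savefig(fig)
            plt.close(fig)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(sorted(write_pdfs_with_exitstack(sample, tmp)))
```

All files still stay open at the same time, so this only addresses cleanup, not the number of simultaneously open files.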
An alternative would be to use a nested groupby, where the outer groupby uses Allotment and the inner one (applied to each group from the outer) uses Date. This allows the files to be opened and closed one at a time, and would be better if there were potentially lots of Allotments.
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Group by a plain column name (not a one-element list) so each key is a string
for allotment, outer_group in df.groupby('Allotment'):
    # One PDF per allotment, closed automatically at the end of the with block
    with PdfPages('C:\\' + allotment + '.pdf') as pdf:
        # One page per date within this allotment
        for date, inner_group in outer_group.groupby('Date'):
            plot = inner_group.plot(x='Value', y='NDVI', title=f'{allotment} {date}').get_figure()
            pdf.savefig(plot)
            plt.close(plot)
This version is a little shorter, although it does involve nested loops. I prefer the second, as it seems a bit clearer as well.
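For anyone who wants to try the nested version without the original data, here is a self-contained sketch. It is an illustration only: the function name write_allotment_pdfs, the out_dir parameter, and the tiny sample frame are assumptions made for the example. It returns the number of pages written per allotment, which can be compared against the number of distinct dates per allotment as a quick sanity check.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Small made-up sample with the same columns as the question's dataframe
sample = pd.DataFrame({
    "NDVI": [0, 2, 4, 22, 17, 11, 10, 5, 7],
    "Value": [0.20843, 0.21743, 0.22043, 0.25043, 0.25343,
              0.27143, 0.27443, 0.28343, 0.28643],
    "Allotment": ["Arnstson"] * 3 + ["Woodlot"] * 2
                 + ["Arnstson"] * 2 + ["Woodlot"] * 2,
    "Date": [19840517] * 5 + [19840518] * 4,
})


def write_allotment_pdfs(df, out_dir):
    """Hypothetical helper: write <out_dir>/<Allotment>.pdf, one page per Date.

    Returns a dict mapping each allotment name to its page count.
    """
    pages = {}
    for allotment, outer_group in df.groupby("Allotment"):
        path = os.path.join(out_dir, f"{allotment}.pdf")
        # Only one PDF is open at any time
        with PdfPages(path) as pdf:
            count = 0
            for date, inner_group in outer_group.groupby("Date"):
                fig = inner_group.plot(x="Value", y="NDVI",
                                       title=f"{allotment} {date}").get_figure()
                pdf.savefig(fig)
                plt.close(fig)
                count += 1
        pages[allotment] = count
    return pages


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        written = write_allotment_pdfs(sample, tmp)
        # Each allotment should get one page per distinct date
        expected = sample.groupby("Allotment")["Date"].nunique().to_dict()
        print(written == expected)
```

Passing the output directory as a parameter avoids hard-coding C:\ and makes the function easy to point at a temporary folder while testing.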
Upvotes: 1