Reputation: 3577
I have 2 csv files in a folder which look like this:
(file1)
Count Bins
0 -0.322392
1 -0.319392
1 -0.316392
0 -0.313392
2 -0.310392
1 -0.307392
5 -0.304392
4 -0.301392
(file 2)
Count Bins
5 -0.322392
1 -0.319392
1 -0.316392
6 -0.313392
2 -0.310392
1 -0.307392
2 -0.304392
4 -0.301392
and I want to make a line graph with the Bins on the x-axis vs. the Count on the y-axis. So there would only be one line in each graph. I am using this code so far:
import pandas as pd
import os
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

#path where csv files are stored
pth = (r'F:\Sheyenne\Statistics\IDL_stats\NDII-2')

#initiate loop
for f in os.listdir(pth):
    if not os.path.isfile(os.path.join(pth,f)):
        continue
    #read each file
    df = pd.read_csv(os.path.join(pth, f))
    #add column names
    df.columns=['Count', 'Bins']
    #create pdf file to save graphs to
    with PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Delete.pdf') as pdf:
        #plot the graph
        df2=df.plot(title=str(f))
        #set x-label
        df2.set_xlabel("Bins")
        #set y-label
        df2.set_ylabel("Count")
        #save the figure
        pdf.savefig(df2)
        #close the figure
        plt.close(df2)

print "Done Processing"
But this graphs two lines, one for Count and one for Bins. It also only graphs the first file and not the second, returning the error:
Traceback (most recent call last):
  File "<ipython-input-5-b86bf00675fa>", line 1, in <module>
    runfile('F:/python codes/IDL_histograms.py', wdir='F:/python codes')
  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)
  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "F:/python codes/IDL_histograms.py", line 26, in <module>
    pdf.savefig(df2)
  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\matplotlib\backends\backend_pdf.py", line 2438, in savefig
    raise ValueError("No such figure: " + repr(figure))
ValueError: No such figure: <matplotlib.axes._subplots.AxesSubplot object at 0x0D628FB0>
Upvotes: 1
Views: 2191
Reputation: 352
pandas DataFrame.plot() returns a matplotlib Axes object, but PdfPages.savefig() needs a Figure object. Get the current matplotlib figure with plt.gcf() and save that.
import os
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# folder of csv files, as defined in the question
pth = r'F:\Sheyenne\Statistics\IDL_stats\NDII-2'

# Open the pdf before looping to add pages
with PdfPages(r'C:\test\Delete.pdf') as pdf:
    for f in os.listdir(pth):
        if not os.path.isfile(os.path.join(pth,f)):
            continue
        # ignore the pdf file that just got created
        if f.lower().endswith('.pdf'):
            continue
        #read each file
        df = pd.read_csv(os.path.join(pth, f))
        #add column names
        df.columns=['Count', 'Bins']
        #plot the graph (df.plot returns an Axes, not a Figure)
        df2=df.plot(title=str(f))
        #set x-label
        df2.set_xlabel("Bins")
        #set y-label
        df2.set_ylabel("Count")
        #save the figure that the Axes was drawn on
        fig = plt.gcf()
        pdf.savefig(fig)
        #close the figure
        plt.close(fig)
Works for me.
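To make the Axes vs. Figure distinction concrete, here is a minimal sketch. The data frame is made-up sample data in the shape of the question's files, the Agg backend and the in-memory buffer are only there so it runs anywhere without a display or a real path, and ax.get_figure() is used as an alternative to plt.gcf() that asks the Axes directly for its Figure.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib.figure import Figure

# Made-up sample data shaped like the files in the question.
df = pd.DataFrame({
    "Count": [0, 1, 1, 0],
    "Bins": [-0.322392, -0.319392, -0.316392, -0.313392],
})

ax = df.plot(title="example")  # DataFrame.plot returns an Axes
fig = ax.get_figure()          # the Figure that owns that Axes

buffer = io.BytesIO()          # in-memory stand-in for the PDF file
with PdfPages(buffer) as pdf:
    pdf.savefig(fig)           # pass the Figure, not the Axes
    page_count = pdf.get_pagecount()

plt.close(fig)
```

Passing ax to pdf.savefig() instead of fig is what produces the "No such figure" error shown in the question.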
Upvotes: 7
Reputation: 114230
Instead of df2=df.plot(title=str(f)), which plots every column of your dataframe as its own line against the index, try df2=df.plot(x='Bins', y='Count', title=str(f)), which draws a single line of Count against Bins.
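Putting the suggestions in this thread together, a rough sketch of the whole loop might look like the following. The function name plot_csvs_to_pdf and its parameters are hypothetical, and it assumes each file is a comma-separated CSV with a header row naming the Count and Bins columns; adjust the pd.read_csv call if your files differ.

```python
import os

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages


def plot_csvs_to_pdf(csv_dir, pdf_path):
    """Write one Bins-vs-Count line plot per CSV in csv_dir; return the page count."""
    pages = 0
    # Open the PDF once, before the loop, so every plot becomes a new page.
    with PdfPages(pdf_path) as pdf:
        for name in sorted(os.listdir(csv_dir)):
            full_path = os.path.join(csv_dir, name)
            # Only read regular .csv files; this also skips the output PDF.
            if not os.path.isfile(full_path) or not name.lower().endswith(".csv"):
                continue
            df = pd.read_csv(full_path)  # assumes a "Count,Bins" header row
            # Naming x and y gives a single line instead of one per column.
            ax = df.plot(x="Bins", y="Count", title=name, legend=False)
            ax.set_xlabel("Bins")
            ax.set_ylabel("Count")
            fig = ax.get_figure()  # savefig needs the Figure, not the Axes
            pdf.savefig(fig)
            plt.close(fig)
            pages += 1
    return pages
```

It could then be called as plot_csvs_to_pdf(pth, r'C:\test\Delete.pdf'), with pth as defined in the question.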
Upvotes: 0