Stefano Potter

Reputation: 3577

Plotting graphs within a loop

I have 2 csv files in a folder which look like this:

(file1)

Count      Bins
0       -0.322392
1       -0.319392
1       -0.316392
0       -0.313392
2       -0.310392
1       -0.307392
5       -0.304392
4       -0.301392

(file 2)

Count      Bins
5       -0.322392
1       -0.319392
1       -0.316392
6       -0.313392
2       -0.310392
1       -0.307392
2       -0.304392
4       -0.301392

I want to make a line graph for each file, with Bins on the x-axis and Count on the y-axis, so there would be only one line in each graph. This is my code so far:

import pandas as pd
import os
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

#path where csv files are stored
pth = (r'F:\Sheyenne\Statistics\IDL_stats\NDII-2')

#initiate loop
for f in os.listdir(pth):
    if not os.path.isfile(os.path.join(pth,f)):
        continue
    #read each file
    df = pd.read_csv(os.path.join(pth, f))
    #add column names
    df.columns=['Count', 'Bins']
    #create pdf file to save graphs to
    with PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Delete.pdf') as pdf:
         #plot the graph
         df2=df.plot(title=str(f))
         #set x-label
         df2.set_xlabel("Bins")
         #set y-label
         df2.set_ylabel("Count")
         #save the figure
         pdf.savefig(df2)
         #close the figure
         plt.close(df2)
print "Done Processing"  

But this draws two lines, one for Count and one for Bins. It also only graphs the first file, not the second, and returns this error:

Traceback (most recent call last):

  File "<ipython-input-5-b86bf00675fa>", line 1, in <module>
    runfile('F:/python codes/IDL_histograms.py', wdir='F:/python codes')

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "F:/python codes/IDL_histograms.py", line 26, in <module>
    pdf.savefig(df2)

  File "C:\Users\spotter\AppData\Local\Continuum\Anaconda\lib\site-packages\matplotlib\backends\backend_pdf.py", line 2438, in savefig
    raise ValueError("No such figure: " + repr(figure))

ValueError: No such figure: <matplotlib.axes._subplots.AxesSubplot object at 0x0D628FB0>

Upvotes: 1

Views: 2191

Answers (2)

Dval

Reputation: 352

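pandas DataFrame.plot() returns a matplotlib Axes object, but savefig needs a Figure object. Get the current matplotlib figure with plt.gcf() and save that. Also open the PdfPages file once, before the loop; opening it inside the loop overwrites the PDF on every iteration.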

# Open the pdf before looping to add pages
with PdfPages(r'C:\test\Delete.pdf') as pdf:
    for f in os.listdir(pth):
        if not os.path.isfile(os.path.join(pth,f)):
            continue
        # skip PDF files, such as the output file if it is written to the same folder
        if f.lower().endswith('.pdf'):
            continue
        #read each file
        df = pd.read_csv(os.path.join(pth, f))
        #add column names
        df.columns=['Count', 'Bins']
        #plot the graph
        df2=df.plot(title=str(f))
        #set x-label
        df2.set_xlabel("Bins")
        #set y-label
        df2.set_ylabel("Count")
        #save the figure
        fig = plt.gcf()
        pdf.savefig(fig)
        #close the figure
        plt.close(fig)

Works for me.
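As a rough, self-contained illustration of the Axes-versus-Figure point, the sketch below uses made-up in-memory data instead of the CSV files and writes to a temporary folder. The names frames, out_path and page_count are hypothetical, and ax.get_figure() is used as an alternative to plt.gcf(); both should give the same figure here.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.axes import Axes
from matplotlib.backends.backend_pdf import PdfPages
from matplotlib.figure import Figure

# Made-up sample data standing in for the two CSV files (assumption)
frames = {
    "file1": pd.DataFrame({"Count": [0, 1, 1, 0],
                           "Bins": [-0.322392, -0.319392, -0.316392, -0.313392]}),
    "file2": pd.DataFrame({"Count": [5, 1, 1, 6],
                           "Bins": [-0.322392, -0.319392, -0.316392, -0.313392]}),
}

# Hypothetical output location in a temporary folder
out_path = os.path.join(tempfile.mkdtemp(), "sketch.pdf")

returned_axes = []
owner_is_figure = []

# Open the PDF once, then add one page per DataFrame
with PdfPages(out_path) as pdf:
    for name, df in frames.items():
        ax = df.plot(title=name)   # DataFrame.plot() hands back an Axes
        fig = ax.get_figure()      # the Figure that owns that Axes
        returned_axes.append(isinstance(ax, Axes))
        owner_is_figure.append(isinstance(fig, Figure))
        pdf.savefig(fig)           # savefig accepts the Figure
        plt.close(fig)             # release the figure before the next file
    page_count = pdf.get_pagecount()
```

Passing ax to pdf.savefig() instead of fig is what triggers the "No such figure" error in the question.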

Upvotes: 7

Mad Physicist

Reputation: 114230

Instead of df2=df.plot(title=str(f)), which plots every column of your dataframe as its own line against the index, try df2=df.plot(x='Bins', y='Count', title=str(f)), which plots only Count against Bins.
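A small sketch of the difference, using made-up data in place of the CSV files (the DataFrame contents and the variable names are assumptions for illustration only). It counts the lines drawn on the axes in each case:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data with the same two columns as the question
df = pd.DataFrame({
    "Count": [0, 1, 1, 0, 2],
    "Bins": [-0.322392, -0.319392, -0.316392, -0.313392, -0.310392],
})

# Default call: each column becomes its own line, plotted against the index
ax_all = df.plot(title="all columns")
lines_all = len(ax_all.get_lines())

# Explicit columns: a single line, Count on the y-axis against Bins on the x-axis
ax_one = df.plot(x="Bins", y="Count", title="Count vs. Bins")
ax_one.set_ylabel("Count")
lines_one = len(ax_one.get_lines())
x_values = list(ax_one.get_lines()[0].get_xdata())

plt.close("all")
```

Here lines_all should be 2 and lines_one should be 1, which matches the two-line graph described in the question and the single line that was wanted.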

Upvotes: 0
