Reputation: 243
import os
from matplotlib.backends.backend_pdf import PdfPages
import pandas as pd
import matplotlib.pyplot as plt
import datetime as dt
pp = PdfPages('multipage.pdf')
pth = "D:/Technical_Data/"
for fle in os.listdir(pth):
    df = pd.read_csv(os.path.join(pth, fle), usecols=(0, 4))
    if not df.empty:
        df = df.astype(float)
        days = df['indx']
        value = df['Close']
        plt.plot_date(x=days, y=value, fmt="r-")
        plt.title(fle)
        plt.ylabel("Price")
        plt.grid(True)
        pp.savefig()
pp.close()
I am iterating through the files in a directory and saving all the graphs to a single PDF file. The dates are in the format YYYYMMDD, for example 20150101.
But it throws the error:
ValueError: year is out of range
Sample data
indx      open     High     Low      Close    Volume
20140103  31.9823  32.1511  31.8382  32.1213  2034100
20140103  5.28     5.29     5.26     5.27     10387300
20140103  33.9     34.03    33.77    34       930800
20140103  10.62    10.63    10.51    10.6     2004500
20140103  3.42     3.49     3.42     3.49     3837600
20140103  1.69     1.71     1.685    1.705    6870300
20140103  42.5     43.61    42.3     43.47    255500
Upvotes: 2
Views: 4994
Reputation: 3847
You need to convert df['indx'] to datetime values with pd.to_datetime before plotting:
# df = df.astype(float)  # do not convert the yyyymmdd values to float
days = pd.to_datetime(df['indx'].astype(str), format='%Y%m%d')
plt.plot_date(x=days, y=value, fmt="r-")
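To check the conversion on its own, here is a minimal sketch that parses two rows of the sample data from memory. The in-memory CSV and the comma separator are assumptions made for illustration; the real files may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical stand-in for one CSV file, using two rows of the sample data.
sample = io.StringIO(
    "indx,open,High,Low,Close,Volume\n"
    "20140103,31.9823,32.1511,31.8382,32.1213,2034100\n"
    "20140103,5.28,5.29,5.26,5.27,10387300\n"
)

# Read only the date column (0) and the Close column (4), as in the question.
df = pd.read_csv(sample, usecols=[0, 4])

# Parse the integer yyyymmdd values as dates instead of casting them to float.
days = pd.to_datetime(df["indx"].astype(str), format="%Y%m%d")

# Only the price column needs to be numeric.
value = df["Close"].astype(float)

print(days.dtype)
print(days.iloc[0], value.iloc[0])
```

With days holding real datetimes, the plotting call from the question should no longer see the raw yyyymmdd numbers.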
Upvotes: 0
Reputation: 3405
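The problem is the format of the values in days. plot_date expects datetime objects or floats representing days since 0001-01-01 UTC, so the float 20140103.0 is read as a count of about 20 million days, which lies far beyond year 9999 and raises the ValueError. Convert the values to datetime objects, or to floats in that representation, before plotting.
From the matplotlib.pyplot documentation: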
plot_date(x, y, fmt='bo', tz=None, xdate=True, ydate=False, **kwargs)
x and/or y can be a sequence of dates represented as float days since 0001-01-01 UTC.
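The effect can be reproduced without Matplotlib. The sketch below uses the standard library's date ordinals, which also count days from 0001-01-01, as a stand-in for the float representation quoted above. That correspondence is an assumption for illustration, and the epoch Matplotlib uses may differ between versions.

```python
from datetime import date, datetime

raw = 20140103  # a value from the 'indx' column

# Read as a day count from 0001-01-01, the raw number lands far beyond year 9999.
try:
    date.fromordinal(raw)
    out_of_range = False
except ValueError:
    out_of_range = True

# Parsed as YYYYMMDD, it is an ordinary date with a much smaller day count.
parsed = datetime.strptime(str(raw), "%Y%m%d").date()
day_count = parsed.toordinal()  # proleptic Gregorian ordinal; 0001-01-01 is day 1

print(out_of_range, parsed, day_count)
```

The parsed date maps to a day count in the hundreds of thousands, whereas the raw value is in the tens of millions, which is why the unconverted column cannot be read as a date.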
Upvotes: 1