Reputation: 61
I am trying to plot a time series graph with Matplotlib. I have a two-column CSV file with dates and closing prices of stocks (dates are given in the format '31/07/2020').
First, I parse my dates column to make it a datetime list. Then I plot my data, through the following code:
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('data.csv', parse_dates=['DATE'])
date = data['DATE']
stock1 = data['GEN_ELECTRIC']
stock2 = data['NETFLIX']
plt.figure()
plt.plot(date, stock1)
plt.savefig('Wrong.png')
I get the graph called 'Wrong.png' attached. If I leave out the parse_dates, I get a correct graph, called 'Correct.png'.
My question is this: 'Wrong.png' has x-ticks with only years, and Python knows it's dealing with dates. With the correct version, this is not the case. Furthermore, the 'correct' version also shows the month and day, which is not necessary. Does anyone know why I am getting a wrong graph? How would I go about fixing this?
PS. Don't mind the overlapping x-axis in the correct.png, I can fix that with plt.locator.
Any help would be appreciated. Thanks in advance!
Upvotes: 1
Views: 512
Reputation: 3066
The default format for parse_dates is not DD/MM/YYYY, as in the European style. If your first row is, say, 1/7/2020, it may be interpreted as January 7. Later on, when it reaches 30/07/2020, which cannot be read month-first, parsing becomes inconsistent: depending on the pandas version, it may either fall back to day-first for just those rows or return the column as an object data type. To use parse_dates correctly, add the dayfirst=True argument to read_csv. That is the right way to declare the DD/MM format. Matplotlib should then handle the rest fine.
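A minimal sketch of that fix, assuming a CSV shaped like the one in the question. The inline sample and its prices are made up for illustration and stand in for data.csv; only the column names come from the question.

```python
import io

import pandas as pd

# Made-up sample standing in for data.csv; dates are DD/MM/YYYY.
csv_text = """DATE,GEN_ELECTRIC,NETFLIX
01/07/2020,1.0,10.0
02/07/2020,2.0,20.0
31/07/2020,3.0,30.0
"""

# dayfirst=True tells pandas to read 01/07/2020 as 1 July, not 7 January.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=['DATE'], dayfirst=True)

# The column should now be a datetime dtype with every row in July.
print(data['DATE'].dtype)
print(data['DATE'].dt.month.tolist())
```

The resulting data['DATE'] column can then be passed to plt.plot exactly as in the question.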
Upvotes: 1