Reputation: 1801
I am trying to plot a simple time-series. Here's my code:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
%matplotlib inline
df = pd.read_csv("sample.csv", parse_dates=['t'])
df[['sq', 'iq', 'rq']] = df[['sq', 'iq', 'rq']].apply(pd.to_numeric, errors='coerce')
df = df.fillna(0)
df.set_index('t')
This is part of the output:
df[['t','sq']].plot()
plt.show()
As you can see, the x-axis in the plot above does not show the dates I intended. When I change the plotting call as below, the x-axis is correct, but the plot itself is gibberish.
df[['t','sq']].plot(x = 't')
plt.show()
Any tips on what I am doing wrong? Please comment and let me know if you need more information about the problem. Thanks in advance.
Upvotes: 1
Views: 80
Reputation: 4004
I think your problem is that, although you asked read_csv to parse the t column, it did not end up with a date-time dtype. Try the following:
# Set t to date-time and then to index
df['t'] = pd.to_datetime(df['t'])
df.set_index('t', inplace=True)
Reading your comment and the answer you have added, someone may conclude that this kind of problem can only be solved by specifying a parser in pd.read_csv(). So here is a demonstration that my solution works in principle. Looking at what you posted in the question, the other problem with your code is the way you call plot. Your df.set_index('t') call discards its result, so t never becomes the index there. Once t has become the index, you only need to select the columns other than t for the plot command.
import pandas as pd
import matplotlib.pyplot as plt
# Read data from file
df = pd.read_csv('C:\\datetime.csv', parse_dates=['Date'])
# Convert Date to date-time and set as index
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
df.plot(marker='D')
plt.xlabel('Date')
plt.ylabel('Number of Visitors')
plt.show()
df
Out[37]:
        Date  Adults  Children  Seniors
0 2018-01-05     309       240      296
1 2018-01-06     261       296      308
2 2018-01-07     273       249      338
3 2018-01-08     311       250      244
4 2018-01-08     272       234      307
df
Out[39]:
            Adults  Children  Seniors
Date
2018-01-05     309       240      296
2018-01-06     261       296      308
2018-01-07     273       249      338
2018-01-08     311       250      244
2018-01-08     272       234      307
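To make the two points above concrete with the column names from the question, here is a minimal sketch. The inline CSV is a hypothetical stand-in for sample.csv (the values are made up), and the names index_unchanged, indexed and sq_series are only for illustration. It shows that set_index returns a new frame unless you assign it or pass inplace=True, and that once t is the index you select only the value column for plotting.

```python
import io
import pandas as pd

# Hypothetical stand-in for sample.csv; the values are made up for illustration.
csv_text = "t,sq\n2018-01-05,3\n2018-01-06,bad\n2018-01-07,5\n"

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["t"])
# Non-numeric entries become NaN and are then replaced by 0, as in the question.
df["sq"] = pd.to_numeric(df["sq"], errors="coerce").fillna(0)

# set_index returns a new DataFrame; without assignment (or inplace=True)
# the original frame keeps its default integer index.
df.set_index("t")
index_unchanged = isinstance(df.index, pd.RangeIndex)

# Assigning the result gives a frame indexed by the parsed dates.
indexed = df.set_index("t")
has_datetime_index = isinstance(indexed.index, pd.DatetimeIndex)

# With t as the index, select only the value column; calling .plot() on it
# would then use the DatetimeIndex for the x-axis.
sq_series = indexed["sq"]
```

Calling sq_series.plot() followed by plt.show() should then put the dates on the x-axis, assuming the t column parsed cleanly.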
Upvotes: 1
Reputation: 1801
The issue turned out to be incorrect parsing of dates, as pointed out in an answer above. However, the solution for it was to pass a date_parser to the read_csv method call:
from datetime import datetime as dt
dtm = lambda x: dt.strptime(str(x), "%Y-%m-%d")
df = pd.read_csv("sample.csv", parse_dates=['t'], infer_datetime_format=True, date_parser=dtm)
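Note that date_parser and infer_datetime_format were deprecated in pandas 2.0, so the call above may emit warnings on newer versions. As a hedged alternative, the sketch below reads the column as text and converts it with an explicit format through pd.to_datetime. The inline CSV is again a made-up stand-in for sample.csv, and it assumes the dates really are in %Y-%m-%d form.

```python
import io
import pandas as pd

# Hypothetical stand-in for sample.csv; assumes dates are formatted as YYYY-MM-DD.
csv_text = "t,sq\n2018-01-05,3\n2018-01-06,4\n"

# Read t as plain text first, then convert with an explicit format.
df = pd.read_csv(io.StringIO(csv_text))

# errors="raise" (the default) fails loudly on a value that does not match
# the format, instead of silently leaving the column as strings.
df["t"] = pd.to_datetime(df["t"], format="%Y-%m-%d", errors="raise")
df = df.set_index("t")
```

Failing loudly here is a deliberate choice: a silent fallback to strings is what made the original plot hard to diagnose.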
Upvotes: 0